LeetCode 1143 - Longest Common Subsequence
The problem asks us to find the length of the longest subsequence that appears in both input strings. A subsequence is formed by deleting some characters from a string while keeping the remaining characters in the same relative order. The characters do not need to be contiguous.
Difficulty: 🟡 Medium
Topics: String, Dynamic Programming
Solution
Problem Understanding
The problem asks us to find the length of the longest subsequence that appears in both input strings.
A subsequence is formed by deleting some characters from a string while keeping the remaining characters in the same relative order. The characters do not need to be contiguous.
For example, in the string "abcde":
"ace"is a valid subsequence because we can remove'b'and'd'"aec"is not a valid subsequence because the relative order changes
We are given two strings, text1 and text2, and we need to determine the maximum possible length of a subsequence that exists in both strings.
The key detail is that we only need the length of the longest common subsequence, not the subsequence itself.
For example:
text1 = "abcde"text2 = "ace"
The longest common subsequence is "ace", which has length 3.
The constraints are important:
- Each string length can be up to
1000 - Only lowercase English letters are used
A naive brute-force solution would be far too slow because the number of subsequences grows exponentially. With strings of length 1000, we need a polynomial-time algorithm.
Several edge cases are important:
- The two strings may have no common characters at all
- One string may be entirely contained inside the other
- Multiple different subsequences may have the same maximum length
- Repeated characters can create many overlapping subproblems
- The optimal subsequence may skip many characters in both strings
These observations strongly suggest that dynamic programming is the correct approach because the problem contains overlapping subproblems and optimal substructure.
Approaches
Brute Force Approach
The brute-force approach would generate every possible subsequence of one string and check whether it is also a subsequence of the other string.
A string of length n has 2^n subsequences because each character can either be included or excluded.
For every generated subsequence, we could check whether it appears in the second string while preserving order.
This approach is correct because it exhaustively examines every possible subsequence and therefore cannot miss the optimal answer.
However, it is computationally infeasible for the given constraints. Even a string of length 50 already produces more than one quadrillion subsequences. Since the input size can reach 1000, brute force is impossible.
Optimal Dynamic Programming Approach
The key insight is that the problem has overlapping subproblems.
Suppose we are comparing:
text1[i:]text2[j:]
The answer for these suffixes depends only on smaller suffixes.
There are two possibilities:
- The current characters match
- The current characters do not match
If the characters match, we can include that character in the subsequence and continue solving the remaining suffixes.
If they do not match, we must skip one character from either string and take the better result.
This naturally leads to a dynamic programming recurrence.
Define:
dp[i][j] = length of the longest common subsequence between:
text1[0:i]text2[0:j]
Transition rules:
- If characters match:
dp[i][j] = dp[i-1][j-1] + 1
- Otherwise:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
This reduces the problem from exponential time to polynomial time.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n * m) | O(n) | Generates all subsequences and checks validity |
| Optimal Dynamic Programming | O(n * m) | O(n * m) | Uses overlapping subproblem results efficiently |
Algorithm Walkthrough
- Let
nbe the length oftext1andmbe the length oftext2. - Create a 2D dynamic programming table called
dpwith dimensions(n + 1) x (m + 1).
The extra row and column represent empty prefixes. This simplifies boundary handling because an empty string has LCS length 0 with any string.
3. Initialize every cell to 0.
Initially, we assume no common subsequence exists. 4. Iterate through the strings using nested loops.
The outer loop processes characters from text1.
The inner loop processes characters from text2.
5. For each pair of indices (i, j):
Compare:
text1[i - 1]text2[j - 1]
We use i - 1 and j - 1 because the DP table has an extra leading row and column.
6. If the characters match:
Add 1 to the result from the previous diagonal cell:
dp[i][j] = dp[i - 1][j - 1] + 1
The diagonal represents the best LCS before including the matching character. 7. If the characters do not match:
Take the maximum of:
- skipping the current character in
text1 - skipping the current character in
text2
Formula:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
8. Continue until the table is fully populated.
9. The final answer is stored in:
dp[n][m]
This represents the LCS length for the entire strings.
Why it works
The dynamic programming solution works because every optimal subsequence problem can be reduced to smaller optimal subsequence problems.
If two characters match, any optimal subsequence ending at those characters must include them, so we extend the previous optimal solution.
If they do not match, at least one of the characters cannot belong to the same subsequence, so we try both possibilities and keep the better result.
Since every state depends only on previously solved smaller states, filling the table bottom-up guarantees correctness.
Python Solution
class Solution:
def longestCommonSubsequence(self, text1: str, text2: str) -> int:
n = len(text1)
m = len(text2)
dp = [[0] * (m + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
for j in range(1, m + 1):
if text1[i - 1] == text2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
return dp[n][m]
The implementation begins by determining the lengths of both strings.
The dp table is initialized with dimensions (n + 1) x (m + 1). Every value starts at 0, which correctly represents the LCS involving an empty prefix.
The nested loops iterate through every pair of positions in the two strings.
When characters match, we extend the previous subsequence length from the diagonal cell.
When they do not match, we inherit the best result from either removing a character from text1 or removing a character from text2.
Finally, the bottom-right cell contains the length of the longest common subsequence for the complete strings.
Go Solution
func longestCommonSubsequence(text1 string, text2 string) int {
n := len(text1)
m := len(text2)
dp := make([][]int, n+1)
for i := range dp {
dp[i] = make([]int, m+1)
}
for i := 1; i <= n; i++ {
for j := 1; j <= m; j++ {
if text1[i-1] == text2[j-1] {
dp[i][j] = dp[i-1][j-1] + 1
} else {
if dp[i-1][j] > dp[i][j-1] {
dp[i][j] = dp[i-1][j]
} else {
dp[i][j] = dp[i][j-1]
}
}
}
}
return dp[n][m]
}
The Go implementation follows the same logic as the Python version.
Instead of list comprehensions, Go uses nested slices to create the DP table.
Go strings are byte indexed, which is safe here because the problem guarantees lowercase English letters only.
No special handling for nil values is required because the DP table is fully initialized before use.
Worked Examples
Example 1
Input:
text1 = "abcde"
text2 = "ace"
We build the DP table step by step.
| i | j | text1[i-1] | text2[j-1] | Action | dp[i][j] |
|---|---|---|---|---|---|
| 1 | 1 | a | a | match | 1 |
| 1 | 2 | a | c | max(top,left) | 1 |
| 1 | 3 | a | e | max(top,left) | 1 |
| 2 | 1 | b | a | max(top,left) | 1 |
| 2 | 2 | b | c | max(top,left) | 1 |
| 2 | 3 | b | e | max(top,left) | 1 |
| 3 | 1 | c | a | max(top,left) | 1 |
| 3 | 2 | c | c | match | 2 |
| 3 | 3 | c | e | max(top,left) | 2 |
| 4 | 1 | d | a | max(top,left) | 1 |
| 4 | 2 | d | c | max(top,left) | 2 |
| 4 | 3 | d | e | max(top,left) | 2 |
| 5 | 1 | e | a | max(top,left) | 1 |
| 5 | 2 | e | c | max(top,left) | 2 |
| 5 | 3 | e | e | match | 3 |
Final answer:
3
Example 2
Input:
text1 = "abc"
text2 = "abc"
Every character matches in order.
DP progression:
| i | j | Match | dp[i][j] |
|---|---|---|---|
| 1 | 1 | a == a | 1 |
| 2 | 2 | b == b | 2 |
| 3 | 3 | c == c | 3 |
Final answer:
3
Example 3
Input:
text1 = "abc"
text2 = "def"
No characters match.
Every DP cell remains 0.
Final answer:
0
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | Every pair of indices is processed once |
| Space | O(n * m) | The DP table stores results for all subproblems |
The algorithm processes each DP state exactly once. Since there are n * m states, the time complexity is quadratic in the lengths of the strings.
The space complexity is also quadratic because we store the entire DP table. This can be optimized to O(min(n, m)) using rolling arrays, but the standard 2D solution is clearer and easier to understand.
Test Cases
solution = Solution()
assert solution.longestCommonSubsequence("abcde", "ace") == 3
# standard example with skipped characters
assert solution.longestCommonSubsequence("abc", "abc") == 3
# identical strings
assert solution.longestCommonSubsequence("abc", "def") == 0
# no common subsequence
assert solution.longestCommonSubsequence("a", "a") == 1
# smallest matching strings
assert solution.longestCommonSubsequence("a", "b") == 0
# smallest non-matching strings
assert solution.longestCommonSubsequence("aaaa", "aa") == 2
# repeated characters
assert solution.longestCommonSubsequence("ezupkr", "ubmrapg") == 2
# non-contiguous subsequence
assert solution.longestCommonSubsequence("bl", "yby") == 1
# partial overlap
assert solution.longestCommonSubsequence("abcdef", "fbdamn") == 2
# subsequence appears out of contiguous order
assert solution.longestCommonSubsequence("oxcpqrsvwf", "shmtulqrypy") == 2
# complex overlapping choices
| Test | Why |
|---|---|
"abcde", "ace" |
Verifies normal subsequence matching |
"abc", "abc" |
Verifies complete match |
"abc", "def" |
Verifies no overlap |
"a", "a" |
Smallest matching input |
"a", "b" |
Smallest non-matching input |
"aaaa", "aa" |
Verifies repeated character handling |
"ezupkr", "ubmrapg" |
Verifies non-contiguous matching |
"bl", "yby" |
Verifies partial overlap |
"abcdef", "fbdamn" |
Verifies order preservation |
"oxcpqrsvwf", "shmtulqrypy" |
Verifies complex DP transitions |
Edge Cases
Completely Different Strings
When the two strings share no characters, the answer should be 0.
Example:
text1 = "abc"
text2 = "def"
A buggy implementation might accidentally count partial matches or fail to initialize the DP table correctly.
This implementation handles the case naturally because no character matches occur, so every DP cell remains 0.
Repeated Characters
Repeated characters can create many overlapping possibilities.
Example:
text1 = "aaaa"
text2 = "aa"
A greedy approach might incorrectly reuse characters or count too many matches.
The DP solution avoids this problem because each state represents a precise prefix comparison, ensuring characters are matched in valid order without duplication.
One String Contained Inside Another
Sometimes the entire smaller string is already a subsequence of the larger string.
Example:
text1 = "abcde"
text2 = "ace"
The algorithm must correctly skip irrelevant characters while preserving order.
The DP recurrence naturally supports skipping characters through the transition:
max(dp[i - 1][j], dp[i][j - 1])
This guarantees that unnecessary characters are ignored while preserving the best possible subsequence length.