LeetCode 583 - Delete Operation for Two Strings
The problem asks us to determine the minimum number of deletion steps required to make two strings identical. We are given two input strings, word1 and word2, and in each step, we can delete exactly one character from either string.
Difficulty: 🟡 Medium
Topics: String, Dynamic Programming
Solution
Problem Understanding
The problem asks us to determine the minimum number of deletion steps required to make two strings identical. We are given two input strings, word1 and word2, and in each step, we can delete exactly one character from either string. The goal is to transform both strings so that their final forms are identical using as few deletions as possible.
For example, given word1 = "sea" and word2 = "eat", one optimal sequence is to delete 's' from word1 to get "ea" and delete 't' from word2 to also get "ea". This requires a total of 2 deletions.
The constraints tell us that both strings have a length between 1 and 500 and only contain lowercase English letters. This implies that a solution that is quadratic in time is acceptable, but exponential solutions will be too slow.
Important edge cases include strings with no common characters (all letters must be deleted from both strings), strings that are already identical (no deletion is needed), and strings of length 1 (trivial but must be handled).
Approaches
The brute-force approach would consider all possible sequences of deletions from both strings and choose the one with the minimum total deletions. Conceptually, this could be modeled as recursively trying every option: delete the first character from word1, delete the first character from word2, or keep both if they match. While this approach is correct, its complexity is exponential, $O(2^{m+n})$, and infeasible for strings up to length 500.
The optimal approach relies on a key observation: the minimum number of deletions is determined by the length of the longest common subsequence (LCS) of the two strings. Once we know the LCS length, say lcs_len, the minimum deletions required are (len(word1) - lcs_len) + (len(word2) - lcs_len). The intuition is that characters in the LCS are preserved in both strings, so only characters outside the LCS need deletion. We can efficiently compute the LCS using dynamic programming.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^(m+n)) | O(m+n) recursion stack | Explore all deletion sequences recursively |
| Optimal (DP using LCS) | O(m*n) | O(m*n) | Compute LCS with 2D DP array, then calculate deletions |
Algorithm Walkthrough
- Initialize a 2D array
dpof size(m+1) x (n+1), wheremandnare the lengths ofword1andword2.dp[i][j]represents the length of the LCS of the firsticharacters ofword1and the firstjcharacters ofword2. - Set all values in the first row and first column to 0, since the LCS of any string with an empty string is 0.
- Iterate through the characters of
word1andword2. For eachiandj:
- If
word1[i-1] == word2[j-1], thendp[i][j] = dp[i-1][j-1] + 1because the matching character extends the LCS. - Otherwise,
dp[i][j] = max(dp[i-1][j], dp[i][j-1])because we must skip one character from either string to maximize the LCS length.
- After filling the DP table,
dp[m][n]gives the length of the LCS. - Compute the minimum deletions as
(m - dp[m][n]) + (n - dp[m][n]).
Why it works: The algorithm correctly computes the LCS length by building on smaller subproblems in a bottom-up fashion. Since characters not in the LCS must be deleted to make the strings equal, subtracting the LCS length from each string’s length gives the exact number of deletions required.
Python Solution
class Solution:
def minDistance(self, word1: str, word2: str) -> int:
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
lcs_len = dp[m][n]
return (m - lcs_len) + (n - lcs_len)
The Python code creates a DP table dp to store the LCS lengths for all subproblems. Each entry is filled based on whether the current characters match or not. The final answer is computed by subtracting the LCS length from each string length and summing the results.
Go Solution
func minDistance(word1 string, word2 string) int {
m, n := len(word1), len(word2)
dp := make([][]int, m+1)
for i := 0; i <= m; i++ {
dp[i] = make([]int, n+1)
}
for i := 1; i <= m; i++ {
for j := 1; j <= n; j++ {
if word1[i-1] == word2[j-1] {
dp[i][j] = dp[i-1][j-1] + 1
} else {
if dp[i-1][j] > dp[i][j-1] {
dp[i][j] = dp[i-1][j]
} else {
dp[i][j] = dp[i][j-1]
}
}
}
}
lcsLen := dp[m][n]
return (m - lcsLen) + (n - lcsLen)
}
The Go solution mirrors the Python implementation. Differences include explicit allocation of a 2D slice, and handling of the max operation manually because Go lacks a built-in max for integers.
Worked Examples
Example 1: word1 = "sea", word2 = "eat"
| i | j | dp[i][j] value | Explanation |
|---|---|---|---|
| 1 | 1 | 0 | 's' != 'e' |
| 1 | 2 | 0 | 's' != 'a' |
| 1 | 3 | 0 | 's' != 't' |
| 2 | 1 | 0 | 'e' == 'e', extend LCS |
| 2 | 2 | 1 | 'e' != 'a' |
| 3 | 2 | 1 | 'a' == 'a', LCS becomes 2 |
Final dp[3][3] = 2, so deletions = (3-2) + (3-2) = 2.
Example 2: word1 = "leetcode", word2 = "etco"
After DP computation, LCS = "etco" length 4. Deletions = (8-4) + (4-4) = 4.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(m*n) | Two nested loops over the lengths of word1 and word2 |
| Space | O(m*n) | Storing the DP table of size (m+1) x (n+1) |
The time complexity is dominated by filling the DP table. Space complexity is also quadratic, but this can be optimized to O(min(m,n)) using a single row with rolling updates if needed.
Test Cases
# provided examples
assert Solution().minDistance("sea", "eat") == 2 # basic example
assert Solution().minDistance("leetcode", "etco") == 4 # larger string
# edge cases
assert Solution().minDistance("a", "a") == 0 # identical single letters
assert Solution().minDistance("a", "b") == 2 # completely different single letters
assert Solution().minDistance("", "abc") == 3 # one empty string
assert Solution().minDistance("abc", "") == 3 # other empty string
assert Solution().minDistance("", "") == 0 # both empty
# stress cases
assert Solution().minDistance("a"*500, "a"*500) == 0 # max length, identical
assert Solution().minDistance("a"*500, "b"*500) == 1000 # max length, no common characters
| Test | Why |
|---|---|
| "sea", "eat" | Standard example, checks basic DP calculation |
| "leetcode", "etco" | Larger input, multiple deletions needed |
| "a", "a" | Single-character match |
| "a", "b" | Single-character mismatch |
| "", "abc" | One string empty |
| "", "" | Both strings empty |
| " |