LeetCode 712 - Minimum ASCII Delete Sum for Two Strings
The problem gives us two strings, s1 and s2, and asks us to make them equal by deleting characters from either string. Every deleted character contributes its ASCII value to the total cost. Our goal is to minimize this total deletion cost.
Difficulty: 🟡 Medium
Topics: String, Dynamic Programming
Solution
Problem Understanding
The problem gives us two strings, s1 and s2, and asks us to make them equal by deleting characters from either string. Every deleted character contributes its ASCII value to the total cost. Our goal is to minimize this total deletion cost.
For example, if we delete the character 's', we add 115 to the total because the ASCII value of 's' is 115. If we delete 't', we add 116.
The important detail is that we are only allowed to delete characters. We cannot insert, replace, or rearrange characters. After performing deletions, the remaining characters in both strings must form exactly the same sequence.
This is fundamentally an optimization problem over two strings. We need to determine which characters should be kept and which should be removed so that the resulting strings match while minimizing the sum of deleted ASCII values.
The input constraints are important:
1 <= s1.length, s2.length <= 1000- Strings contain only lowercase English letters
Since each string can have length up to 1000, any exponential solution is infeasible. A quadratic dynamic programming solution is acceptable because 1000 * 1000 = 1,000,000, which is manageable.
Several edge cases are important:
- The two strings may already be equal, meaning the answer is
0. - The strings may share no common characters at all, forcing us to delete every character from both strings.
- One string may be much longer than the other.
- Multiple valid common subsequences may exist, and we must choose the one that minimizes deletion cost rather than simply maximizing length.
The problem is closely related to the Longest Common Subsequence problem, but instead of maximizing character count, we minimize deletion cost measured using ASCII values.
Approaches
Brute Force Approach
A brute force solution would try every possible way of deleting characters from both strings and check whether the resulting strings are equal.
One way to think about this is:
- Generate every subsequence of
s1 - Generate every subsequence of
s2 - Compare all pairs of subsequences
- For every matching pair, compute the deletion cost
- Return the minimum cost
This works because every possible final equal string is considered.
However, each string of length n has 2^n subsequences. With lengths up to 1000, this becomes completely impossible. Even for strings of length 20, brute force would already be enormous.
The brute force approach is correct because it exhaustively explores every valid deletion combination, but its exponential time complexity makes it unusable.
Key Insight
The key observation is that at every position (i, j) in the two strings, we only need to answer one question:
What is the minimum deletion cost needed to make the suffixes s1[i:] and s2[j:] equal?
This naturally leads to dynamic programming.
For two characters:
-
If
s1[i] == s2[j], we should keep both characters because deleting matching characters would only increase the cost unnecessarily. -
Otherwise, we have two choices:
-
Delete
s1[i] -
Delete
s2[j]
We recursively choose the cheaper option.
This overlapping subproblem structure is exactly what dynamic programming is designed for.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^(m+n)) | O(2^(m+n)) | Tries every subsequence combination |
| Optimal Dynamic Programming | O(m * n) | O(m * n) | Stores minimum deletion cost for suffix pairs |
Algorithm Walkthrough
Dynamic Programming Definition
We define:
dp[i][j]
as the minimum ASCII deletion sum needed to make:
s1[i:]s2[j:]
equal.
The answer we want is:
dp[0][0]
Step by Step Algorithm
- Create a DP table of size
(m + 1) x (n + 1).
Here:
m = len(s1)n = len(s2)
The extra row and column handle empty suffixes cleanly. 2. Initialize the last column.
If s2 is exhausted, then all remaining characters in s1 must be deleted.
So:
dp[i][n] = ASCII sum of s1[i:]
3. Initialize the last row.
If s1 is exhausted, then all remaining characters in s2 must be deleted.
So:
dp[m][j] = ASCII sum of s2[j:]
4. Fill the table from bottom-right toward top-left.
We process backward because each state depends on future states.
5. For each position (i, j):
If the characters match:
s1[i] == s2[j]
then no deletion is needed for these characters.
We reuse the answer for the next suffixes:
dp[i][j] = dp[i+1][j+1]
6. Otherwise, we consider two choices:
- Delete
s1[i] - Delete
s2[j]
The costs are:
ord(s1[i]) + dp[i+1][j]ord(s2[j]) + dp[i][j+1]
We choose the minimum.
7. Return dp[0][0].
Why it works
The DP state always represents the optimal minimum deletion cost for the remaining suffixes. At every mismatch, the algorithm explores the only two valid operations, deleting from the first string or deleting from the second string. Since every state uses already-computed optimal subproblem answers, the final result is globally optimal by dynamic programming optimal substructure.
Python Solution
class Solution:
def minimumDeleteSum(self, s1: str, s2: str) -> int:
m = len(s1)
n = len(s2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
# Fill last column
for i in range(m - 1, -1, -1):
dp[i][n] = dp[i + 1][n] + ord(s1[i])
# Fill last row
for j in range(n - 1, -1, -1):
dp[m][j] = dp[m][j + 1] + ord(s2[j])
# Fill DP table
for i in range(m - 1, -1, -1):
for j in range(n - 1, -1, -1):
if s1[i] == s2[j]:
dp[i][j] = dp[i + 1][j + 1]
else:
delete_s1 = ord(s1[i]) + dp[i + 1][j]
delete_s2 = ord(s2[j]) + dp[i][j + 1]
dp[i][j] = min(delete_s1, delete_s2)
return dp[0][0]
The implementation begins by determining the lengths of both strings and allocating the DP table.
The last column initialization handles the situation where s2 is already empty. In that case, every remaining character in s1 must be deleted, so we accumulate ASCII values backward.
The last row initialization does the symmetric operation for s2.
The nested loops then process states in reverse order so that all required future states are already computed before the current state is evaluated.
When characters match, we keep them because deleting matching characters can never improve the answer. When characters differ, we compute the cost of deleting from either string and choose the smaller value.
Finally, dp[0][0] contains the minimum deletion sum for the full strings.
Go Solution
func minimumDeleteSum(s1 string, s2 string) int {
m := len(s1)
n := len(s2)
dp := make([][]int, m+1)
for i := range dp {
dp[i] = make([]int, n+1)
}
// Fill last column
for i := m - 1; i >= 0; i-- {
dp[i][n] = dp[i+1][n] + int(s1[i])
}
// Fill last row
for j := n - 1; j >= 0; j-- {
dp[m][j] = dp[m][j+1] + int(s2[j])
}
// Fill DP table
for i := m - 1; i >= 0; i-- {
for j := n - 1; j >= 0; j-- {
if s1[i] == s2[j] {
dp[i][j] = dp[i+1][j+1]
} else {
deleteS1 := int(s1[i]) + dp[i+1][j]
deleteS2 := int(s2[j]) + dp[i][j+1]
if deleteS1 < deleteS2 {
dp[i][j] = deleteS1
} else {
dp[i][j] = deleteS2
}
}
}
}
return dp[0][0]
}
The Go implementation follows the exact same DP logic as the Python solution.
One important language-specific detail is that Go strings are byte slices for ASCII characters, so s1[i] directly gives the byte value. Converting it with int() gives the ASCII value.
The DP table is created using nested slices because Go does not support dynamically sized multidimensional arrays directly.
Since ASCII values and string lengths are small, integer overflow is not a concern.
Worked Examples
Example 1
Input:
s1 = "sea"
s2 = "eat"
DP Initialization
We first compute base cases.
ASCII values:
| Character | ASCII |
|---|---|
| s | 115 |
| e | 101 |
| a | 97 |
| t | 116 |
Last Column
| i | Remaining String | Cost |
|---|---|---|
| 3 | "" | 0 |
| 2 | "a" | 97 |
| 1 | "ea" | 198 |
| 0 | "sea" | 313 |
Last Row
| j | Remaining String | Cost |
|---|---|---|
| 3 | "" | 0 |
| 2 | "t" | 116 |
| 1 | "at" | 213 |
| 0 | "eat" | 314 |
Filling the Table
We process from bottom-right upward.
State (2, 2)
Compare:
s1[2] = 'a'
s2[2] = 't'
Not equal.
Delete 'a': 97 + 116 = 213
Delete 't': 116 + 97 = 213
So:
dp[2][2] = 213
State (2, 1)
'a' == 'a'
So:
dp[2][1] = dp[3][2] = 116
State (1, 0)
'e' == 'e'
So:
dp[1][0] = dp[2][1] = 116
Final State (0, 0)
's' != 'e'
Choices:
Delete 's': 115 + 116 = 231
Delete 'e': 101 + 332 = 433
Minimum:
231
Answer:
231
Example 2
Input:
s1 = "delete"
s2 = "leet"
The optimal common string is:
"let"
Deleted characters:
From "delete":
d + e + e = 100 + 101 + 101 = 302
From "leet":
e = 101
Total:
403
The DP table systematically evaluates every suffix pair and discovers that keeping "let" minimizes total ASCII deletion cost.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(m * n) | Every DP state is computed once |
| Space | O(m * n) | DP table stores all suffix states |
The algorithm processes every pair of indices (i, j) exactly once. Since there are m * n such states, the runtime is quadratic.
The space complexity is also quadratic because we store a full DP table. This can be optimized to O(n) using rolling arrays, but the full table version is easier to understand and debug.
Test Cases
sol = Solution()
assert sol.minimumDeleteSum("sea", "eat") == 231
# Basic example from the problem statement
assert sol.minimumDeleteSum("delete", "leet") == 403
# Larger example with multiple valid subsequences
assert sol.minimumDeleteSum("a", "a") == 0
# Identical single-character strings
assert sol.minimumDeleteSum("a", "b") == 195
# No common characters, delete both
assert sol.minimumDeleteSum("abc", "abc") == 0
# Already equal strings
assert sol.minimumDeleteSum("abc", "def") == (
ord('a') + ord('b') + ord('c') +
ord('d') + ord('e') + ord('f')
)
# Completely disjoint strings
assert sol.minimumDeleteSum("ab", "a") == ord('b')
# One extra character in first string
assert sol.minimumDeleteSum("a", "ab") == ord('b')
# One extra character in second string
assert sol.minimumDeleteSum("xxyyzz", "xxyz") == (
ord('y') + ord('z')
)
# Repeated characters
assert sol.minimumDeleteSum("aaaa", "aa") == 2 * ord('a')
# Multiple identical deletions required
| Test | Why |
|---|---|
"sea", "eat" |
Validates standard example |
"delete", "leet" |
Validates complex DP behavior |
"a", "a" |
Tests already equal strings |
"a", "b" |
Tests no matching characters |
"abc", "abc" |
Ensures zero deletion cost |
"abc", "def" |
Tests total deletion of both strings |
"ab", "a" |
Tests deleting from first string only |
"a", "ab" |
Tests deleting from second string only |
"xxyyzz", "xxyz" |
Tests repeated characters |
"aaaa", "aa" |
Tests duplicate matching decisions |
Edge Cases
One important edge case occurs when the two strings are already identical. In this situation, the correct answer is 0 because no deletions are necessary. A buggy implementation might still attempt unnecessary deletions if it does not correctly handle matching characters. The DP transition dp[i][j] = dp[i+1][j+1] ensures matching characters are preserved at zero additional cost.
Another important case occurs when the strings share no common characters at all. For example, "abc" and "def" have no overlap, so every character from both strings must be deleted. Some incorrect implementations accidentally assume at least one common subsequence exists. This implementation correctly handles the situation because every mismatch evaluates both deletion choices until all characters are removed.
Repeated characters are another subtle edge case. For example, "aaaa" and "aa" require deleting exactly two 'a' characters. Greedy approaches can fail here because choosing the wrong matching positions may produce unnecessary deletions later. Dynamic programming avoids this problem by evaluating all suffix combinations systematically and always selecting the globally optimal cost.
A final edge case involves strings of very different lengths. For example, "a" and "abcdefghijklmnopqrstuvwxyz" require deleting almost the entire longer string. The base row and base column initialization correctly handle these situations by accumulating remaining ASCII values when one string becomes exhausted.