LeetCode 712 - Minimum ASCII Delete Sum for Two Strings

Difficulty: 🟡 Medium
Topics: String, Dynamic Programming

Solution

Problem Understanding

The problem gives us two strings, s1 and s2, and asks us to make them equal by deleting characters from either string. Every deleted character contributes its ASCII value to the total cost. Our goal is to minimize this total deletion cost.

For example, if we delete the character 's', we add 115 to the total because the ASCII value of 's' is 115. If we delete 't', we add 116.

The important detail is that we are only allowed to delete characters. We cannot insert, replace, or rearrange characters. After performing deletions, the remaining characters in both strings must form exactly the same sequence.

This is fundamentally an optimization problem over two strings. We need to determine which characters should be kept and which should be removed so that the resulting strings match while minimizing the sum of deleted ASCII values.

The input constraints are important:

1 <= s1.length, s2.length <= 1000
Strings contain only lowercase English letters

Since each string can have length up to 1000, any exponential solution is infeasible. A quadratic dynamic programming solution is acceptable because 1000 * 1000 = 1,000,000, which is manageable.

Several edge cases are important:

The two strings may already be equal, meaning the answer is 0.
The strings may share no common characters at all, forcing us to delete every character from both strings.
One string may be much longer than the other.
Multiple valid common subsequences may exist, and we must choose the one that minimizes deletion cost rather than simply maximizing length.

The problem is closely related to the Longest Common Subsequence problem, but instead of maximizing character count, we minimize deletion cost measured using ASCII values.

Approaches

Brute Force Approach

A brute force solution would try every possible way of deleting characters from both strings and check whether the resulting strings are equal.

One way to think about this is:

Generate every subsequence of s1
Generate every subsequence of s2
Compare all pairs of subsequences
For every matching pair, compute the deletion cost
Return the minimum cost

This works because every possible final equal string is considered.

However, each string of length n has 2^n subsequences. With lengths up to 1000, this becomes completely impossible. Even for strings of length 20, brute force would already be enormous.

The brute force approach is correct because it exhaustively explores every valid deletion combination, but its exponential time complexity makes it unusable.

Key Insight

The key observation is that at every position (i, j) in the two strings, we only need to answer one question:

What is the minimum deletion cost needed to make the suffixes s1[i:] and s2[j:] equal?

This naturally leads to dynamic programming.

For two characters:

If s1[i] == s2[j], we should keep both characters because deleting matching characters would only increase the cost unnecessarily.
Otherwise, we have two choices:
Delete s1[i]
Delete s2[j]

We recursively choose the cheaper option.

This overlapping subproblem structure is exactly what dynamic programming is designed for.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(2^(m+n))	O(2^(m+n))	Tries every subsequence combination
Optimal Dynamic Programming	O(m * n)	O(m * n)	Stores minimum deletion cost for suffix pairs

Algorithm Walkthrough

Dynamic Programming Definition

We define:

dp[i][j]

as the minimum ASCII deletion sum needed to make:

s1[i:]
s2[j:]

equal.

The answer we want is:

dp[0][0]

Step by Step Algorithm

Create a DP table of size (m + 1) x (n + 1).

Here:

m = len(s1)
n = len(s2)

The extra row and column handle empty suffixes cleanly. 2. Initialize the last column.

If s2 is exhausted, then all remaining characters in s1 must be deleted.

So:

dp[i][n] = ASCII sum of s1[i:] 3. Initialize the last row.

If s1 is exhausted, then all remaining characters in s2 must be deleted.

So:

dp[m][j] = ASCII sum of s2[j:] 4. Fill the table from bottom-right toward top-left.

We process backward because each state depends on future states. 5. For each position (i, j):

If the characters match:

s1[i] == s2[j]

then no deletion is needed for these characters.

We reuse the answer for the next suffixes:

dp[i][j] = dp[i+1][j+1] 6. Otherwise, we consider two choices:

Delete s1[i]
Delete s2[j]

The costs are:

ord(s1[i]) + dp[i+1][j]
ord(s2[j]) + dp[i][j+1]

We choose the minimum. 7. Return dp[0][0].

Why it works

The DP state always represents the optimal minimum deletion cost for the remaining suffixes. At every mismatch, the algorithm explores the only two valid operations, deleting from the first string or deleting from the second string. Since every state uses already-computed optimal subproblem answers, the final result is globally optimal by dynamic programming optimal substructure.

Python Solution

class Solution:
    def minimumDeleteSum(self, s1: str, s2: str) -> int:
        m = len(s1)
        n = len(s2)

        dp = [[0] * (n + 1) for _ in range(m + 1)]

        # Fill last column
        for i in range(m - 1, -1, -1):
            dp[i][n] = dp[i + 1][n] + ord(s1[i])

        # Fill last row
        for j in range(n - 1, -1, -1):
            dp[m][j] = dp[m][j + 1] + ord(s2[j])

        # Fill DP table
        for i in range(m - 1, -1, -1):
            for j in range(n - 1, -1, -1):
                if s1[i] == s2[j]:
                    dp[i][j] = dp[i + 1][j + 1]
                else:
                    delete_s1 = ord(s1[i]) + dp[i + 1][j]
                    delete_s2 = ord(s2[j]) + dp[i][j + 1]

                    dp[i][j] = min(delete_s1, delete_s2)

        return dp[0][0]

The implementation begins by determining the lengths of both strings and allocating the DP table.

The last column initialization handles the situation where s2 is already empty. In that case, every remaining character in s1 must be deleted, so we accumulate ASCII values backward.

The last row initialization does the symmetric operation for s2.

The nested loops then process states in reverse order so that all required future states are already computed before the current state is evaluated.

When characters match, we keep them because deleting matching characters can never improve the answer. When characters differ, we compute the cost of deleting from either string and choose the smaller value.

Finally, dp[0][0] contains the minimum deletion sum for the full strings.

Go Solution

func minimumDeleteSum(s1 string, s2 string) int {
    m := len(s1)
    n := len(s2)

    dp := make([][]int, m+1)

    for i := range dp {
        dp[i] = make([]int, n+1)
    }

    // Fill last column
    for i := m - 1; i >= 0; i-- {
        dp[i][n] = dp[i+1][n] + int(s1[i])
    }

    // Fill last row
    for j := n - 1; j >= 0; j-- {
        dp[m][j] = dp[m][j+1] + int(s2[j])
    }

    // Fill DP table
    for i := m - 1; i >= 0; i-- {
        for j := n - 1; j >= 0; j-- {
            if s1[i] == s2[j] {
                dp[i][j] = dp[i+1][j+1]
            } else {
                deleteS1 := int(s1[i]) + dp[i+1][j]
                deleteS2 := int(s2[j]) + dp[i][j+1]

                if deleteS1 < deleteS2 {
                    dp[i][j] = deleteS1
                } else {
                    dp[i][j] = deleteS2
                }
            }
        }
    }

    return dp[0][0]
}

The Go implementation follows the exact same DP logic as the Python solution.

One important language-specific detail is that Go strings are byte slices for ASCII characters, so s1[i] directly gives the byte value. Converting it with int() gives the ASCII value.

The DP table is created using nested slices because Go does not support dynamically sized multidimensional arrays directly.

Since ASCII values and string lengths are small, integer overflow is not a concern.

Worked Examples

Example 1

Input:

s1 = "sea"
s2 = "eat"

DP Initialization

We first compute base cases.

ASCII values:

Character	ASCII
s	115
e	101
a	97
t	116

Last Column

i	Remaining String	Cost
3	""	0
2	"a"	97
1	"ea"	198
0	"sea"	313

Last Row

j	Remaining String	Cost
3	""	0
2	"t"	116
1	"at"	213
0	"eat"	314

Filling the Table

We process from bottom-right upward.

State `(2, 2)`

Compare:

s1[2] = 'a'
s2[2] = 't'

Not equal.

Delete 'a': 97 + 116 = 213
Delete 't': 116 + 97 = 213

So:

dp[2][2] = 213

State `(2, 1)`

'a' == 'a'

So:

dp[2][1] = dp[3][2] = 116

State `(1, 0)`

'e' == 'e'

So:

dp[1][0] = dp[2][1] = 116

Final State `(0, 0)`

's' != 'e'

Choices:

Delete 's': 115 + 116 = 231
Delete 'e': 101 + 332 = 433

Minimum:

Answer:

Example 2

Input:

s1 = "delete"
s2 = "leet"

The optimal common string is:

"let"

Deleted characters:

From "delete":

d + e + e = 100 + 101 + 101 = 302

From "leet":

e = 101

Total:

The DP table systematically evaluates every suffix pair and discovers that keeping "let" minimizes total ASCII deletion cost.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(m * n)	Every DP state is computed once
Space	O(m * n)	DP table stores all suffix states

The algorithm processes every pair of indices (i, j) exactly once. Since there are m * n such states, the runtime is quadratic.

The space complexity is also quadratic because we store a full DP table. This can be optimized to O(n) using rolling arrays, but the full table version is easier to understand and debug.

Test Cases

sol = Solution()

assert sol.minimumDeleteSum("sea", "eat") == 231
# Basic example from the problem statement

assert sol.minimumDeleteSum("delete", "leet") == 403
# Larger example with multiple valid subsequences

assert sol.minimumDeleteSum("a", "a") == 0
# Identical single-character strings

assert sol.minimumDeleteSum("a", "b") == 195
# No common characters, delete both

assert sol.minimumDeleteSum("abc", "abc") == 0
# Already equal strings

assert sol.minimumDeleteSum("abc", "def") == (
    ord('a') + ord('b') + ord('c') +
    ord('d') + ord('e') + ord('f')
)
# Completely disjoint strings

assert sol.minimumDeleteSum("ab", "a") == ord('b')
# One extra character in first string

assert sol.minimumDeleteSum("a", "ab") == ord('b')
# One extra character in second string

assert sol.minimumDeleteSum("xxyyzz", "xxyz") == (
    ord('y') + ord('z')
)
# Repeated characters

assert sol.minimumDeleteSum("aaaa", "aa") == 2 * ord('a')
# Multiple identical deletions required

Test	Why
`"sea", "eat"`	Validates standard example
`"delete", "leet"`	Validates complex DP behavior
`"a", "a"`	Tests already equal strings
`"a", "b"`	Tests no matching characters
`"abc", "abc"`	Ensures zero deletion cost
`"abc", "def"`	Tests total deletion of both strings
`"ab", "a"`	Tests deleting from first string only
`"a", "ab"`	Tests deleting from second string only
`"xxyyzz", "xxyz"`	Tests repeated characters
`"aaaa", "aa"`	Tests duplicate matching decisions

Edge Cases

One important edge case occurs when the two strings are already identical. In this situation, the correct answer is 0 because no deletions are necessary. A buggy implementation might still attempt unnecessary deletions if it does not correctly handle matching characters. The DP transition dp[i][j] = dp[i+1][j+1] ensures matching characters are preserved at zero additional cost.

Another important case occurs when the strings share no common characters at all. For example, "abc" and "def" have no overlap, so every character from both strings must be deleted. Some incorrect implementations accidentally assume at least one common subsequence exists. This implementation correctly handles the situation because every mismatch evaluates both deletion choices until all characters are removed.

Repeated characters are another subtle edge case. For example, "aaaa" and "aa" require deleting exactly two 'a' characters. Greedy approaches can fail here because choosing the wrong matching positions may produce unnecessary deletions later. Dynamic programming avoids this problem by evaluating all suffix combinations systematically and always selecting the globally optimal cost.

A final edge case involves strings of very different lengths. For example, "a" and "abcdefghijklmnopqrstuvwxyz" require deleting almost the entire longer string. The base row and base column initialization correctly handle these situations by accumulating remaining ASCII values when one string becomes exhausted.