LeetCode 72 - Edit Distance

Difficulty: 🟡 Medium
Topics: String, Dynamic Programming

Solution

Problem Understanding

The problem asks us to compute the minimum number of edit operations needed to transform one string into another. The allowed operations are insertion, deletion, and replacement of a single character. Each operation has a cost of exactly one.

We are given two input strings, word1 and word2. The goal is to determine the smallest possible number of operations required to convert word1 into word2.

For example, if word1 = "horse" and word2 = "ros", one optimal sequence is:

Replace 'h' with 'r'
Delete one 'r'
Delete 'e'

This takes three operations, so the answer is 3.

The constraints tell us that each string can have length up to 500. This is large enough that an exponential brute-force search would be far too slow. A solution with quadratic time complexity, such as O(m * n) where m and n are the lengths of the two strings, is acceptable because 500 * 500 = 250000, which is manageable.

Several edge cases are important:

One or both strings may be empty.
The strings may already be identical.
Every character may need replacement.
One string may be much longer than the other.
Multiple edit sequences may exist, but we only care about the minimum number of operations.

The problem guarantees that the strings contain only lowercase English letters, so we do not need to handle Unicode or special character behavior.

Approaches

Brute Force Approach

A brute-force solution would try every possible sequence of insertions, deletions, and replacements until the first string becomes the second string. At each position, we could branch into several possibilities:

Insert a character
Delete a character
Replace a character
Keep the character if it already matches

This recursive search eventually explores all valid edit sequences and returns the minimum cost.

The brute-force approach is correct because it considers every possible transformation path. However, it is extremely inefficient. The same subproblems are recomputed repeatedly. For example, converting "orse" to "ros" might be solved many times from different recursive branches.

The number of recursive states grows exponentially, making this approach infeasible for strings of length up to 500.

Optimal Dynamic Programming Approach

The key insight is that the problem has overlapping subproblems and optimal substructure.

Suppose we want to compute the minimum edits needed to convert:

word1[0:i]

into:

word2[0:j]

If the last characters match, then no operation is needed for those characters, and the answer depends entirely on the smaller prefixes.

If the last characters differ, then we must perform one of three operations:

Insert
Delete
Replace

Each operation reduces the problem into a smaller subproblem.

This structure makes dynamic programming a natural fit. We build a table where each entry stores the minimum edit distance between prefixes of the two strings.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(3^(m+n))	O(m+n)	Explores all possible edit sequences recursively
Optimal Dynamic Programming	O(m * n)	O(m * n)	Stores results for all prefix pairs

Algorithm Walkthrough

Dynamic Programming Definition

Let:

dp[i][j]

represent the minimum number of operations required to convert:

word1[:i]

into:

word2[:j]

The table has dimensions:

(len(word1) + 1) x (len(word2) + 1)

The extra row and column handle empty prefixes.

Step-by-Step Algorithm

Initialize the DP table.

Create a 2D table filled with zeros. The table size is (m + 1) x (n + 1) where m and n are the lengths of the two strings. 2. Fill the first column.

Converting a non-empty prefix of word1 into an empty string requires deleting every character.

Therefore:

dp[i][0] = i

Fill the first row.

Converting an empty string into a non-empty prefix of word2 requires inserting every character.

Therefore:

dp[0][j] = j

Process each pair of characters.

Iterate through every position (i, j).

Compare:

word1[i - 1]
word2[j - 1]

We subtract one because the DP table includes an extra row and column for empty prefixes. 5. Handle matching characters.

If the characters are equal, no new operation is required.

Therefore:

dp[i][j] = dp[i - 1][j - 1]

Handle different characters.

If the characters differ, consider all three operations.

Insert:

dp[i][j - 1] + 1

Delete:

dp[i - 1][j] + 1

Replace:

dp[i - 1][j - 1] + 1

Take the minimum of the three. 7. Return the final answer.

The bottom-right cell contains the minimum edit distance for the full strings:

dp[m][n]

Why it works

The algorithm works because every valid transformation must end in exactly one of four situations:

The final characters already match
The final operation was an insertion
The final operation was a deletion
The final operation was a replacement

Each case reduces the problem into a smaller subproblem whose optimal answer has already been computed. Since the DP table stores optimal answers for all prefixes, every larger state is built from correct smaller states. This guarantees that the final result is optimal.

Python Solution

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        m = len(word1)
        n = len(word2)

        dp = [[0] * (n + 1) for _ in range(m + 1)]

        for i in range(m + 1):
            dp[i][0] = i

        for j in range(n + 1):
            dp[0][j] = j

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if word1[i - 1] == word2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1]
                else:
                    insert_cost = dp[i][j - 1]
                    delete_cost = dp[i - 1][j]
                    replace_cost = dp[i - 1][j - 1]

                    dp[i][j] = 1 + min(
                        insert_cost,
                        delete_cost,
                        replace_cost
                    )

        return dp[m][n]

The implementation starts by computing the lengths of both strings. These lengths determine the dimensions of the DP table.

The table is initialized with zeros, then the first row and first column are filled to represent transformations involving empty strings. This establishes the base cases.

The nested loops process every pair of prefixes. At each cell, the algorithm compares the current characters from both strings.

If the characters match, the current state inherits the answer from the diagonal cell because no edit is required.

If the characters differ, the algorithm computes the cost of insertion, deletion, and replacement. Each operation corresponds to moving from a neighboring DP state. The minimum of these possibilities is selected and incremented by one for the current operation.

Finally, the bottom-right cell contains the minimum edit distance between the full strings.

Go Solution

func minDistance(word1 string, word2 string) int {
	m := len(word1)
	n := len(word2)

	dp := make([][]int, m+1)

	for i := 0; i <= m; i++ {
		dp[i] = make([]int, n+1)
	}

	for i := 0; i <= m; i++ {
		dp[i][0] = i
	}

	for j := 0; j <= n; j++ {
		dp[0][j] = j
	}

	for i := 1; i <= m; i++ {
		for j := 1; j <= n; j++ {
			if word1[i-1] == word2[j-1] {
				dp[i][j] = dp[i-1][j-1]
			} else {
				insertCost := dp[i][j-1]
				deleteCost := dp[i-1][j]
				replaceCost := dp[i-1][j-1]

				dp[i][j] = 1 + min(
					insertCost,
					deleteCost,
					replaceCost,
				)
			}
		}
	}

	return dp[m][n]
}

func min(a, b, c int) int {
	if a > b {
		a = b
	}

	if a > c {
		a = c
	}

	return a
}

The Go implementation closely mirrors the Python solution. The DP table is represented as a slice of slices.

Unlike Python, Go does not provide a built-in min function for multiple integers, so a helper function is implemented.

Strings in Go are byte slices internally. Since the problem guarantees lowercase English letters, indexing by byte is safe and correct.

Go distinguishes between nil slices and empty slices, but this problem does not require special handling because the DP table is explicitly allocated.

Worked Examples

Example 1

Input:

word1 = "horse"
word2 = "ros"

DP table dimensions:

6 x 4

Initial table:

	""	r	o	s
""	0	1	2	3
h	1
o	2
r	3
s	4
e	5

Process character comparisons step by step.

Row for `'h'`

Compare	Result
h vs r	replace, dp[1][1] = 1
h vs o	dp[1][2] = 2
h vs s	dp[1][3] = 3

Row for `'o'`

Compare	Result
o vs r	dp[2][1] = 2
o vs o	match, dp[2][2] = 1
o vs s	dp[2][3] = 2

Row for `'r'`

Compare	Result
r vs r	match, dp[3][1] = 2
r vs o	dp[3][2] = 2
r vs s	dp[3][3] = 2

Row for `'s'`

Compare	Result
s vs r	dp[4][1] = 3
s vs o	dp[4][2] = 3
s vs s	match, dp[4][3] = 2

Row for `'e'`

Compare	Result
e vs r	dp[5][1] = 4
e vs o	dp[5][2] = 4
e vs s	dp[5][3] = 3

Final answer:

dp[5][3] = 3

Example 2

Input:

word1 = "intention"
word2 = "execution"

The DP table gradually computes the minimum operations for all prefix pairs.

Key observations during execution:

Several replacements occur early because the prefixes differ substantially.
Later portions of the strings share common suffixes like "tion".
Matching suffix characters reduce additional operation costs.

Final result:

dp[9][9] = 5

One optimal sequence:

intention
inention
enention
exention
exection
execution

Total operations:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(m * n)	Every DP cell is computed once
Space	O(m * n)	The DP table stores all prefix states

The algorithm iterates through all pairs of indices from the two strings. Since there are m * n states and each state is computed in constant time, the total runtime is O(m * n).

The DP table also stores m * n entries, resulting in O(m * n) space complexity.

Test Cases

solution = Solution()

assert solution.minDistance("horse", "ros") == 3  # provided example
assert solution.minDistance("intention", "execution") == 5  # provided example

assert solution.minDistance("", "") == 0  # both strings empty
assert solution.minDistance("abc", "") == 3  # delete all characters
assert solution.minDistance("", "abc") == 3  # insert all characters

assert solution.minDistance("abc", "abc") == 0  # identical strings
assert solution.minDistance("a", "b") == 1  # single replacement

assert solution.minDistance("kitten", "sitting") == 3  # classic edit distance case
assert solution.minDistance("flaw", "lawn") == 2  # mix of delete and insert

assert solution.minDistance("abcdef", "azced") == 3  # mixed operations
assert solution.minDistance("aaaa", "bbbb") == 4  # all replacements

assert solution.minDistance("a", "") == 1  # smallest non-empty deletion
assert solution.minDistance("", "z") == 1  # smallest non-empty insertion

assert solution.minDistance("distance", "difference") == 5  # larger mixed case

Test	Why
`"horse"` → `"ros"`	Verifies standard mixed operations
`"intention"` → `"execution"`	Tests larger dynamic programming transitions
`""` → `""`	Validates empty string handling
`"abc"` → `""`	Ensures deletion base case works
`""` → `"abc"`	Ensures insertion base case works
`"abc"` → `"abc"`	Confirms identical strings require zero operations
`"a"` → `"b"`	Tests single replacement
`"kitten"` → `"sitting"`	Classic benchmark example
`"flaw"` → `"lawn"`	Tests combined insert and delete
`"abcdef"` → `"azced"`	Verifies mixed edit choices
`"aaaa"` → `"bbbb"`	Ensures repeated replacements work
`"a"` → `""`	Smallest deletion case
`""` → `"z"`	Smallest insertion case
`"distance"` → `"difference"`	Larger realistic transformation

Edge Cases

Empty Strings

One or both input strings may be empty. This is a common source of bugs because many implementations incorrectly assume both strings contain at least one character.

If word1 is empty, the only possible way to create word2 is through insertions. Similarly, if word2 is empty, the only possible operation is deletion.

The implementation handles this correctly through the initialization of the first row and first column:

dp[i][0] = i
dp[0][j] = j

These base cases ensure empty-prefix transformations are computed correctly.

Identical Strings

If both strings are exactly the same, the answer should be zero.

A buggy implementation might still perform unnecessary operations if character matching is not handled correctly.

The algorithm avoids this problem with the condition:

if word1[i - 1] == word2[j - 1]

When characters match, the DP state directly inherits the diagonal value without adding any operation cost.

Completely Different Strings

Consider inputs like:

word1 = "aaaa"
word2 = "bbbb"

Every character must be replaced.

This case stresses whether the algorithm properly evaluates replacement operations rather than performing inefficient insert-delete combinations.

The recurrence relation explicitly includes replacement:

dp[i - 1][j - 1] + 1

Taking the minimum among insert, delete, and replace guarantees the optimal choice is selected.

LeetCode 72 - Edit Distance

Solution

Problem Understanding

Approaches

Brute Force Approach

Optimal Dynamic Programming Approach

Algorithm Walkthrough

Dynamic Programming Definition

Step-by-Step Algorithm

Why it works

Python Solution

Go Solution

Worked Examples

Example 1

Row for 'h'

Row for 'o'

Row for 'r'

Row for 's'

Row for 'e'

Example 2

Complexity Analysis

Test Cases

Edge Cases

Empty Strings

Identical Strings

Completely Different Strings

Row for `'h'`

Row for `'o'`

Row for `'r'`

Row for `'s'`

Row for `'e'`