LeetCode 2060 - Check if an Original String Exists Given Two Encoded Strings

LeetCode Problem 2060

Difficulty: 🔴 Hard
Topics: String, Dynamic Programming

Solution

Problem Understanding

The problem is asking us to determine whether there exists a single original string that could have been encoded into two different given strings, s1 and s2. Each encoded string may contain letters and digits. The digits in an encoded string represent the length of some substring(s) in the original string, while letters represent literal characters from the original string.

The input consists of two strings s1 and s2 of length at most 40. Each string contains lowercase letters and digits 1-9. Consecutive digits are limited to at most 3, meaning numbers representing lengths range from 1 to 999. The expected output is a boolean: true if there exists an original string that can be encoded into both s1 and s2, and false otherwise.

Important points and edge cases:

Digits in the encoded string may represent any contiguous substring length from the original string.
Letters must match exactly in the original string at their corresponding positions.
Consecutive digits up to 3 mean we may need to interpret multi-digit numbers (e.g., "12" as length 12).
A naive solution generating all possible decoded strings would explode combinatorially.

Edge cases to consider:

Encoded strings consisting only of letters should match exactly.
Encoded strings with only numbers could represent a wide variety of original strings.
Leading characters mismatch (e.g., s1 = "a5b", s2 = "c5b") should return false.

Approaches

Brute Force: One could try to generate all possible original strings from s1 and s2 by expanding each numeric segment into all possible substrings of that length, then check for intersection. While correct, the combinatorial explosion makes this approach infeasible, since a string of length 40 can produce an astronomical number of possible original strings, especially when numbers in the encoded strings can represent multiple substring lengths.

Optimal Approach: The key insight is to treat the problem as a state-space search with memoization, using dynamic programming. We do not need to explicitly generate all original strings. Instead, we keep track of positions in s1 and s2 along with a difference counter that represents how many extra characters are left to match due to digits in one string versus the other. This reduces the problem to exploring a 3D state space: dfs(i, j, diff), where i is the position in s1, j is the position in s2, and diff is the length difference between pending numeric segments. Memoization avoids recomputation of overlapping subproblems.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(?)	O(?)	Enumerate all decoded original strings for s1 and s2, then check for intersection. Infeasible due to combinatorial explosion.
Optimal	O(L1 * L2 * D)	O(L1 * L2 * D)	DFS with memoization where L1 and L2 are string lengths and D is the range of possible length differences. Efficient for strings up to length 40.

Algorithm Walkthrough

Define a recursive DFS function dfs(i, j, diff) where i is the index in s1, j is the index in s2, and diff is the pending length difference (positive if s1 has extra letters to match, negative if s2 has extra letters to match).
Base Case: If both i and j have reached the end of their strings and diff is zero, return true. This indicates that both encoded strings could represent the same original string.
If the current state (i, j, diff) is already visited, return the stored result to avoid recomputation.
Numeric Expansion: If s1[i] or s2[j] is a digit, try expanding all numbers formed by up to three consecutive digits, updating the diff accordingly. For s1, increase diff; for s2, decrease diff.
Letter Matching: If diff is zero and both s1[i] and s2[j] are letters, they must match exactly. If they match, recursively call dfs(i+1, j+1, diff).
Consume Differences: If diff is positive, we can consume a character from s2 without moving i. If diff is negative, we can consume a character from s1 without moving j. Adjust diff accordingly.
Use memoization to store results for (i, j, diff) to avoid recomputation.

Why it works: The diff variable captures the imbalance between the number of letters to match due to numeric encodings. By exploring all valid numeric expansions and letter matches recursively with memoization, we efficiently traverse all feasible ways s1 and s2 could correspond to a common original string without generating them explicitly.

Python Solution

from functools import lru_cache

class Solution:
    def possiblyEquals(self, s1: str, s2: str) -> bool:
        n1, n2 = len(s1), len(s2)

        @lru_cache(None)
        def dfs(i, j, diff):
            if i == n1 and j == n2:
                return diff == 0

            # s1 digit expansion
            if i < n1 and s1[i].isdigit():
                num = 0
                for k in range(i, min(i+3, n1)):
                    if s1[k].isdigit():
                        num = num * 10 + int(s1[k])
                        if dfs(k+1, j, diff + num):
                            return True
                    else:
                        break

            # s2 digit expansion
            if j < n2 and s2[j].isdigit():
                num = 0
                for k in range(j, min(j+3, n2)):
                    if s2[k].isdigit():
                        num = num * 10 + int(s2[k])
                        if dfs(i, k+1, diff - num):
                            return True
                    else:
                        break

            # matching letters
            if diff == 0 and i < n1 and j < n2 and s1[i] == s2[j]:
                if dfs(i+1, j+1, diff):
                    return True

            # consume differences
            if diff > 0 and j < n2:
                if dfs(i, j+1, diff - 1):
                    return True
            if diff < 0 and i < n1:
                if dfs(i+1, j, diff + 1):
                    return True

            return False

        return dfs(0, 0, 0)

This Python implementation uses a recursive DFS with memoization. The key parts are numeric expansion loops for both strings, handling letter matching only when diff is zero, and consuming the diff imbalance when one string has extra letters to account for. The lru_cache ensures that repeated states are computed once, reducing complexity significantly.

Go Solution

func possiblyEquals(s1 string, s2 string) bool {
    type state struct {
        i, j, diff int
    }
    n1, n2 := len(s1), len(s2)
    memo := make(map[state]bool)

    var dfs func(i, j, diff int) bool
    dfs = func(i, j, diff int) bool {
        st := state{i, j, diff}
        if val, ok := memo[st]; ok {
            return val
        }
        if i == n1 && j == n2 {
            return diff == 0
        }

        // s1 digit expansion
        if i < n1 && s1[i] >= '0' && s1[i] <= '9' {
            num := 0
            for k := i; k < n1 && k < i+3 && s1[k] >= '0' && s1[k] <= '9'; k++ {
                num = num*10 + int(s1[k]-'0')
                if dfs(k+1, j, diff+num) {
                    memo[st] = true
                    return true
                }
            }
        }

        // s2 digit expansion
        if j < n2 && s2[j] >= '0' && s2[j] <= '9' {
            num := 0
            for k := j; k < n2 && k < j+3 && s2[k] >= '0' && s2[k] <= '9'; k++ {
                num = num*10 + int(s2[k]-'0')
                if dfs(i, k+1, diff-num) {
                    memo[st] = true
                    return true
                }
            }
        }

        // matching letters
        if diff == 0 && i < n1 && j < n2 && s1[i] == s2[j] {
            if dfs(i+1, j+1, diff) {
                memo[st] = true
                return true
            }
        }

        // consume differences
        if diff > 0 && j < n2 {
            if dfs(i, j+1, diff-1) {
                memo[st] = true
                return true
            }
        }
        if diff < 0 && i < n1 {