LeetCode 1061 - Lexicographically Smallest Equivalent String

Difficulty: 🟡 Medium
Topics: String, Union-Find

Solution

Here is the complete, detailed technical solution guide for LeetCode 1061 following your exact formatting requirements.

Problem Understanding

The problem requires us to find the lexicographically smallest string that can be obtained from a given baseStr using character equivalencies defined by two strings, s1 and s2. Each position i in s1 and s2 represents an equivalence relationship between s1[i] and s2[i]. These equivalences obey the rules of reflexivity, symmetry, and transitivity, meaning that if characters are indirectly connected through other equivalences, they are considered equivalent.

The input strings s1 and s2 have the same length, and all strings contain only lowercase English letters. The constraints allow up to 1000 characters in each string, so any solution must efficiently handle union operations among up to 26 letters. The output is a transformed version of baseStr where each character is replaced by the lexicographically smallest character in its equivalence group.

Key edge cases include situations where s1 and s2 define overlapping equivalence classes, baseStr contains characters not present in s1 or s2, or equivalence groups collapse multiple characters to a single representative.

Approaches

The brute-force approach would involve generating all possible equivalent strings of baseStr by applying equivalences exhaustively and then picking the lexicographically smallest string. This is impractical because the number of equivalent strings grows exponentially with the number of equivalences, leading to unacceptable time complexity.

The optimal approach leverages a Union-Find (Disjoint Set Union) data structure to manage equivalence classes efficiently. By maintaining a parent pointer for each character, we can merge sets of equivalent characters and always ensure that the representative of each set is the lexicographically smallest character. Then, transforming baseStr simply involves replacing each character with the smallest representative from its equivalence class. This approach is efficient because it uses path compression and union-by-rank heuristics, yielding nearly constant time operations for union and find.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(2^n)	O(n)	Generate all equivalent strings, impractical for n up to 1000
Optimal	O(n + m * α(26)) ≈ O(n + m)	O(26)	Use Union-Find to maintain equivalence classes efficiently

n is the length of baseStr, and m is the length of s1/s2. α is the inverse Ackermann function, which is effectively constant for small alphabets.

Algorithm Walkthrough

Initialize a Union-Find structure for 26 lowercase English letters. Each letter is initially its own parent.
Iterate through pairs (s1[i], s2[i]). For each pair, perform a union operation. When merging, always ensure the lexicographically smaller character becomes the parent. This guarantees that each equivalence class’s representative is the smallest character.
Once all equivalences are processed, iterate over each character c in baseStr. Use the find operation to retrieve the representative of its equivalence class.
Construct a new string by replacing each character in baseStr with its smallest representative.
Return the resulting string.

Why it works: The Union-Find structure maintains disjoint sets of equivalent characters and ensures each set has the smallest character as its representative. By transforming baseStr using these representatives, we obtain the lexicographically smallest equivalent string without having to generate all possible permutations.

Python Solution

class Solution:
    def smallestEquivalentString(self, s1: str, s2: str, baseStr: str) -> str:
        parent = {chr(i): chr(i) for i in range(ord('a'), ord('z') + 1)}
        
        def find(c):
            if parent[c] != c:
                parent[c] = find(parent[c])
            return parent[c]
        
        def union(c1, c2):
            p1, p2 = find(c1), find(c2)
            if p1 == p2:
                return
            if p1 < p2:
                parent[p2] = p1
            else:
                parent[p1] = p2
        
        for a, b in zip(s1, s2):
            union(a, b)
        
        return ''.join(find(c) for c in baseStr)

Explanation: We first initialize each letter as its own parent. The find function uses path compression for efficiency. The union function ensures the smallest character remains the parent when two equivalence classes merge. Finally, we transform baseStr by replacing each character with its representative.

Go Solution

func smallestEquivalentString(s1 string, s2 string, baseStr string) string {
    parent := make([]byte, 26)
    for i := 0; i < 26; i++ {
        parent[i] = byte(i) + 'a'
    }

    var find func(byte) byte
    find = func(c byte) byte {
        if parent[c-'a'] != c {
            parent[c-'a'] = find(parent[c-'a'])
        }
        return parent[c-'a']
    }

    union := func(a, b byte) {
        pa, pb := find(a), find(b)
        if pa == pb {
            return
        }
        if pa < pb {
            parent[pb-'a'] = pa
        } else {
            parent[pa-'a'] = pb
        }
    }

    for i := 0; i < len(s1); i++ {
        union(s1[i], s2[i])
    }

    res := make([]byte, len(baseStr))
    for i := 0; i < len(baseStr); i++ {
        res[i] = find(baseStr[i])
    }
    return string(res)
}

Go-specific notes: We use a fixed-size array for 26 letters instead of a map. Slice indexing is zero-based. Path compression and union logic are similar to Python. Conversion between byte and array indices uses subtraction of 'a'.

Worked Examples

Example 1: s1 = "parker", s2 = "morris", baseStr = "parser"

Pair	Union Result	Parent Mapping (abbreviated)
p-m	merge → 'm' parent of 'p'	m->m, p->m
a-o	merge → 'a' parent of 'o'	a->a, o->a
r-r	already same	-
k-r	merge → 'k' < 'r' → k parent	k->k, r->k
e-i	merge → 'e' < 'i' → e parent	e->e, i->e
r-s	merge → 'k' < 's' → k parent	s->k

Transform parser using parents: p->m, a->a, r->k, s->k, e->e, r->k → "makkek"

Example 2: s1 = "hello", s2 = "world", baseStr = "hold"

Parent updates lead to: h->h, e->d, l->d, o->d, w->h, r->l, d->d

Transform hold: h->h, o->d, l->d, d->d → "hdld"

Example 3: s1 = "leetcode", s2 = "programs", baseStr = "sourcecode"

Parent updates lead to representative mappings: a->a, o->a, e->a, r->a, s->a, c->a, l->l, p->l, g->g, t->g, d->d, m->d, u->u

Transform sourcecode → "aauaaaaada"

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n + m)	n = len(baseStr), m = len(s1) = len(s2); each union/find is near constant
Space	O(26)	Only store parent for 26 letters

The algorithm is efficient because the Union-Find operations are practically constant for 26 elements, making the total runtime linear in the size of the input strings.

Test Cases

# Provided examples
assert Solution().smallestEquivalentString("parker", "morris", "parser") == "makkek"  # Example 1
assert Solution().smallestEquivalentString("hello", "world", "hold") == "hdld"      # Example 2
assert Solution().smallestEquivalentString("leetcode", "programs", "sourcecode") == "aauaaaaada"  # Example 3

# Edge cases
assert Solution().smallestEquivalentString("a", "b", "c") == "c"  # baseStr not in equivalences
assert Solution().smallestEquivalentString("abc", "def", "fed") == "adb"  # multiple independent classes
assert Solution().smallestEquivalentString("aaa", "bbb", "aaa") == "aaa"  # identical characters

# Maximum length stress
s1 = "abcdefghijklmnopqrstuvwxyz" * 38
s2 = "bcdefghijklmnopqrstuvwxyza