LeetCode 1202 - Smallest String With Swaps
The problem gives us a string s and a list of index pairs called pairs. Each pair [a, b] means we are allowed to swap the characters at positions a and b. The important detail is that swaps can be performed any number of times. This changes the nature of the problem completely.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Depth-First Search, Breadth-First Search, Union-Find, Sorting
Solution
Problem Understanding
The problem gives us a string s and a list of index pairs called pairs. Each pair [a, b] means we are allowed to swap the characters at positions a and b.
The important detail is that swaps can be performed any number of times. This changes the nature of the problem completely. We are not restricted to using each pair once. Instead, if index 0 can swap with 1, and 1 can swap with 2, then index 0 can effectively exchange characters with index 2 through a sequence of swaps.
The problem asks us to return the lexicographically smallest string that can be formed after performing any valid sequence of swaps.
Lexicographical order works like dictionary order. For example:
"abcd"is smaller than"abdc""abc"is smaller than"acb"
So the goal is to make the earliest characters in the string as small as possible.
The constraints are large:
s.lengthcan be up to10^5pairs.lengthcan also be up to10^5
These limits immediately tell us that expensive repeated swapping or exhaustive search is impossible. Any algorithm worse than roughly O(n log n) will likely time out.
A crucial observation is that swaps create connected groups of indices. If two indices are connected directly or indirectly through swap pairs, then any character inside that connected component can eventually move to any position within the same component.
For example:
pairs = [[0,1],[1,2]]
This means indices 0, 1, and 2 all belong to the same connected component. Therefore, the characters at those positions can be rearranged arbitrarily among those indices.
Important edge cases include:
- An empty
pairsarray, no swaps are possible - A fully connected graph, the entire string can be globally sorted
- Multiple disconnected components
- Duplicate swap pairs
- Single-character strings
- Components containing only one index
The problem guarantees that all indices are valid and that the string only contains lowercase English letters.
Approaches
Brute Force Approach
A brute force solution would repeatedly attempt swaps and try to improve the string until no better arrangement can be found.
One possible brute force strategy is:
- Generate all reachable strings using BFS or DFS over swap operations
- Track visited configurations
- Return the smallest lexicographical string encountered
This approach is technically correct because it explores all possible valid swap sequences. However, it is computationally infeasible.
If a component contains k indices, then potentially k! permutations of characters are reachable. Even for modest values of k, this becomes enormous.
Another brute force variation would repeatedly apply beneficial swaps greedily. However, greedy local swaps do not necessarily lead to the globally smallest arrangement unless we understand the connected-component structure.
Because the constraints are up to 10^5, brute force exploration is completely impractical.
Key Insight
The critical insight is that swap relationships form connected components in a graph.
If two indices belong to the same connected component, then their characters can be rearranged arbitrarily through sequences of swaps.
This transforms the problem into:
- Find all connected components of indices
- Collect all characters belonging to each component
- Sort the characters in ascending order
- Place the smallest characters into the smallest indices
Why does this work?
To minimize the lexicographical order, earlier positions should receive the smallest possible characters. Since all indices inside a connected component are freely interchangeable, we should greedily assign the smallest available character to the smallest index.
Union-Find, also called Disjoint Set Union (DSU), is ideal for efficiently building connected components.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | Exponential | Exponential | Explores many possible swap sequences |
| Optimal | O(n log n) | O(n) | Uses Union-Find to group connected indices |
Algorithm Walkthrough
Optimal Algorithm Using Union-Find
- Initialize a Union-Find data structure.
Each index initially belongs to its own set. The Union-Find structure efficiently merges connected indices and determines which component an index belongs to. 2. Process every swap pair.
For each pair [a, b], union the two indices. This builds connected components representing all positions that can exchange characters.
3. Group indices by their component root.
After all unions are complete, every connected component has a representative root. We iterate through all indices and group them by their root.
For example:
Component root 0 -> [0, 2, 3]
Component root 5 -> [5, 7]
- Collect the characters belonging to each component.
For every group of indices, extract the corresponding characters from the string.
Example:
indices = [0, 2, 3]
chars = ['d', 'a', 'b']
- Sort both indices and characters.
The indices are sorted so we fill earlier positions first.
The characters are sorted so the smallest available characters are assigned first.
Example:
indices = [0, 2, 3]
chars = ['a', 'b', 'd']
- Rebuild the result string.
Assign the sorted characters back to the sorted indices.
result[0] = 'a'
result[2] = 'b'
result[3] = 'd'
- Return the final string.
After processing all components, join the result array into a string.
Why it works
Within a connected component, every character can eventually move to any index in that component. Therefore, the lexicographically smallest arrangement is obtained by sorting the component's characters and placing them into the component's indices in ascending order.
This greedy placement is optimal because lexicographical comparison prioritizes earlier positions before later ones.
Python Solution
from collections import defaultdict
from typing import List
class Solution:
def smallestStringWithSwaps(self, s: str, pairs: List[List[int]]) -> str:
n = len(s)
parent = list(range(n))
rank = [0] * n
def find(x: int) -> int:
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]
def union(x: int, y: int) -> None:
root_x = find(x)
root_y = find(y)
if root_x == root_y:
return
if rank[root_x] < rank[root_y]:
parent[root_x] = root_y
elif rank[root_x] > rank[root_y]:
parent[root_y] = root_x
else:
parent[root_y] = root_x
rank[root_x] += 1
# Build connected components
for a, b in pairs:
union(a, b)
# Group indices by component root
components = defaultdict(list)
for index in range(n):
root = find(index)
components[root].append(index)
result = list(s)
# Sort characters within each component
for indices in components.values():
chars = sorted(s[index] for index in indices)
indices.sort()
for index, char in zip(indices, chars):
result[index] = char
return "".join(result)
The implementation begins by initializing the Union-Find structure. Each index starts as its own parent because initially no connections are known.
The find function uses path compression. This optimization flattens the tree structure so future lookups become extremely fast.
The union function merges two components using union by rank. This keeps the trees shallow and improves performance.
After processing all pairs, every connected component is fully established.
Next, we group indices according to their component root using a hash map. Each key represents one connected component.
For every component:
- We collect its characters
- Sort the characters
- Sort the indices
- Assign the smallest characters to the smallest indices
Finally, we join the result list into a string and return it.
Go Solution
package main
import (
"sort"
)
func smallestStringWithSwaps(s string, pairs [][]int) string {
n := len(s)
parent := make([]int, n)
rank := make([]int, n)
for i := 0; i < n; i++ {
parent[i] = i
}
var find func(int) int
find = func(x int) int {
if parent[x] != x {
parent[x] = find(parent[x])
}
return parent[x]
}
union := func(x, y int) {
rootX := find(x)
rootY := find(y)
if rootX == rootY {
return
}
if rank[rootX] < rank[rootY] {
parent[rootX] = rootY
} else if rank[rootX] > rank[rootY] {
parent[rootY] = rootX
} else {
parent[rootY] = rootX
rank[rootX]++
}
}
// Build connected components
for _, pair := range pairs {
union(pair[0], pair[1])
}
components := make(map[int][]int)
// Group indices by root
for i := 0; i < n; i++ {
root := find(i)
components[root] = append(components[root], i)
}
result := []byte(s)
// Process each component
for _, indices := range components {
chars := make([]byte, len(indices))
for i, index := range indices {
chars[i] = s[index]
}
sort.Ints(indices)
sort.Slice(chars, func(i, j int) bool {
return chars[i] < chars[j]
})
for i, index := range indices {
result[index] = chars[i]
}
}
return string(result)
}
The Go implementation follows the same logic as the Python solution.
One notable difference is that Go strings are immutable, so we convert the string into a []byte array before modifying characters.
The component map uses map[int][]int to group indices by root.
Sorting characters requires sort.Slice because Go does not directly sort byte slices lexicographically with a dedicated helper.
Go does not require special handling for integer overflow here because all indices are bounded by 10^5.
Worked Examples
Example 1
s = "dcab"
pairs = [[0,3],[1,2]]
Step 1: Build Components
| Pair | Resulting Components |
|---|---|
| [0,3] | {0,3} |
| [1,2] | {1,2} |
Final components:
{0,3}
{1,2}
Step 2: Process Component {0,3}
| Indices | Characters |
|---|---|
| [0,3] | ['d','b'] |
Sort characters:
['b','d']
Assign back:
| Index | Character |
|---|---|
| 0 | b |
| 3 | d |
Current result:
bcad
Step 3: Process Component {1,2}
| Indices | Characters |
|---|---|
| [1,2] | ['c','a'] |
Sort characters:
['a','c']
Assign back:
| Index | Character |
|---|---|
| 1 | a |
| 2 | c |
Final result:
bacd
Example 2
s = "dcab"
pairs = [[0,3],[1,2],[0,2]]
Step 1: Build Components
The unions connect all indices together.
0 <-> 3
0 <-> 2
1 <-> 2
Final component:
{0,1,2,3}
Step 2: Collect Characters
| Indices | Characters |
|---|---|
| [0,1,2,3] | ['d','c','a','b'] |
Sort characters:
['a','b','c','d']
Assign to sorted indices:
| Index | Character |
|---|---|
| 0 | a |
| 1 | b |
| 2 | c |
| 3 | d |
Final result:
abcd
Example 3
s = "cba"
pairs = [[0,1],[1,2]]
Step 1: Build Components
All indices become connected.
{0,1,2}
Step 2: Sort Characters
| Original | Sorted |
|---|---|
| ['c','b','a'] | ['a','b','c'] |
Step 3: Assign Back
| Index | Character |
|---|---|
| 0 | a |
| 1 | b |
| 2 | c |
Final result:
abc
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Sorting characters inside components dominates runtime |
| Space | O(n) | Union-Find structures and component storage |
The Union-Find operations are nearly constant time because of path compression and union by rank.
The dominant cost comes from sorting characters within connected components. Across all components, the total number of characters sorted is n, giving an overall complexity of O(n log n) in the worst case.
The space usage is linear because we store:
- Parent and rank arrays
- Component mappings
- The result array
Test Cases
from typing import List
class Solution:
def smallestStringWithSwaps(self, s: str, pairs: List[List[int]]) -> str:
n = len(s)
parent = list(range(n))
rank = [0] * n
def find(x):
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]
def union(x, y):
root_x = find(x)
root_y = find(y)
if root_x == root_y:
return
if rank[root_x] < rank[root_y]:
parent[root_x] = root_y
elif rank[root_x] > rank[root_y]:
parent[root_y] = root_x
else:
parent[root_y] = root_x
rank[root_x] += 1
for a, b in pairs:
union(a, b)
from collections import defaultdict
components = defaultdict(list)
for i in range(n):
components[find(i)].append(i)
result = list(s)
for indices in components.values():
chars = sorted(s[i] for i in indices)
indices.sort()
for i, ch in zip(indices, chars):
result[i] = ch
return "".join(result)
sol = Solution()
assert sol.smallestStringWithSwaps(
"dcab",
[[0, 3], [1, 2]]
) == "bacd" # basic disconnected components
assert sol.smallestStringWithSwaps(
"dcab",
[[0, 3], [1, 2], [0, 2]]
) == "abcd" # fully connected graph
assert sol.smallestStringWithSwaps(
"cba",
[[0, 1], [1, 2]]
) == "abc" # chain connectivity
assert sol.smallestStringWithSwaps(
"zxy",
[]
) == "zxy" # no swaps allowed
assert sol.smallestStringWithSwaps(
"a",
[]
) == "a" # single character string
assert sol.smallestStringWithSwaps(
"bca",
[[0, 1]]
) == "bac" # partial component
assert sol.smallestStringWithSwaps(
"dcabef",
[[0, 3], [1, 2]]
) == "bacdef" # multiple untouched characters
assert sol.smallestStringWithSwaps(
"gfedcba",
[[0,1],[1,2],[2,3],[3,4],[4,5],[5,6]]
) == "abcdefg" # large connected chain
assert sol.smallestStringWithSwaps(
"aaaa",
[[0,1],[1,2]]
) == "aaaa" # repeated characters
assert sol.smallestStringWithSwaps(
"dcab",
[[0,3],[0,3],[1,2]]
) == "bacd" # duplicate pairs
Test Case Summary
| Test | Why |
|---|---|
"dcab" with two components |
Validates independent connected groups |
"dcab" fully connected |
Ensures full sorting works |
| Chain connectivity | Confirms transitive swaps |
| Empty pairs | No changes should occur |
| Single character | Smallest possible input |
| Partial component | Only some indices should change |
| Untouched suffix | Verifies isolated indices remain unchanged |
| Long connected chain | Stress test for large components |
| Repeated characters | Ensures duplicates handled correctly |
| Duplicate pairs | Confirms redundant unions are harmless |
Edge Cases
Empty Swap List
If pairs is empty, no swaps are possible. A naive implementation might still attempt unnecessary processing or fail when constructing components.
The current implementation handles this naturally. Every index remains in its own singleton component, so the original string is returned unchanged.
Fully Connected String
If all indices belong to one connected component, then the entire string can be freely rearranged.
This is effectively equivalent to sorting the entire string lexicographically. The algorithm handles this correctly because all indices map to the same root, producing one large component.
Duplicate or Redundant Pairs
Input may contain repeated pairs or pairs that connect already-connected indices.
For example:
[[0,1],[1,2],[0,2]]
A buggy implementation might repeatedly rebuild structures inefficiently.
Union-Find handles this safely because attempting to union two indices already in the same set becomes a no-op.
Components of Size One
Some indices may never appear in any pair.
These indices form singleton components and their characters cannot move.
The algorithm correctly preserves those characters because sorting a single-element component changes nothing.
Repeated Characters
Strings may contain many identical characters.
For example:
"aaaa"
Some incorrect implementations accidentally rely on character uniqueness.
This solution does not assume uniqueness. It simply sorts character arrays, which works correctly even with duplicates.