LeetCode 1202 - Smallest String With Swaps

The problem gives us a string s and a list of index pairs called pairs. Each pair [a, b] means we are allowed to swap the characters at positions a and b. The important detail is that swaps can be performed any number of times. This changes the nature of the problem completely.

LeetCode Problem 1202

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Depth-First Search, Breadth-First Search, Union-Find, Sorting

Solution

Problem Understanding

The problem gives us a string s and a list of index pairs called pairs. Each pair [a, b] means we are allowed to swap the characters at positions a and b.

The important detail is that swaps can be performed any number of times. This changes the nature of the problem completely. We are not restricted to using each pair once. Instead, if index 0 can swap with 1, and 1 can swap with 2, then index 0 can effectively exchange characters with index 2 through a sequence of swaps.

The problem asks us to return the lexicographically smallest string that can be formed after performing any valid sequence of swaps.

Lexicographical order works like dictionary order. For example:

  • "abcd" is smaller than "abdc"
  • "abc" is smaller than "acb"

So the goal is to make the earliest characters in the string as small as possible.

The constraints are large:

  • s.length can be up to 10^5
  • pairs.length can also be up to 10^5

These limits immediately tell us that expensive repeated swapping or exhaustive search is impossible. Any algorithm worse than roughly O(n log n) will likely time out.

A crucial observation is that swaps create connected groups of indices. If two indices are connected directly or indirectly through swap pairs, then any character inside that connected component can eventually move to any position within the same component.

For example:

pairs = [[0,1],[1,2]]

This means indices 0, 1, and 2 all belong to the same connected component. Therefore, the characters at those positions can be rearranged arbitrarily among those indices.

Important edge cases include:

  • An empty pairs array, no swaps are possible
  • A fully connected graph, the entire string can be globally sorted
  • Multiple disconnected components
  • Duplicate swap pairs
  • Single-character strings
  • Components containing only one index

The problem guarantees that all indices are valid and that the string only contains lowercase English letters.

Approaches

Brute Force Approach

A brute force solution would repeatedly attempt swaps and try to improve the string until no better arrangement can be found.

One possible brute force strategy is:

  • Generate all reachable strings using BFS or DFS over swap operations
  • Track visited configurations
  • Return the smallest lexicographical string encountered

This approach is technically correct because it explores all possible valid swap sequences. However, it is computationally infeasible.

If a component contains k indices, then potentially k! permutations of characters are reachable. Even for modest values of k, this becomes enormous.

Another brute force variation would repeatedly apply beneficial swaps greedily. However, greedy local swaps do not necessarily lead to the globally smallest arrangement unless we understand the connected-component structure.

Because the constraints are up to 10^5, brute force exploration is completely impractical.

Key Insight

The critical insight is that swap relationships form connected components in a graph.

If two indices belong to the same connected component, then their characters can be rearranged arbitrarily through sequences of swaps.

This transforms the problem into:

  1. Find all connected components of indices
  2. Collect all characters belonging to each component
  3. Sort the characters in ascending order
  4. Place the smallest characters into the smallest indices

Why does this work?

To minimize the lexicographical order, earlier positions should receive the smallest possible characters. Since all indices inside a connected component are freely interchangeable, we should greedily assign the smallest available character to the smallest index.

Union-Find, also called Disjoint Set Union (DSU), is ideal for efficiently building connected components.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Explores many possible swap sequences
Optimal O(n log n) O(n) Uses Union-Find to group connected indices

Algorithm Walkthrough

Optimal Algorithm Using Union-Find

  1. Initialize a Union-Find data structure.

Each index initially belongs to its own set. The Union-Find structure efficiently merges connected indices and determines which component an index belongs to. 2. Process every swap pair.

For each pair [a, b], union the two indices. This builds connected components representing all positions that can exchange characters. 3. Group indices by their component root.

After all unions are complete, every connected component has a representative root. We iterate through all indices and group them by their root.

For example:

Component root 0 -> [0, 2, 3]
Component root 5 -> [5, 7]
  1. Collect the characters belonging to each component.

For every group of indices, extract the corresponding characters from the string.

Example:

indices = [0, 2, 3]
chars = ['d', 'a', 'b']
  1. Sort both indices and characters.

The indices are sorted so we fill earlier positions first.

The characters are sorted so the smallest available characters are assigned first.

Example:

indices = [0, 2, 3]
chars = ['a', 'b', 'd']
  1. Rebuild the result string.

Assign the sorted characters back to the sorted indices.

result[0] = 'a'
result[2] = 'b'
result[3] = 'd'
  1. Return the final string.

After processing all components, join the result array into a string.

Why it works

Within a connected component, every character can eventually move to any index in that component. Therefore, the lexicographically smallest arrangement is obtained by sorting the component's characters and placing them into the component's indices in ascending order.

This greedy placement is optimal because lexicographical comparison prioritizes earlier positions before later ones.

Python Solution

from collections import defaultdict
from typing import List

class Solution:
    def smallestStringWithSwaps(self, s: str, pairs: List[List[int]]) -> str:
        n = len(s)

        parent = list(range(n))
        rank = [0] * n

        def find(x: int) -> int:
            if parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]

        def union(x: int, y: int) -> None:
            root_x = find(x)
            root_y = find(y)

            if root_x == root_y:
                return

            if rank[root_x] < rank[root_y]:
                parent[root_x] = root_y
            elif rank[root_x] > rank[root_y]:
                parent[root_y] = root_x
            else:
                parent[root_y] = root_x
                rank[root_x] += 1

        # Build connected components
        for a, b in pairs:
            union(a, b)

        # Group indices by component root
        components = defaultdict(list)

        for index in range(n):
            root = find(index)
            components[root].append(index)

        result = list(s)

        # Sort characters within each component
        for indices in components.values():
            chars = sorted(s[index] for index in indices)

            indices.sort()

            for index, char in zip(indices, chars):
                result[index] = char

        return "".join(result)

The implementation begins by initializing the Union-Find structure. Each index starts as its own parent because initially no connections are known.

The find function uses path compression. This optimization flattens the tree structure so future lookups become extremely fast.

The union function merges two components using union by rank. This keeps the trees shallow and improves performance.

After processing all pairs, every connected component is fully established.

Next, we group indices according to their component root using a hash map. Each key represents one connected component.

For every component:

  • We collect its characters
  • Sort the characters
  • Sort the indices
  • Assign the smallest characters to the smallest indices

Finally, we join the result list into a string and return it.

Go Solution

package main

import (
	"sort"
)

func smallestStringWithSwaps(s string, pairs [][]int) string {
	n := len(s)

	parent := make([]int, n)
	rank := make([]int, n)

	for i := 0; i < n; i++ {
		parent[i] = i
	}

	var find func(int) int

	find = func(x int) int {
		if parent[x] != x {
			parent[x] = find(parent[x])
		}
		return parent[x]
	}

	union := func(x, y int) {
		rootX := find(x)
		rootY := find(y)

		if rootX == rootY {
			return
		}

		if rank[rootX] < rank[rootY] {
			parent[rootX] = rootY
		} else if rank[rootX] > rank[rootY] {
			parent[rootY] = rootX
		} else {
			parent[rootY] = rootX
			rank[rootX]++
		}
	}

	// Build connected components
	for _, pair := range pairs {
		union(pair[0], pair[1])
	}

	components := make(map[int][]int)

	// Group indices by root
	for i := 0; i < n; i++ {
		root := find(i)
		components[root] = append(components[root], i)
	}

	result := []byte(s)

	// Process each component
	for _, indices := range components {
		chars := make([]byte, len(indices))

		for i, index := range indices {
			chars[i] = s[index]
		}

		sort.Ints(indices)

		sort.Slice(chars, func(i, j int) bool {
			return chars[i] < chars[j]
		})

		for i, index := range indices {
			result[index] = chars[i]
		}
	}

	return string(result)
}

The Go implementation follows the same logic as the Python solution.

One notable difference is that Go strings are immutable, so we convert the string into a []byte array before modifying characters.

The component map uses map[int][]int to group indices by root.

Sorting characters requires sort.Slice because Go does not directly sort byte slices lexicographically with a dedicated helper.

Go does not require special handling for integer overflow here because all indices are bounded by 10^5.

Worked Examples

Example 1

s = "dcab"
pairs = [[0,3],[1,2]]

Step 1: Build Components

Pair Resulting Components
[0,3] {0,3}
[1,2] {1,2}

Final components:

{0,3}
{1,2}

Step 2: Process Component {0,3}

Indices Characters
[0,3] ['d','b']

Sort characters:

['b','d']

Assign back:

Index Character
0 b
3 d

Current result:

bcad

Step 3: Process Component {1,2}

Indices Characters
[1,2] ['c','a']

Sort characters:

['a','c']

Assign back:

Index Character
1 a
2 c

Final result:

bacd

Example 2

s = "dcab"
pairs = [[0,3],[1,2],[0,2]]

Step 1: Build Components

The unions connect all indices together.

0 <-> 3
0 <-> 2
1 <-> 2

Final component:

{0,1,2,3}

Step 2: Collect Characters

Indices Characters
[0,1,2,3] ['d','c','a','b']

Sort characters:

['a','b','c','d']

Assign to sorted indices:

Index Character
0 a
1 b
2 c
3 d

Final result:

abcd

Example 3

s = "cba"
pairs = [[0,1],[1,2]]

Step 1: Build Components

All indices become connected.

{0,1,2}

Step 2: Sort Characters

Original Sorted
['c','b','a'] ['a','b','c']

Step 3: Assign Back

Index Character
0 a
1 b
2 c

Final result:

abc

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Sorting characters inside components dominates runtime
Space O(n) Union-Find structures and component storage

The Union-Find operations are nearly constant time because of path compression and union by rank.

The dominant cost comes from sorting characters within connected components. Across all components, the total number of characters sorted is n, giving an overall complexity of O(n log n) in the worst case.

The space usage is linear because we store:

  • Parent and rank arrays
  • Component mappings
  • The result array

Test Cases

from typing import List

class Solution:
    def smallestStringWithSwaps(self, s: str, pairs: List[List[int]]) -> str:
        n = len(s)

        parent = list(range(n))
        rank = [0] * n

        def find(x):
            if parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]

        def union(x, y):
            root_x = find(x)
            root_y = find(y)

            if root_x == root_y:
                return

            if rank[root_x] < rank[root_y]:
                parent[root_x] = root_y
            elif rank[root_x] > rank[root_y]:
                parent[root_y] = root_x
            else:
                parent[root_y] = root_x
                rank[root_x] += 1

        for a, b in pairs:
            union(a, b)

        from collections import defaultdict

        components = defaultdict(list)

        for i in range(n):
            components[find(i)].append(i)

        result = list(s)

        for indices in components.values():
            chars = sorted(s[i] for i in indices)
            indices.sort()

            for i, ch in zip(indices, chars):
                result[i] = ch

        return "".join(result)

sol = Solution()

assert sol.smallestStringWithSwaps(
    "dcab",
    [[0, 3], [1, 2]]
) == "bacd"  # basic disconnected components

assert sol.smallestStringWithSwaps(
    "dcab",
    [[0, 3], [1, 2], [0, 2]]
) == "abcd"  # fully connected graph

assert sol.smallestStringWithSwaps(
    "cba",
    [[0, 1], [1, 2]]
) == "abc"  # chain connectivity

assert sol.smallestStringWithSwaps(
    "zxy",
    []
) == "zxy"  # no swaps allowed

assert sol.smallestStringWithSwaps(
    "a",
    []
) == "a"  # single character string

assert sol.smallestStringWithSwaps(
    "bca",
    [[0, 1]]
) == "bac"  # partial component

assert sol.smallestStringWithSwaps(
    "dcabef",
    [[0, 3], [1, 2]]
) == "bacdef"  # multiple untouched characters

assert sol.smallestStringWithSwaps(
    "gfedcba",
    [[0,1],[1,2],[2,3],[3,4],[4,5],[5,6]]
) == "abcdefg"  # large connected chain

assert sol.smallestStringWithSwaps(
    "aaaa",
    [[0,1],[1,2]]
) == "aaaa"  # repeated characters

assert sol.smallestStringWithSwaps(
    "dcab",
    [[0,3],[0,3],[1,2]]
) == "bacd"  # duplicate pairs

Test Case Summary

Test Why
"dcab" with two components Validates independent connected groups
"dcab" fully connected Ensures full sorting works
Chain connectivity Confirms transitive swaps
Empty pairs No changes should occur
Single character Smallest possible input
Partial component Only some indices should change
Untouched suffix Verifies isolated indices remain unchanged
Long connected chain Stress test for large components
Repeated characters Ensures duplicates handled correctly
Duplicate pairs Confirms redundant unions are harmless

Edge Cases

Empty Swap List

If pairs is empty, no swaps are possible. A naive implementation might still attempt unnecessary processing or fail when constructing components.

The current implementation handles this naturally. Every index remains in its own singleton component, so the original string is returned unchanged.

Fully Connected String

If all indices belong to one connected component, then the entire string can be freely rearranged.

This is effectively equivalent to sorting the entire string lexicographically. The algorithm handles this correctly because all indices map to the same root, producing one large component.

Duplicate or Redundant Pairs

Input may contain repeated pairs or pairs that connect already-connected indices.

For example:

[[0,1],[1,2],[0,2]]

A buggy implementation might repeatedly rebuild structures inefficiently.

Union-Find handles this safely because attempting to union two indices already in the same set becomes a no-op.

Components of Size One

Some indices may never appear in any pair.

These indices form singleton components and their characters cannot move.

The algorithm correctly preserves those characters because sorting a single-element component changes nothing.

Repeated Characters

Strings may contain many identical characters.

For example:

"aaaa"

Some incorrect implementations accidentally rely on character uniqueness.

This solution does not assume uniqueness. It simply sorts character arrays, which works correctly even with duplicates.