LeetCode 854 - K-Similar Strings
The problem asks us to determine the minimum number of swaps needed to transform one string into another, where both strings are anagrams of each other. A single operation consists of swapping any two characters in s1.
Difficulty: 🔴 Hard
Topics: Hash Table, String, Breadth-First Search
Solution
Problem Understanding
The problem asks us to determine the minimum number of swaps needed to transform one string into another, where both strings are anagrams of each other. A single operation consists of swapping any two characters in s1. The goal is to make s1 exactly equal to s2 using as few swaps as possible.
Because the strings are anagrams, they contain exactly the same characters with the same frequencies. This guarantees that a valid transformation always exists. The challenge is not whether the transformation is possible, but rather how to achieve it with the smallest number of swaps.
The input consists of two strings:
s1, the starting strings2, the target string
The expected output is a single integer representing the minimum number of swaps required.
The constraints are important:
- The maximum string length is only 20
- Characters come from only six possible letters:
athroughf
A length of 20 is small enough that exponential algorithms may still work if they are heavily pruned, but far too large for naive permutation generation. The limited alphabet also means many repeated characters may appear, which can create many equivalent swap states.
Several edge cases are important:
- The strings may already be equal, in which case the answer is
0 - Multiple identical characters can create many redundant swap choices
- Some swaps may fix multiple mismatches at once
- Greedy local choices do not always lead to the optimal global solution
This problem is fundamentally a shortest path search problem over string states.
Approaches
Brute Force Approach
A brute force solution would try every possible sequence of swaps until s1 becomes s2. At each step, we could generate all strings obtainable by swapping every pair of indices.
This approach is correct because it eventually explores every reachable configuration, so it must eventually discover the minimum number of swaps.
However, the number of possible states grows explosively. A string of length n has up to n! permutations, and each state can generate O(n²) new states from swaps. Even with n = 20, this is completely infeasible.
The brute force method becomes too slow because it wastes time exploring states that clearly move away from the target or repeat already visited configurations.
Optimal Approach, Breadth-First Search with Pruning
The key insight is that we only care about swaps that immediately fix a mismatch.
Suppose we scan the current string and find the first index i where it differs from s2. Any optimal solution must eventually place the correct character s2[i] into this position.
Therefore, instead of trying all swaps, we only swap index i with positions j where:
current[j] == s2[i]current[j] != s2[j]
This dramatically reduces branching.
We then use Breadth-First Search (BFS):
- Each node is a string configuration
- Each edge represents one swap
- BFS guarantees the first time we reach
s2, we used the minimum number of swaps
Because BFS explores states level by level, the first successful transformation is optimal.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n! × n²) | O(n!) | Explores nearly all permutations |
| Optimal BFS with Pruning | O(n! / pruning) | O(n!) | Only explores useful swaps |
The optimal approach still has exponential worst case behavior, but the pruning strategy reduces the practical search space dramatically enough to pass the constraints.
Algorithm Walkthrough
- Start by checking whether
s1already equalss2. If so, return0immediately because no swaps are needed. - Initialize a BFS queue with the starting string
s1and a swap count of0. - Create a
visitedset to avoid revisiting the same string configuration multiple times. This prevents infinite loops and redundant work. - Repeatedly remove the front state from the queue.
- For the current string, scan from left to right and find the first index
iwhere the current string differs froms2. - The character at position
imust eventually becomes2[i]. Therefore, search for indicesj > isuch that:
current[j] == s2[i]current[j] != s2[j]
The second condition avoids breaking characters that are already correct.
7. For every valid j, create a new string by swapping characters at positions i and j.
8. If the new string equals s2, return the current swap count plus one.
9. Otherwise, if the new string has not been visited before:
- Add it to
visited - Push it into the BFS queue with swap count incremented by one
- Continue until BFS reaches the target string.
Why it works
At every BFS level, all states represent strings reachable with the same number of swaps. Since BFS explores level by level, the first time we encounter s2 must correspond to the minimum number of swaps.
The pruning rule is also safe. Any optimal solution must eventually fix the first mismatched position, so only considering swaps that place the correct character there cannot exclude the optimal path.
Python Solution
from collections import deque
from typing import Set
class Solution:
def kSimilarity(self, s1: str, s2: str) -> int:
if s1 == s2:
return 0
queue = deque([(s1, 0)])
visited: Set[str] = {s1}
while queue:
current, swaps = queue.popleft()
i = 0
while current[i] == s2[i]:
i += 1
current_list = list(current)
for j in range(i + 1, len(current)):
if (
current[j] == s2[i]
and current[j] != s2[j]
):
next_list = current_list[:]
next_list[i], next_list[j] = (
next_list[j],
next_list[i],
)
next_string = "".join(next_list)
if next_string == s2:
return swaps + 1
if next_string not in visited:
visited.add(next_string)
queue.append((next_string, swaps + 1))
return -1
The implementation begins with a quick equality check. This handles the simplest case efficiently.
The BFS queue stores tuples containing:
- The current string state
- The number of swaps used to reach it
The visited set prevents duplicate exploration.
Inside the BFS loop, the algorithm finds the first mismatched index i. This is the key pruning optimization. Instead of trying arbitrary swaps, we only attempt swaps that place the correct character into position i.
The string is temporarily converted into a list because Python strings are immutable. After performing a swap, the list is converted back into a string.
Whenever a new valid state is generated, it is either returned immediately if it matches s2, or added to the BFS frontier for further exploration.
Go Solution
package main
import (
"container/list"
)
type State struct {
str string
swaps int
}
func kSimilarity(s1 string, s2 string) int {
if s1 == s2 {
return 0
}
queue := list.New()
queue.PushBack(State{s1, 0})
visited := make(map[string]bool)
visited[s1] = true
for queue.Len() > 0 {
front := queue.Front()
queue.Remove(front)
current := front.Value.(State)
i := 0
for current.str[i] == s2[i] {
i++
}
currentBytes := []byte(current.str)
for j := i + 1; j < len(current.str); j++ {
if current.str[j] == s2[i] &&
current.str[j] != s2[j] {
nextBytes := make([]byte, len(currentBytes))
copy(nextBytes, currentBytes)
nextBytes[i], nextBytes[j] =
nextBytes[j], nextBytes[i]
nextString := string(nextBytes)
if nextString == s2 {
return current.swaps + 1
}
if !visited[nextString] {
visited[nextString] = true
queue.PushBack(
State{
nextString,
current.swaps + 1,
},
)
}
}
}
}
return -1
}
The Go implementation follows the same BFS logic as the Python version.
Since Go strings are immutable, the code converts strings into byte slices before swapping characters. A copy of the slice is created before each modification so that different BFS branches do not interfere with each other.
The BFS queue is implemented using container/list, which provides efficient front removal operations.
The visited structure is implemented with a map[string]bool.
Worked Examples
Example 1
Input:
s1 = "ab"
s2 = "ba"
Initial queue:
| Current String | Swaps |
|---|---|
| "ab" | 0 |
The first mismatch occurs at index 0:
| Index | Current | Target |
|---|---|---|
| 0 | a | b |
We need to place 'b' into position 0.
Search for a valid swap:
| j | current[j] | Valid? |
|---|---|---|
| 1 | b | Yes |
Swap positions 0 and 1:
"ab" -> "ba"
This matches the target string.
Answer:
1
Example 2
Input:
s1 = "abc"
s2 = "bca"
Initial queue:
| Current String | Swaps |
|---|---|
| "abc" | 0 |
First mismatch:
| Index | Current | Target |
|---|---|---|
| 0 | a | b |
We need 'b' at index 0.
Possible swap:
| j | current[j] |
|---|---|
| 1 | b |
Swap:
"abc" -> "bac"
Queue becomes:
| Current String | Swaps |
|---|---|
| "bac" | 1 |
Now process "bac".
First mismatch:
| Index | Current | Target |
|---|---|---|
| 1 | a | c |
We need 'c' at index 1.
Possible swap:
| j | current[j] |
|---|---|
| 2 | c |
Swap:
"bac" -> "bca"
Target reached.
Answer:
2
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n! / pruning) | BFS explores permutations, but pruning greatly reduces branching |
| Space | O(n!) | Visited states and BFS queue may store many permutations |
Although the theoretical upper bound remains exponential, the pruning strategy is extremely effective in practice. At every step, we only explore swaps that directly improve the current mismatch, which dramatically cuts the search tree size.
Because the string length is at most 20 and the alphabet is very small, this optimized BFS is fast enough for all valid inputs.
Test Cases
solution = Solution()
assert solution.kSimilarity("ab", "ba") == 1
# Single swap needed
assert solution.kSimilarity("abc", "bca") == 2
# Requires two sequential swaps
assert solution.kSimilarity("abc", "abc") == 0
# Already equal
assert solution.kSimilarity("aabc", "abca") == 2
# Duplicate characters
assert solution.kSimilarity("abac", "baca") == 2
# Multiple valid swap paths
assert solution.kSimilarity("abcdef", "fabcde") == 5
# Long cyclic rotation
assert solution.kSimilarity("aaabbb", "bbbaaa") == 3
# Repeated characters with grouped swaps
assert solution.kSimilarity("abcde", "eabcd") == 4
# Rotation requiring multiple swaps
assert solution.kSimilarity("abab", "baba") == 2
# Duplicate alternating characters
assert solution.kSimilarity("a", "a") == 0
# Minimum length edge case
Test Summary
| Test | Why |
|---|---|
"ab" -> "ba" |
Simplest nontrivial swap |
"abc" -> "bca" |
Multi-step transformation |
"abc" -> "abc" |
Already matching strings |
"aabc" -> "abca" |
Duplicate character handling |
"abac" -> "baca" |
Multiple possible BFS paths |
"abcdef" -> "fabcde" |
Longer permutation |
"aaabbb" -> "bbbaaa" |
Repeated grouped characters |
"abcde" -> "eabcd" |
Rotation pattern |
"abab" -> "baba" |
Alternating duplicates |
"a" -> "a" |
Smallest valid input |
Edge Cases
One important edge case occurs when the strings are already equal. A careless BFS implementation might still enqueue states and perform unnecessary work. The implementation handles this immediately with an early return of 0.
Another tricky case involves duplicate characters. For example, transforming "aabc" into "abca" can produce many equivalent swap sequences. Without careful pruning and a visited set, the algorithm could repeatedly explore the same states. The implementation avoids this by storing every visited string configuration.
A third important edge case is when multiple swap candidates exist for the same mismatch. Some swaps may appear useful but actually increase future work. BFS guarantees correctness here because it explores all states with the same swap count before moving deeper. Even if one branch is suboptimal, the optimal path will still be discovered first.
A final subtle edge case involves swapping characters that are already in their correct positions. Doing so can undo previous progress and explode the search space. The condition current[j] != s2[j] prevents these unnecessary swaps and significantly improves performance while preserving correctness.