LeetCode 854 - K-Similar Strings

Difficulty: 🔴 Hard
Topics: Hash Table, String, Breadth-First Search

Solution

Problem Understanding

The problem asks us to determine the minimum number of swaps needed to transform one string into another, where both strings are anagrams of each other. A single operation consists of swapping any two characters in s1. The goal is to make s1 exactly equal to s2 using as few swaps as possible.

Because the strings are anagrams, they contain exactly the same characters with the same frequencies. This guarantees that a valid transformation always exists. The challenge is not whether the transformation is possible, but rather how to achieve it with the smallest number of swaps.

The input consists of two strings:

s1, the starting string
s2, the target string

The expected output is a single integer representing the minimum number of swaps required.

The constraints are important:

The maximum string length is only 20
Characters come from only six possible letters: a through f

A length of 20 is small enough that exponential algorithms may still work if they are heavily pruned, but far too large for naive permutation generation. The limited alphabet also means many repeated characters may appear, which can create many equivalent swap states.

Several edge cases are important:

The strings may already be equal, in which case the answer is 0
Multiple identical characters can create many redundant swap choices
Some swaps may fix multiple mismatches at once
Greedy local choices do not always lead to the optimal global solution

This problem is fundamentally a shortest path search problem over string states.

Approaches

Brute Force Approach

A brute force solution would try every possible sequence of swaps until s1 becomes s2. At each step, we could generate all strings obtainable by swapping every pair of indices.

This approach is correct because it eventually explores every reachable configuration, so it must eventually discover the minimum number of swaps.

However, the number of possible states grows explosively. A string of length n has up to n! permutations, and each state can generate O(n²) new states from swaps. Even with n = 20, this is completely infeasible.

The brute force method becomes too slow because it wastes time exploring states that clearly move away from the target or repeat already visited configurations.

Optimal Approach, Breadth-First Search with Pruning

The key insight is that we only care about swaps that immediately fix a mismatch.

Suppose we scan the current string and find the first index i where it differs from s2. Any optimal solution must eventually place the correct character s2[i] into this position.

Therefore, instead of trying all swaps, we only swap index i with positions j where:

current[j] == s2[i]
current[j] != s2[j]

This dramatically reduces branching.

We then use Breadth-First Search (BFS):

Each node is a string configuration
Each edge represents one swap
BFS guarantees the first time we reach s2, we used the minimum number of swaps

Because BFS explores states level by level, the first successful transformation is optimal.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n! × n²)	O(n!)	Explores nearly all permutations
Optimal BFS with Pruning	O(n! / pruning)	O(n!)	Only explores useful swaps

The optimal approach still has exponential worst case behavior, but the pruning strategy reduces the practical search space dramatically enough to pass the constraints.

Algorithm Walkthrough

Start by checking whether s1 already equals s2. If so, return 0 immediately because no swaps are needed.
Initialize a BFS queue with the starting string s1 and a swap count of 0.
Create a visited set to avoid revisiting the same string configuration multiple times. This prevents infinite loops and redundant work.
Repeatedly remove the front state from the queue.
For the current string, scan from left to right and find the first index i where the current string differs from s2.
The character at position i must eventually become s2[i]. Therefore, search for indices j > i such that:

current[j] == s2[i]
current[j] != s2[j]

The second condition avoids breaking characters that are already correct. 7. For every valid j, create a new string by swapping characters at positions i and j. 8. If the new string equals s2, return the current swap count plus one. 9. Otherwise, if the new string has not been visited before:

Add it to visited
Push it into the BFS queue with swap count incremented by one

Continue until BFS reaches the target string.

Why it works

At every BFS level, all states represent strings reachable with the same number of swaps. Since BFS explores level by level, the first time we encounter s2 must correspond to the minimum number of swaps.

The pruning rule is also safe. Any optimal solution must eventually fix the first mismatched position, so only considering swaps that place the correct character there cannot exclude the optimal path.

Python Solution

from collections import deque
from typing import Set

class Solution:
    def kSimilarity(self, s1: str, s2: str) -> int:
        if s1 == s2:
            return 0

        queue = deque([(s1, 0)])
        visited: Set[str] = {s1}

        while queue:
            current, swaps = queue.popleft()

            i = 0
            while current[i] == s2[i]:
                i += 1

            current_list = list(current)

            for j in range(i + 1, len(current)):
                if (
                    current[j] == s2[i]
                    and current[j] != s2[j]
                ):
                    next_list = current_list[:]
                    next_list[i], next_list[j] = (
                        next_list[j],
                        next_list[i],
                    )

                    next_string = "".join(next_list)

                    if next_string == s2:
                        return swaps + 1

                    if next_string not in visited:
                        visited.add(next_string)
                        queue.append((next_string, swaps + 1))

        return -1

The implementation begins with a quick equality check. This handles the simplest case efficiently.

The BFS queue stores tuples containing:

The current string state
The number of swaps used to reach it

The visited set prevents duplicate exploration.

Inside the BFS loop, the algorithm finds the first mismatched index i. This is the key pruning optimization. Instead of trying arbitrary swaps, we only attempt swaps that place the correct character into position i.

The string is temporarily converted into a list because Python strings are immutable. After performing a swap, the list is converted back into a string.

Whenever a new valid state is generated, it is either returned immediately if it matches s2, or added to the BFS frontier for further exploration.

Go Solution

package main

import (
	"container/list"
)

type State struct {
	str   string
	swaps int
}

func kSimilarity(s1 string, s2 string) int {
	if s1 == s2 {
		return 0
	}

	queue := list.New()
	queue.PushBack(State{s1, 0})

	visited := make(map[string]bool)
	visited[s1] = true

	for queue.Len() > 0 {
		front := queue.Front()
		queue.Remove(front)

		current := front.Value.(State)

		i := 0
		for current.str[i] == s2[i] {
			i++
		}

		currentBytes := []byte(current.str)

		for j := i + 1; j < len(current.str); j++ {
			if current.str[j] == s2[i] &&
				current.str[j] != s2[j] {

				nextBytes := make([]byte, len(currentBytes))
				copy(nextBytes, currentBytes)

				nextBytes[i], nextBytes[j] =
					nextBytes[j], nextBytes[i]

				nextString := string(nextBytes)

				if nextString == s2 {
					return current.swaps + 1
				}

				if !visited[nextString] {
					visited[nextString] = true
					queue.PushBack(
						State{
							nextString,
							current.swaps + 1,
						},
					)
				}
			}
		}
	}

	return -1
}

The Go implementation follows the same BFS logic as the Python version.

Since Go strings are immutable, the code converts strings into byte slices before swapping characters. A copy of the slice is created before each modification so that different BFS branches do not interfere with each other.

The BFS queue is implemented using container/list, which provides efficient front removal operations.

The visited structure is implemented with a map[string]bool.

Worked Examples

Example 1

Input:

s1 = "ab"
s2 = "ba"

Initial queue:

Current String	Swaps
"ab"	0

The first mismatch occurs at index 0:

Index	Current	Target
0	a	b

We need to place 'b' into position 0.

Search for a valid swap:

j	current[j]	Valid?
1	b	Yes

Swap positions 0 and 1:

"ab" -> "ba"

This matches the target string.

Answer:

Example 2

Input:

s1 = "abc"
s2 = "bca"

Initial queue:

Current String	Swaps
"abc"	0

First mismatch:

Index	Current	Target
0	a	b

We need 'b' at index 0.

Possible swap:

j	current[j]
1	b

Swap:

"abc" -> "bac"

Queue becomes:

Current String	Swaps
"bac"	1

Now process "bac".

First mismatch:

Index	Current	Target
1	a	c

We need 'c' at index 1.

Possible swap:

j	current[j]
2	c

Swap:

"bac" -> "bca"

Target reached.

Answer:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n! / pruning)	BFS explores permutations, but pruning greatly reduces branching
Space	O(n!)	Visited states and BFS queue may store many permutations

Although the theoretical upper bound remains exponential, the pruning strategy is extremely effective in practice. At every step, we only explore swaps that directly improve the current mismatch, which dramatically cuts the search tree size.

Because the string length is at most 20 and the alphabet is very small, this optimized BFS is fast enough for all valid inputs.

Test Cases

solution = Solution()

assert solution.kSimilarity("ab", "ba") == 1
# Single swap needed

assert solution.kSimilarity("abc", "bca") == 2
# Requires two sequential swaps

assert solution.kSimilarity("abc", "abc") == 0
# Already equal

assert solution.kSimilarity("aabc", "abca") == 2
# Duplicate characters

assert solution.kSimilarity("abac", "baca") == 2
# Multiple valid swap paths

assert solution.kSimilarity("abcdef", "fabcde") == 5
# Long cyclic rotation

assert solution.kSimilarity("aaabbb", "bbbaaa") == 3
# Repeated characters with grouped swaps

assert solution.kSimilarity("abcde", "eabcd") == 4
# Rotation requiring multiple swaps

assert solution.kSimilarity("abab", "baba") == 2
# Duplicate alternating characters

assert solution.kSimilarity("a", "a") == 0
# Minimum length edge case

Test Summary

Test	Why
`"ab" -> "ba"`	Simplest nontrivial swap
`"abc" -> "bca"`	Multi-step transformation
`"abc" -> "abc"`	Already matching strings
`"aabc" -> "abca"`	Duplicate character handling
`"abac" -> "baca"`	Multiple possible BFS paths
`"abcdef" -> "fabcde"`	Longer permutation
`"aaabbb" -> "bbbaaa"`	Repeated grouped characters
`"abcde" -> "eabcd"`	Rotation pattern
`"abab" -> "baba"`	Alternating duplicates
`"a" -> "a"`	Smallest valid input

Edge Cases

One important edge case occurs when the strings are already equal. A careless BFS implementation might still enqueue states and perform unnecessary work. The implementation handles this immediately with an early return of 0.

Another tricky case involves duplicate characters. For example, transforming "aabc" into "abca" can produce many equivalent swap sequences. Without careful pruning and a visited set, the algorithm could repeatedly explore the same states. The implementation avoids this by storing every visited string configuration.

A third important edge case is when multiple swap candidates exist for the same mismatch. Some swaps may appear useful but actually increase future work. BFS guarantees correctness here because it explores all states with the same swap count before moving deeper. Even if one branch is suboptimal, the optimal path will still be discovered first.

A final subtle edge case involves swapping characters that are already in their correct positions. Doing so can undo previous progress and explode the search space. The condition current[j] != s2[j] prevents these unnecessary swaps and significantly improves performance while preserving correctness.