LeetCode 1347 - Minimum Number of Steps to Make Two Strings Anagram
This problem asks us to determine the minimum number of character replacements needed to transform string t into an anagram of string s. Two strings are anagrams if they contain exactly the same characters with the same frequencies, regardless of order.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Counting
Solution
Problem Understanding
This problem asks us to determine the minimum number of character replacements needed to transform string t into an anagram of string s.
Two strings are anagrams if they contain exactly the same characters with the same frequencies, regardless of order. Since both strings are guaranteed to have the same length, we never need to insert or delete characters. The only allowed operation is replacing one character in t with another character.
The key observation is that we do not care about character positions. We only care about how many times each character appears in both strings.
For example, consider:
s = "bab"
t = "aba"
The frequency counts are:
s: a -> 1, b -> 2
t: a -> 2, b -> 1
String t has one extra 'a' and is missing one 'b'. Replacing one 'a' with 'b' fixes the imbalance, so the answer is 1.
The constraints are important:
1 <= s.length <= 5 * 10^4- Only lowercase English letters appear
The input size is large enough that inefficient approaches, especially factorial or quadratic rearrangement-based methods, would be too slow. However, the alphabet size is fixed at 26 lowercase letters, which strongly suggests a counting or frequency-based solution.
Some important edge cases include:
- The strings are already anagrams, so no replacements are needed.
- One string contains entirely different characters from the other.
- Very small inputs, such as strings of length
1. - Large inputs with repeated characters, where efficient counting becomes important.
The problem guarantees that both strings have the same length and contain only lowercase English letters, so we do not need to handle mismatched lengths or invalid characters.
Approaches
Brute Force Approach
A naive way to solve the problem would be to repeatedly search for mismatched characters between the two strings and simulate replacements.
One possible brute-force strategy is:
- Count frequencies of characters in
s. - Traverse
t. - If a character in
texists in the needed frequency fors, use it. - Otherwise, mark it for replacement.
An even worse brute-force interpretation would involve generating permutations or trying all possible replacement combinations until an anagram is formed. Since a string of length n has n! permutations, this quickly becomes computationally impossible.
Even more practical brute-force variants still involve repeated searching and updates that may lead to O(n^2) behavior.
Although these approaches eventually find the correct answer, they are unnecessarily expensive because the problem depends only on character frequencies.
Optimal Approach
The optimal solution uses character counting.
The important insight is:
- Every character that appears too many times in
tmust eventually be replaced. - Every character missing from
tcompared tosrequires a replacement. - These two quantities are always equal because the strings have the same length.
We can count how many times each character appears in both strings and compare the frequencies.
For every character:
missing = frequency_in_s - frequency_in_t
If this value is positive, then t is missing that many copies of the character, and we must perform that many replacements.
Since there are only 26 lowercase letters, this comparison is extremely efficient.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) or worse | O(1) to O(n) | Repeated searching or simulation of replacements |
| Optimal | O(n) | O(1) | Uses frequency counting for 26 lowercase letters |
Algorithm Walkthrough
Optimal Frequency Counting Algorithm
- Create a frequency counter for all characters in
s.
This counter tells us how many times each character should appear in the final anagram.
2. Create another frequency counter for all characters in t.
This counter tells us the current state of t.
3. Compare the frequencies of all 26 lowercase letters.
For each character:
- If
scontains more copies thant, thentis missing some occurrences. - Each missing occurrence requires exactly one replacement.
- Sum all positive differences.
Specifically:
answer += max(0, count_s - count_t)
- Return the accumulated total.
Why it works
The algorithm works because an anagram requires identical character frequencies.
If t lacks a certain number of characters compared to s, each missing character must be created through a replacement operation. Since every replacement simultaneously removes an extra character and creates a missing one, the total number of missing characters exactly equals the minimum number of operations required.
Python Solution
class Solution:
def minSteps(self, s: str, t: str) -> int:
s_count = [0] * 26
t_count = [0] * 26
for char in s:
s_count[ord(char) - ord('a')] += 1
for char in t:
t_count[ord(char) - ord('a')] += 1
steps = 0
for i in range(26):
if s_count[i] > t_count[i]:
steps += s_count[i] - t_count[i]
return steps
The implementation uses two fixed-size arrays of length 26 to store character frequencies.
The first loop counts characters in s. Each character is converted into an index between 0 and 25 using ASCII arithmetic.
The second loop counts characters in t in the same way.
After building both frequency arrays, the algorithm compares corresponding counts. Whenever s contains more occurrences of a character than t, the difference represents missing characters that must be introduced through replacements.
The final answer is the sum of all such deficits.
Because the alphabet size is fixed, the comparison loop always runs exactly 26 times.
Go Solution
func minSteps(s string, t string) int {
sCount := make([]int, 26)
tCount := make([]int, 26)
for _, ch := range s {
sCount[ch-'a']++
}
for _, ch := range t {
tCount[ch-'a']++
}
steps := 0
for i := 0; i < 26; i++ {
if sCount[i] > tCount[i] {
steps += sCount[i] - tCount[i]
}
}
return steps
}
The Go implementation follows the same logic as the Python version.
Instead of Python lists, Go uses integer slices of size 26. Characters are processed as runes, and indexing is done using 'a' offsets.
There are no concerns about integer overflow because the maximum string length is only 5 * 10^4.
Go slices are initialized with zero values automatically, so no additional setup is needed for the frequency arrays.
Worked Examples
Example 1
s = "bab"
t = "aba"
Step 1: Count frequencies
| Character | Count in s | Count in t |
|---|---|---|
| a | 1 | 2 |
| b | 2 | 1 |
Step 2: Compute deficits
| Character | Needed Difference |
|---|---|
| a | max(0, 1 - 2) = 0 |
| b | max(0, 2 - 1) = 1 |
Final Answer
steps = 1
One replacement is needed.
Example 2
s = "leetcode"
t = "practice"
Frequency Comparison
| Character | Count in s | Count in t | Missing in t |
|---|---|---|---|
| a | 0 | 1 | 0 |
| c | 1 | 2 | 0 |
| d | 1 | 0 | 1 |
| e | 3 | 1 | 2 |
| l | 1 | 0 | 1 |
| o | 1 | 0 | 1 |
| p | 0 | 1 | 0 |
| r | 0 | 1 | 0 |
| t | 1 | 1 | 0 |
| i | 0 | 1 | 0 |
Total
1 + 2 + 1 + 1 = 5
So the answer is:
5
Example 3
s = "anagram"
t = "mangaar"
Frequency Comparison
Both strings contain:
a -> 3
n -> 1
g -> 1
r -> 1
m -> 1
All frequencies match.
Final Answer
0
No replacements are required.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each string is traversed once |
| Space | O(1) | Frequency arrays always contain exactly 26 elements |
The algorithm processes every character in both strings exactly once, resulting in linear time complexity.
The space complexity is constant because the frequency arrays never grow with input size. Regardless of whether the strings contain 10 characters or 50,000 characters, the algorithm only stores counts for 26 lowercase English letters.
Test Cases
solution = Solution()
assert solution.minSteps("bab", "aba") == 1 # Basic replacement case
assert solution.minSteps("leetcode", "practice") == 5 # Multiple replacements
assert solution.minSteps("anagram", "mangaar") == 0 # Already anagrams
assert solution.minSteps("a", "b") == 1 # Single character mismatch
assert solution.minSteps("a", "a") == 0 # Single character equal
assert solution.minSteps("abc", "def") == 3 # Completely different strings
assert solution.minSteps("aaa", "bbb") == 3 # All characters must change
assert solution.minSteps("xxyyzz", "zzxxyy") == 0 # Same frequencies, different order
assert solution.minSteps("friend", "family") == 4 # Mixed overlaps and mismatches
assert solution.minSteps("aaaaaaaaaa", "aaaaaaaaaa") == 0 # Large repeated identical chars
assert solution.minSteps("aaaaaaaaaa", "bbbbbbbbbb") == 10 # Large repeated mismatch
assert solution.minSteps("abcde", "edcba") == 0 # Reverse order anagram
assert solution.minSteps("aaabbbccc", "abcabcabc") == 0 # Same counts arranged differently
| Test | Why |
|---|---|
"bab", "aba" |
Validates basic replacement logic |
"leetcode", "practice" |
Tests multiple deficits across characters |
"anagram", "mangaar" |
Confirms already-anagram case |
"a", "b" |
Smallest non-matching input |
"a", "a" |
Smallest matching input |
"abc", "def" |
Every character must change |
"aaa", "bbb" |
All repeated characters mismatch |
"xxyyzz", "zzxxyy" |
Order should not matter |
"friend", "family" |
Partial overlap between strings |
"aaaaaaaaaa", "aaaaaaaaaa" |
Large repeated matching characters |
"aaaaaaaaaa", "bbbbbbbbbb" |
Large repeated mismatch |
"abcde", "edcba" |
Reverse-order anagram |
"aaabbbccc", "abcabcabc" |
Equal frequencies with different arrangement |
Edge Cases
Strings Already Form Anagrams
One important edge case occurs when s and t already contain identical character frequencies.
For example:
s = "listen"
t = "silent"
A buggy implementation might incorrectly count positional differences and conclude that replacements are needed. However, the problem only cares about frequency counts, not ordering.
The frequency-based solution handles this correctly because all character counts match, producing a result of 0.
Completely Different Strings
Another important case is when every character differs.
Example:
s = "aaa"
t = "bbb"
Here, all three 'b' characters in t must be replaced with 'a'.
A naive implementation that tries to greedily match characters could become unnecessarily complicated. The counting approach immediately detects that t is missing three 'a' characters, so the answer is 3.
Large Inputs With Heavy Repetition
Consider inputs near the maximum constraint size:
s = "a" repeated 50,000 times
t = "b" repeated 50,000 times
An inefficient algorithm that repeatedly searches or modifies strings could become too slow.
The optimal solution remains efficient because it only performs:
- One pass through
s - One pass through
t - One fixed-size comparison loop over 26 characters
This guarantees excellent performance even for the largest valid inputs.