LeetCode 1347 - Minimum Number of Steps to Make Two Strings Anagram

This problem asks us to determine the minimum number of character replacements needed to transform string t into an anagram of string s. Two strings are anagrams if they contain exactly the same characters with the same frequencies, regardless of order.

LeetCode Problem 1347

Difficulty: 🟡 Medium
Topics: Hash Table, String, Counting

Solution

Problem Understanding

This problem asks us to determine the minimum number of character replacements needed to transform string t into an anagram of string s.

Two strings are anagrams if they contain exactly the same characters with the same frequencies, regardless of order. Since both strings are guaranteed to have the same length, we never need to insert or delete characters. The only allowed operation is replacing one character in t with another character.

The key observation is that we do not care about character positions. We only care about how many times each character appears in both strings.

For example, consider:

s = "bab"
t = "aba"

The frequency counts are:

s: a -> 1, b -> 2
t: a -> 2, b -> 1

String t has one extra 'a' and is missing one 'b'. Replacing one 'a' with 'b' fixes the imbalance, so the answer is 1.

The constraints are important:

  • 1 <= s.length <= 5 * 10^4
  • Only lowercase English letters appear

The input size is large enough that inefficient approaches, especially factorial or quadratic rearrangement-based methods, would be too slow. However, the alphabet size is fixed at 26 lowercase letters, which strongly suggests a counting or frequency-based solution.

Some important edge cases include:

  • The strings are already anagrams, so no replacements are needed.
  • One string contains entirely different characters from the other.
  • Very small inputs, such as strings of length 1.
  • Large inputs with repeated characters, where efficient counting becomes important.

The problem guarantees that both strings have the same length and contain only lowercase English letters, so we do not need to handle mismatched lengths or invalid characters.

Approaches

Brute Force Approach

A naive way to solve the problem would be to repeatedly search for mismatched characters between the two strings and simulate replacements.

One possible brute-force strategy is:

  1. Count frequencies of characters in s.
  2. Traverse t.
  3. If a character in t exists in the needed frequency for s, use it.
  4. Otherwise, mark it for replacement.

An even worse brute-force interpretation would involve generating permutations or trying all possible replacement combinations until an anagram is formed. Since a string of length n has n! permutations, this quickly becomes computationally impossible.

Even more practical brute-force variants still involve repeated searching and updates that may lead to O(n^2) behavior.

Although these approaches eventually find the correct answer, they are unnecessarily expensive because the problem depends only on character frequencies.

Optimal Approach

The optimal solution uses character counting.

The important insight is:

  • Every character that appears too many times in t must eventually be replaced.
  • Every character missing from t compared to s requires a replacement.
  • These two quantities are always equal because the strings have the same length.

We can count how many times each character appears in both strings and compare the frequencies.

For every character:

missing = frequency_in_s - frequency_in_t

If this value is positive, then t is missing that many copies of the character, and we must perform that many replacements.

Since there are only 26 lowercase letters, this comparison is extremely efficient.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) or worse O(1) to O(n) Repeated searching or simulation of replacements
Optimal O(n) O(1) Uses frequency counting for 26 lowercase letters

Algorithm Walkthrough

Optimal Frequency Counting Algorithm

  1. Create a frequency counter for all characters in s.

This counter tells us how many times each character should appear in the final anagram. 2. Create another frequency counter for all characters in t.

This counter tells us the current state of t. 3. Compare the frequencies of all 26 lowercase letters.

For each character:

  • If s contains more copies than t, then t is missing some occurrences.
  • Each missing occurrence requires exactly one replacement.
  1. Sum all positive differences.

Specifically:

answer += max(0, count_s - count_t)
  1. Return the accumulated total.

Why it works

The algorithm works because an anagram requires identical character frequencies.

If t lacks a certain number of characters compared to s, each missing character must be created through a replacement operation. Since every replacement simultaneously removes an extra character and creates a missing one, the total number of missing characters exactly equals the minimum number of operations required.

Python Solution

class Solution:
    def minSteps(self, s: str, t: str) -> int:
        s_count = [0] * 26
        t_count = [0] * 26

        for char in s:
            s_count[ord(char) - ord('a')] += 1

        for char in t:
            t_count[ord(char) - ord('a')] += 1

        steps = 0

        for i in range(26):
            if s_count[i] > t_count[i]:
                steps += s_count[i] - t_count[i]

        return steps

The implementation uses two fixed-size arrays of length 26 to store character frequencies.

The first loop counts characters in s. Each character is converted into an index between 0 and 25 using ASCII arithmetic.

The second loop counts characters in t in the same way.

After building both frequency arrays, the algorithm compares corresponding counts. Whenever s contains more occurrences of a character than t, the difference represents missing characters that must be introduced through replacements.

The final answer is the sum of all such deficits.

Because the alphabet size is fixed, the comparison loop always runs exactly 26 times.

Go Solution

func minSteps(s string, t string) int {
    sCount := make([]int, 26)
    tCount := make([]int, 26)

    for _, ch := range s {
        sCount[ch-'a']++
    }

    for _, ch := range t {
        tCount[ch-'a']++
    }

    steps := 0

    for i := 0; i < 26; i++ {
        if sCount[i] > tCount[i] {
            steps += sCount[i] - tCount[i]
        }
    }

    return steps
}

The Go implementation follows the same logic as the Python version.

Instead of Python lists, Go uses integer slices of size 26. Characters are processed as runes, and indexing is done using 'a' offsets.

There are no concerns about integer overflow because the maximum string length is only 5 * 10^4.

Go slices are initialized with zero values automatically, so no additional setup is needed for the frequency arrays.

Worked Examples

Example 1

s = "bab"
t = "aba"

Step 1: Count frequencies

Character Count in s Count in t
a 1 2
b 2 1

Step 2: Compute deficits

Character Needed Difference
a max(0, 1 - 2) = 0
b max(0, 2 - 1) = 1

Final Answer

steps = 1

One replacement is needed.

Example 2

s = "leetcode"
t = "practice"

Frequency Comparison

Character Count in s Count in t Missing in t
a 0 1 0
c 1 2 0
d 1 0 1
e 3 1 2
l 1 0 1
o 1 0 1
p 0 1 0
r 0 1 0
t 1 1 0
i 0 1 0

Total

1 + 2 + 1 + 1 = 5

So the answer is:

5

Example 3

s = "anagram"
t = "mangaar"

Frequency Comparison

Both strings contain:

a -> 3
n -> 1
g -> 1
r -> 1
m -> 1

All frequencies match.

Final Answer

0

No replacements are required.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each string is traversed once
Space O(1) Frequency arrays always contain exactly 26 elements

The algorithm processes every character in both strings exactly once, resulting in linear time complexity.

The space complexity is constant because the frequency arrays never grow with input size. Regardless of whether the strings contain 10 characters or 50,000 characters, the algorithm only stores counts for 26 lowercase English letters.

Test Cases

solution = Solution()

assert solution.minSteps("bab", "aba") == 1  # Basic replacement case
assert solution.minSteps("leetcode", "practice") == 5  # Multiple replacements
assert solution.minSteps("anagram", "mangaar") == 0  # Already anagrams

assert solution.minSteps("a", "b") == 1  # Single character mismatch
assert solution.minSteps("a", "a") == 0  # Single character equal

assert solution.minSteps("abc", "def") == 3  # Completely different strings
assert solution.minSteps("aaa", "bbb") == 3  # All characters must change

assert solution.minSteps("xxyyzz", "zzxxyy") == 0  # Same frequencies, different order

assert solution.minSteps("friend", "family") == 4  # Mixed overlaps and mismatches

assert solution.minSteps("aaaaaaaaaa", "aaaaaaaaaa") == 0  # Large repeated identical chars
assert solution.minSteps("aaaaaaaaaa", "bbbbbbbbbb") == 10  # Large repeated mismatch

assert solution.minSteps("abcde", "edcba") == 0  # Reverse order anagram

assert solution.minSteps("aaabbbccc", "abcabcabc") == 0  # Same counts arranged differently
Test Why
"bab", "aba" Validates basic replacement logic
"leetcode", "practice" Tests multiple deficits across characters
"anagram", "mangaar" Confirms already-anagram case
"a", "b" Smallest non-matching input
"a", "a" Smallest matching input
"abc", "def" Every character must change
"aaa", "bbb" All repeated characters mismatch
"xxyyzz", "zzxxyy" Order should not matter
"friend", "family" Partial overlap between strings
"aaaaaaaaaa", "aaaaaaaaaa" Large repeated matching characters
"aaaaaaaaaa", "bbbbbbbbbb" Large repeated mismatch
"abcde", "edcba" Reverse-order anagram
"aaabbbccc", "abcabcabc" Equal frequencies with different arrangement

Edge Cases

Strings Already Form Anagrams

One important edge case occurs when s and t already contain identical character frequencies.

For example:

s = "listen"
t = "silent"

A buggy implementation might incorrectly count positional differences and conclude that replacements are needed. However, the problem only cares about frequency counts, not ordering.

The frequency-based solution handles this correctly because all character counts match, producing a result of 0.

Completely Different Strings

Another important case is when every character differs.

Example:

s = "aaa"
t = "bbb"

Here, all three 'b' characters in t must be replaced with 'a'.

A naive implementation that tries to greedily match characters could become unnecessarily complicated. The counting approach immediately detects that t is missing three 'a' characters, so the answer is 3.

Large Inputs With Heavy Repetition

Consider inputs near the maximum constraint size:

s = "a" repeated 50,000 times
t = "b" repeated 50,000 times

An inefficient algorithm that repeatedly searches or modifies strings could become too slow.

The optimal solution remains efficient because it only performs:

  • One pass through s
  • One pass through t
  • One fixed-size comparison loop over 26 characters

This guarantees excellent performance even for the largest valid inputs.