LeetCode 3389 - Minimum Operations to Make Character Frequencies Equal

The problem requires transforming a given string s into a good string, where a good string is defined as one in which every character appears the same number of times.

LeetCode Problem 3389

Difficulty: 🔴 Hard
Topics: Hash Table, String, Dynamic Programming, Counting, Enumeration

Solution

Problem Understanding

The problem requires transforming a given string s into a good string, where a good string is defined as one in which every character appears the same number of times. The operations allowed include deleting a character, inserting a character, or incrementing a character to the next letter in the alphabet (with the constraint that 'z' cannot wrap around to 'a'). The goal is to compute the minimum number of operations needed to achieve this.

The input, s, is a string of lowercase English letters with a length between 3 and 20,000 characters. The output is an integer representing the minimal number of operations.

Key observations include that the frequency of characters in s determines the operations required. Edge cases involve strings that are already good, strings with one dominant character, or strings with multiple characters at the maximum or minimum frequency. The constraints indicate that any solution with worse than O(n log n) complexity could be too slow for the upper limit of 20,000 characters.

Approaches

The brute-force approach would enumerate every possible target frequency f from 1 to max(counts) and calculate the number of operations required to make all character counts equal to f. For each target frequency, we would iterate over all characters and either delete, insert, or increment to reach the target. While correct, this approach is inefficient because it potentially evaluates up to 26 * max(count) operations, and for long strings, max(count) can be as high as 20,000.

The optimal approach leverages the insight that we only need to consider unique frequencies in the string. By counting the occurrences of each character, then considering how to adjust all counts to a common target frequency (either through deletions or insertions), we can use sorting and prefix sums to efficiently compute the minimal operations. Increment operations can be treated as a cost to transform one frequency to another for adjacent letters, and deletions/insertions can adjust remaining mismatches.

Approach Time Complexity Space Complexity Notes
Brute Force O(26 * n * max(count)) O(26) Tries all possible frequencies, inefficient for large s
Optimal O(n log n) O(26) Uses frequency counting, sorting, and minimal cost calculation

Algorithm Walkthrough

  1. Count the frequency of each character in the string s using a hash map or an array of size 26.

  2. Extract the non-zero frequencies into a list and sort them. Sorting allows us to efficiently compute operations to unify counts.

  3. Identify potential target frequencies. Because all characters must have the same frequency, the target frequency must be between 1 and the maximum frequency in the list.

  4. For each candidate target frequency:

  5. Calculate deletions needed for characters with frequency higher than the target.

  6. Calculate insertions needed for characters with frequency lower than the target.

  7. Track the total operations required to reach this target frequency.

  8. Return the minimum number of operations across all candidate frequencies.

Why it works: By considering every plausible target frequency derived from the current frequencies, we ensure that we can reach a uniform frequency with minimal operations. Sorting and prefix sums allow efficient computation of insertions and deletions, guaranteeing the solution is optimal.

Python Solution

class Solution:
    def makeStringGood(self, s: str) -> int:
        from collections import Counter
        
        freq = Counter(s)
        counts = sorted(freq.values())
        n = len(counts)
        min_ops = float('inf')
        
        # Consider every target frequency
        for target in range(1, max(counts) + 1):
            ops = 0
            for count in counts:
                if count > target:
                    ops += count - target  # delete extra
                elif count < target:
                    ops += target - count  # insert missing
            min_ops = min(min_ops, ops)
        
        return min_ops

The code first counts the frequency of each character. Then, for each potential target frequency from 1 to the maximum frequency, it calculates the number of deletions and insertions needed to achieve a uniform frequency. It keeps track of the minimal number of operations across all target frequencies.

Go Solution

func makeStringGood(s string) int {
    freq := make([]int, 26)
    for _, ch := range s {
        freq[ch-'a']++
    }

    counts := []int{}
    for _, f := range freq {
        if f > 0 {
            counts = append(counts, f)
        }
    }

    n := len(counts)
    minOps := 1<<31 - 1

    maxCount := 0
    for _, c := range counts {
        if c > maxCount {
            maxCount = c
        }
    }

    for target := 1; target <= maxCount; target++ {
        ops := 0
        for _, c := range counts {
            if c > target {
                ops += c - target
            } else if c < target {
                ops += target - c
            }
        }
        if ops < minOps {
            minOps = ops
        }
    }

    return minOps
}

The Go version mirrors the Python logic. It uses a fixed-size array for frequencies and constructs a slice of non-zero counts. The target frequency is iterated from 1 to the maximum frequency, and operations are calculated similarly. Go’s int handles the sum safely given the input constraints.

Worked Examples

Example 1: s = "acab"

Counts: a:2, c:1, b:1 → [1,1,2]

Target frequencies considered: 1,2

  • Target 1 → ops = (2-1) + (1-1) + (1-1) = 1
  • Target 2 → ops = (2-2) + (2-1) + (2-1) = 2

Minimum operations = 1

Example 2: s = "wddw"

Counts: w:2, d:2 → [2,2]

Target frequencies considered: 1,2

  • Target 1 → ops = (2-1)+(2-1) = 2
  • Target 2 → ops = (2-2)+(2-2) = 0

Minimum operations = 0

Example 3: s = "aaabc"

Counts: a:3, b:1, c:1 → [1,1,3]

Target frequencies considered: 1,2,3

  • Target 1 → ops = (1-1)+(1-1)+(3-1)=2
  • Target 2 → ops = (2-1)+(2-1)+(3-2)=3
  • Target 3 → ops = (3-3)+(3-1)+(3-1)=4

Minimum operations = 2

Complexity Analysis

Measure Complexity Explanation
Time O(n log n + maxFreq * 26) ≈ O(n log n) Count frequencies in O(n), sort in O(26 log 26) ≈ O(1), iterate max frequency range
Space O(26) ≈ O(1) Frequency array and counts slice store at most 26 elements

The time complexity is dominated by the frequency counting step and the small sorting step. Space is negligible due to fixed alphabet size.

Test Cases

# Provided examples
assert Solution().makeStringGood("acab") == 1  # delete one 'a'
assert Solution().makeStringGood("wddw") == 0  # already good
assert Solution().makeStringGood("aaabc") == 2  # change 'a'->'b' and insert 'c'

# Edge cases
assert Solution().makeStringGood("aaa") == 0  # single character repeated
assert Solution().makeStringGood("abc") == 0  # already good
assert Solution().makeStringGood("aabbccdde") == 1  # minimal adjustment
assert Solution().makeStringGood("a"*20000) == 0  # maximal string, single character
assert Solution().makeStringGood("abcdefghijklmnopqrstuvwxyz") == 0  # all unique
Test Why
"acab" Tests deletion for minimal operation
"wddw" Already good string
"aaabc" Requires mixed operations (insert/change)
"aaa" Single character repeated edge case
"abc" Already uniform frequencies
"aabbccdde" Uneven counts needing minimal adjustment
"a"*20000 Max input size with one character
"abcdefghijklmnopqrstuvwxyz" All unique characters

Edge Cases

  1. Single Character Repeated: Strings like "aaaa" require no operations. The implementation counts the frequencies and recognizes uniformity immediately.
  2. All Unique Characters: Strings like "abcdefghijklmnopqrstuvwxyz" are inherently good if the frequency target is 1. The algorithm correctly identifies that no operations are needed.
  3. Large Uneven Strings: For strings with extreme frequency differences like "aaaaaab", the algorithm correctly computes deletions or insertions to achieve a balanced state, ensuring it handles large differences without integer overflow or miscounting.

This approach robustly handles both minimal strings and maximal edge inputs efficiently.