LeetCode 3085 - Minimum Deletions to Make String K-Special

The problem asks us to transform a given string word into a k-special string using the minimum number of deletions. A string is considered k-special if, for every pair of characters in the string, the difference in their frequencies does not exceed k.

LeetCode Problem 3085

Difficulty: 🟡 Medium
Topics: Hash Table, String, Greedy, Sorting, Counting

Solution

Problem Understanding

The problem asks us to transform a given string word into a k-special string using the minimum number of deletions. A string is considered k-special if, for every pair of characters in the string, the difference in their frequencies does not exceed k. In simpler terms, all characters' counts must be roughly balanced, with a tolerance of k.

The input consists of a string of lowercase English letters and an integer k. The output is a single integer representing the fewest deletions required to satisfy the k-special condition.

Constraints tell us that the string can be very long, up to 10^5 characters, and k can also be large, up to 10^5. This rules out brute-force solutions that would attempt to check all possible subsets of characters. The input guarantees only lowercase letters, so we can work with a fixed-size frequency table of 26 elements, which simplifies the frequency counting.

Important edge cases include:

  • k = 0, which requires all character frequencies to be equal.
  • Strings where all characters are the same, where no deletions may be required.
  • Extremely unbalanced strings, like "aaaaaaaab" with k = 1, which force significant deletions.

Approaches

Brute Force

A brute-force approach would attempt to enumerate all subsets of deletions and check if the resulting string is k-special. This would require generating exponential possibilities in terms of deletions, making it infeasible for strings of length up to 10^5. It is correct in theory because it explores all options but far too slow in practice.

Optimal Approach

The key observation is that the problem is fundamentally about balancing frequencies. We can first count the frequency of each character and then focus on choosing a range of allowed frequencies [min_freq, max_freq] that satisfy the k-special condition, i.e., max_freq - min_freq <= k. For each candidate max_freq, we can delete characters whose frequency is too high or too low to fall within the allowed range. By iterating through all feasible max_freq values (from 1 to the maximum character frequency in the string), we can efficiently find the minimum number of deletions required.

This approach works because the problem reduces to selecting a contiguous range of frequencies rather than dealing with individual character positions. Sorting or frequency counting lets us compute deletions in linear time with respect to the number of unique characters.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n) O(n) Enumerates all deletion combinations; infeasible for n = 10^5
Optimal O(n + F^2) O(1) Count frequencies, iterate over frequency ranges; F is max frequency, n is string length

Algorithm Walkthrough

  1. Count the frequency of each character in word using a fixed-size array of length 26.
  2. Extract the non-zero frequencies and sort them in ascending order.
  3. Determine the maximum frequency max_freq in the list.
  4. Iterate target_max from 1 to max_freq. For each target_max:
  • Calculate the allowed minimum frequency target_min = target_max - k.

  • For each frequency f in the sorted list:

  • If f > target_max, increment deletions by f - target_max.

  • If f < target_min, increment deletions by f.

  1. Track the minimum deletions across all target_max iterations.
  2. Return the minimum deletions found.

Why it works: The algorithm explores all possible frequency ranges that satisfy the k-special property. By explicitly calculating the deletions needed for each range, it guarantees the minimal total deletions.

Python Solution

class Solution:
    def minimumDeletions(self, word: str, k: int) -> int:
        from collections import Counter
        
        freq = list(Counter(word).values())
        freq.sort()
        if k >= max(freq):
            return 0
        
        min_deletions = float('inf')
        max_freq = max(freq)
        
        for target_max in range(1, max_freq + 1):
            target_min = max(target_max - k, 0)
            deletions = 0
            for f in freq:
                if f > target_max:
                    deletions += f - target_max
                elif f < target_min:
                    deletions += f
            min_deletions = min(min_deletions, deletions)
        
        return min_deletions

The code first counts character frequencies and sorts them. It then iterates over potential maximum allowed frequencies, computes the deletions for frequencies outside the allowed [target_min, target_max] range, and returns the minimum total deletions across all ranges.

Go Solution

import "sort"

func minimumDeletions(word string, k int) int {
    freqMap := make(map[rune]int)
    for _, ch := range word {
        freqMap[ch]++
    }
    
    freq := make([]int, 0, len(freqMap))
    maxFreq := 0
    for _, f := range freqMap {
        freq = append(freq, f)
        if f > maxFreq {
            maxFreq = f
        }
    }
    sort.Ints(freq)
    
    minDeletions := len(word)
    
    for targetMax := 1; targetMax <= maxFreq; targetMax++ {
        targetMin := targetMax - k
        if targetMin < 0 {
            targetMin = 0
        }
        deletions := 0
        for _, f := range freq {
            if f > targetMax {
                deletions += f - targetMax
            } else if f < targetMin {
                deletions += f
            }
        }
        if deletions < minDeletions {
            minDeletions = deletions
        }
    }
    
    return minDeletions
}

The Go implementation mirrors the Python version, using a map for frequency counts and iterating over sorted frequencies. Edge cases with targetMin < 0 are handled to avoid negative deletions.

Worked Examples

Example 1: "aabcaba", k = 0

Step Frequencies target_max target_min Deletions
Initial [1, 2, 4] 2 2 3
Final answer: 3 deletions.

Example 2: "dabdcbdcdcd", k = 2

Step Frequencies target_max target_min Deletions
Sorted [1,2,3,4] 4 2 2
Final answer: 2 deletions.

Example 3: "aaabaaa", k = 2

Step Frequencies target_max target_min Deletions
Sorted [1,6] 6 4 1
Final answer: 1 deletion.

Complexity Analysis

Measure Complexity Explanation
Time O(n + F^2) Counting frequencies is O(n), iterating over frequency ranges up to max frequency F is O(F^2)
Space O(1) Only 26 character frequencies need to be stored, independent of n

Since F <= n, this approach is efficient for n up to 10^5.

Test Cases

# test cases
sol = Solution()
assert sol.minimumDeletions("aabcaba", 0) == 3  # Example 1
assert sol.minimumDeletions("dabdcbdcdcd", 2) == 2  # Example 2
assert sol.minimumDeletions("aaabaaa", 2) == 1  # Example 3
assert sol.minimumDeletions("aaaaaaa", 0) == 0  # All same letters
assert sol.minimumDeletions("abc", 0) == 2  # All different letters, k=0
assert sol.minimumDeletions("aabbcc", 1) == 0  # Already 1-special
assert sol.minimumDeletions("abcd", 10) == 0  # k large enough, no deletions
assert sol.minimumDeletions("a"*100000 + "b"*100000, 0) == 100000  # Large input
Test Why
"aabcaba", 0 Example input, requires deletions to balance
"dabdcbdcdcd", 2 Example with k > 0
"aaabaaa", 2 Example with small deletions
"aaaaaaa", 0 Edge case, no deletions needed
"abc", 0 Edge case, all different, minimal length
"aabbcc", 1 Already k-special
"abcd", 10 k large, no deletions required
Large input Tests efficiency for n = 2*10^5

Edge Cases

One edge case is when all characters are identical. This is trivial, as no deletions are needed regardless of k. Our algorithm handles this by computing frequencies and seeing that all frequencies already fall into a valid range.

Another edge case is when k is zero. This forces all frequencies