LeetCode 3784 - Minimum Deletion Cost to Make All Characters Equal
The problem asks us to minimize the total cost of deleting characters from a string such that the resulting string contains only one unique character.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Enumeration
Solution
Problem Understanding
The problem asks us to minimize the total cost of deleting characters from a string such that the resulting string contains only one unique character. We are given a string s of length n and an array cost of the same length, where cost[i] is the cost of deleting the character s[i]. The goal is to compute the minimum sum of costs required to delete characters so that all remaining characters in the string are identical.
In other words, we want to "collapse" each group of identical consecutive characters to a single character while paying the minimal deletion cost. We are guaranteed that the input size can be up to 105 characters, and each deletion cost can be as high as 10^9. This means a naive solution that checks all possible combinations of deletions would be too slow.
Important edge cases include:
- The string is already uniform, like
"zzzzz", where the cost should be0. - Characters alternate, like
"abab", where deletions must remove almost half the characters. - Cost values vary significantly, so we must carefully choose which characters to delete within consecutive duplicates.
Approaches
Brute Force
A brute-force approach would consider every possible subset of deletions, keeping track of whether the resulting string contains only one unique character and calculating the total cost. While this would yield the correct answer, the time complexity is exponential O(2^n) because there are 2^n subsets of characters, making it completely infeasible for n up to 105.
Optimal Approach
The key insight is that we only need to consider consecutive identical characters, because non-consecutive identical characters can always coexist in the final string without conflicts. Within a sequence of consecutive identical characters, to make all characters equal while minimizing deletion cost, we should keep the character with the maximum deletion cost and delete the rest. This ensures we pay the minimal cost for deletions in that sequence.
We can iterate through the string once, maintaining a running group of consecutive identical characters, sum their deletion costs, and subtract the maximum cost in that group to get the minimal deletion cost for that segment. Summing these costs across the string gives the final answer.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n) | O(n) | Check all subsets of deletions, too slow |
| Optimal | O(n) | O(1) | Iterate through string once, handle consecutive duplicates greedily |
Algorithm Walkthrough
- Initialize a variable
total_costto0to store the accumulated minimal deletion cost. - Start iterating through the string with an index
i. - For each consecutive group of identical characters, initialize
sum_costto0andmax_costto0. - While the current character is the same as the next character, add its deletion cost to
sum_costand updatemax_costif this character’s cost is larger. - Once the end of the consecutive group is reached, compute the deletion cost for this group as
sum_cost - max_costand add it tototal_cost. - Continue iterating until the end of the string.
- Return
total_cost.
Why it works: By always keeping the character with the highest deletion cost in a consecutive group and deleting the rest, we guarantee minimal cost for that group. Processing the string sequentially ensures that all groups are handled independently and efficiently, leading to the overall minimal deletion cost.
Python Solution
from typing import List
class Solution:
def minCost(self, s: str, cost: List[int]) -> int:
total_cost = 0
n = len(s)
i = 0
while i < n:
current_char = s[i]
sum_cost = cost[i]
max_cost = cost[i]
j = i + 1
while j < n and s[j] == current_char:
sum_cost += cost[j]
max_cost = max(max_cost, cost[j])
j += 1
total_cost += sum_cost - max_cost
i = j
return total_cost
The Python code initializes a running total of deletion costs, then iterates over consecutive groups of identical characters. For each group, it calculates the total cost of deleting all characters except the one with the maximum deletion cost. The outer loop ensures all characters are processed.
Go Solution
func minCost(s string, cost []int) int64 {
var totalCost int64 = 0
n := len(s)
i := 0
for i < n {
currentChar := s[i]
sumCost := int64(cost[i])
maxCost := int64(cost[i])
j := i + 1
for j < n && s[j] == currentChar {
sumCost += int64(cost[j])
if int64(cost[j]) > maxCost {
maxCost = int64(cost[j])
}
j++
}
totalCost += sumCost - maxCost
i = j
}
return totalCost
}
In Go, we handle integer overflow by using int64 since individual costs can be up to 10^9 and the total sum could exceed the 32-bit integer limit. The logic mirrors the Python solution.
Worked Examples
Example 1: s = "aabaac", cost = [1,2,3,4,1,10]
| Step | i | current_char | sum_cost | max_cost | total_cost |
|---|---|---|---|---|---|
| 1 | 0 | a | 1 | 1 | 0 |
| 2 | 1 | a | 3 | 2 | 1 (sum-cost - max = 3-2) |
| 3 | 3 | b | 4 | 4 | 1 (4-4=0 added) |
| 4 | 4 | a | 5 | 4 | 2 (5-4 added) |
| 5 | 5 | c | 10 | 10 | 11 (10-10 added) |
Example 2: s = "abc", cost = [10,5,8]
Each character is unique, so we must delete all except the largest cost (10). Deletion cost = 5 + 8 = 13.
Example 3: s = "zzzzz", cost = [67,67,67,67,67]
All characters identical, deletion cost = 0.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Iterate through the string once, processing each character exactly once. |
| Space | O(1) | Only constant extra variables are used for tracking sums and max costs. |
The linear time complexity is optimal because every character must be examined at least once. Space remains constant because no extra arrays or hash maps are needed.
Test Cases
# test cases
assert Solution().minCost("aabaac", [1,2,3,4,1,10]) == 11 # example 1
assert Solution().minCost("abc", [10,5,8]) == 13 # example 2
assert Solution().minCost("zzzzz", [67,67,67,67,67]) == 0 # example 3
assert Solution().minCost("a", [100]) == 0 # single character, no deletion needed
assert Solution().minCost("aa", [1,2]) == 1 # two identical, delete smaller cost
assert Solution().minCost("ab", [5,10]) == 5 # two different, delete smaller cost
assert Solution().minCost("aaabbb", [1,2,3,4,5,6]) == 12 # multiple groups, sum of deletions
| Test | Why |
|---|---|
"aabaac" |
Consecutive duplicates with mixed costs |
"abc" |
All unique characters, must delete to leave one |
"zzzzz" |
Already uniform, minimal cost = 0 |
"a" |
Single character, edge case |
"aa" |
Two identical characters, test keeping max cost |
"ab" |
Two different characters, test basic deletion |
"aaabbb" |
Multiple consecutive groups, sum deletions for each |
Edge Cases
- Single character string:
"a"with cost[100]. The algorithm must return0because there is nothing to delete. This tests handling minimal input sizes and prevents indexing errors. - All characters identical with varying costs:
"aaaa"with cost[1,100,1,1]. The algorithm must correctly identify the maximum cost in the group (100) and delete the rest, yielding a total cost of3. - Alternating characters:
"ababab"with varying costs[1,2,3,4,5,6]. Each character is a consecutive group of size 1. The algorithm ensures that for each unique character sequence, it keeps the highest cost in that sequence (in this case, all single characters), and total deletion costs are calculated correctly. This tests the correctness when no consecutive duplicates exist.
This implementation handles all edge cases by processing groups of consecutive characters, using the maximum deletion cost strategy, and iterating in a single pass with constant extra space.