LeetCode 3144 - Minimum Substring Partition of Equal Character Frequency
In this problem, we are given a lowercase English string s, and we must split it into one or more contiguous substrings such that every substring is balanced. A substring is considered balanced when every distinct character inside it appears the same number of times.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Dynamic Programming, Counting
Solution
Problem Understanding
In this problem, we are given a lowercase English string s, and we must split it into one or more contiguous substrings such that every substring is balanced.
A substring is considered balanced when every distinct character inside it appears the same number of times. The actual frequency does not matter, only that all characters occurring in the substring share an identical count.
For example:
"abab"is balanced because both'a'and'b'appear 2 times."abc"is balanced because all three characters appear once."ccdd"is balanced because'c'and'd'each appear twice."abcc"is not balanced because frequencies are{a:1, b:1, c:2}.
The task is to return the minimum number of balanced substrings needed to partition the entire string.
The input size is relatively small:
1 <= s.length <= 1000- Only lowercase English letters appear
The length limit of 1000 is extremely important. It tells us that:
- Exponential brute force partitioning is too slow.
- Quadratic or cubic dynamic programming solutions are feasible.
- Since there are only 26 lowercase letters, frequency counting operations can often be treated as constant time.
Several edge cases are important:
- A single character is always balanced.
- The entire string itself may already be balanced.
- Some strings may require every character to stand alone.
- Repeated frequency updates must be handled carefully while expanding substrings.
- We must avoid recomputing character counts repeatedly for every substring.
Approaches
Brute Force Approach
A naive solution would try every possible partition of the string.
At each position, we could:
- Choose every possible ending index.
- Check whether the substring is balanced.
- Recursively solve the remaining suffix.
- Take the minimum number of partitions.
This works because every valid partition configuration is explored.
However, the number of possible partitions grows exponentially. A string of length n has 2^(n-1) possible partition patterns. Even though checking whether a substring is balanced only takes limited work, the overall complexity becomes far too large for n = 1000.
Key Insight
The important observation is that this is naturally a dynamic programming problem.
Suppose we define:
dp[i]= minimum number of balanced substrings needed to partitions[i:]
Then:
- For every ending position
j >= i - If
s[i:j+1]is balanced - Then we can transition:
dp[i] = min(dp[i], 1 + dp[j+1])
The remaining challenge is efficiently determining whether a substring is balanced.
A substring is balanced if:
- Every character that appears has the same frequency.
Since there are only 26 lowercase letters, we can incrementally maintain frequencies while extending the substring from i to j.
For each extension:
- Update the frequency of the new character.
- Track the maximum frequency.
- Track how many distinct characters exist.
If:
max_frequency * distinct_characters == substring_length
then every appearing character must have identical frequency.
This gives us an efficient O(n^2) solution.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n * n) | O(n) | Explores all partition combinations recursively |
| Optimal Dynamic Programming | O(n^2) | O(n) | Uses DP with incremental frequency counting |
Algorithm Walkthrough
- Define a DP array
dpof sizen + 1.
dp[i] represents the minimum number of balanced substrings needed to partition the suffix starting at index i.
2. Initialize the base case.
When we reach the end of the string, no partitions are needed:
dp[n] = 0
- Process indices from right to left.
We compute answers for smaller suffixes first so later states are already available.
4. For each starting index i, create a frequency array of size 26.
This array tracks character frequencies while expanding the substring. 5. Expand the substring one character at a time.
For every ending index j from i to n-1:
-
Add
s[j]to the frequency array. -
Update:
-
max_freq -
distinct_chars
- Check whether the current substring is balanced.
Let:
length = j - i + 1
The substring is balanced if:
max_freq * distinct_chars == length
Why does this work?
- Suppose the largest frequency is
k - If there are
ddistinct characters - And total length equals
k * d - Then every appearing character must have frequency exactly
k
- If the substring is balanced, update the DP state.
dp[i] = min(dp[i], 1 + dp[j+1])
- After processing all substrings starting at
i, continue to the previous index. - Return
dp[0].
This represents the minimum partitions for the entire string.
Why it works
The algorithm works because dynamic programming guarantees that every suffix is solved optimally before it is reused.
For each position i, we examine every possible balanced substring starting there. Every valid partition must begin with one of these substrings, so taking the minimum over all choices guarantees the optimal answer.
The balance condition is correct because if the substring length equals:
max_frequency * number_of_distinct_characters
then every appearing character must share the same frequency.
Python Solution
class Solution:
def minimumSubstringsInPartition(self, s: str) -> int:
n = len(s)
dp = [float("inf")] * (n + 1)
dp[n] = 0
for i in range(n - 1, -1, -1):
freq = [0] * 26
max_freq = 0
distinct = 0
for j in range(i, n):
idx = ord(s[j]) - ord('a')
if freq[idx] == 0:
distinct += 1
freq[idx] += 1
max_freq = max(max_freq, freq[idx])
length = j - i + 1
if max_freq * distinct == length:
dp[i] = min(dp[i], 1 + dp[j + 1])
return dp[0]
The solution starts by creating a DP array where dp[i] stores the minimum partitions needed for the suffix beginning at index i.
We iterate backward because each state depends on future states such as dp[j+1].
For every starting index, we dynamically grow substrings using the inner loop. Instead of recomputing frequencies from scratch for every substring, we incrementally update a fixed-size frequency array.
The variables max_freq and distinct allow us to test the balanced condition in constant time.
Whenever a balanced substring is found, we update the current DP state using the optimal answer already computed for the remaining suffix.
Finally, dp[0] contains the answer for the full string.
Go Solution
func minimumSubstringsInPartition(s string) int {
n := len(s)
dp := make([]int, n+1)
const INF = int(1e9)
for i := 0; i < n; i++ {
dp[i] = INF
}
dp[n] = 0
for i := n - 1; i >= 0; i-- {
freq := make([]int, 26)
maxFreq := 0
distinct := 0
for j := i; j < n; j++ {
idx := int(s[j] - 'a')
if freq[idx] == 0 {
distinct++
}
freq[idx]++
if freq[idx] > maxFreq {
maxFreq = freq[idx]
}
length := j - i + 1
if maxFreq*distinct == length {
if 1+dp[j+1] < dp[i] {
dp[i] = 1 + dp[j+1]
}
}
}
}
return dp[0]
}
The Go implementation follows the same logic as the Python solution.
A large constant INF is used instead of Python's float("inf").
The frequency array is implemented using a slice of length 26. Since Go strings are byte indexed and the input only contains lowercase English letters, direct indexing with s[j] - 'a' is safe and efficient.
No special handling for integer overflow is needed because the maximum answer is at most 1000.
Worked Examples
Example 1
Input:
s = "fabccddg"
We compute DP from right to left.
Processing index 7
Substring exploration:
| Substring | Frequencies | Balanced | dp Update |
|---|---|---|---|
"g" |
{g:1} |
Yes | dp[7] = 1 |
Processing index 6
| Substring | Frequencies | Balanced |
|---|---|---|
"d" |
{d:1} |
Yes |
"dg" |
{d:1,g:1} |
Yes |
Best result:
dp[6] = 1
because "dg" is balanced.
Processing index 4
| Substring | Frequencies | Balanced |
|---|---|---|
"c" |
{c:1} |
Yes |
"cc" |
{c:2} |
Yes |
"ccd" |
{c:2,d:1} |
No |
"ccdd" |
{c:2,d:2} |
Yes |
The partition:
"ccdd" + "g"
gives:
1 + dp[8] = 2
Final Result
Optimal partition:
("fabc", "cd", "dg")
or
("fab", "ccdd", "g")
Answer:
3
Example 2
Input:
s = "abababaccddb"
Consider the partition:
("abab", "abaccddb")
First substring
"abab"
Frequencies:
| Character | Count |
|---|---|
| a | 2 |
| b | 2 |
Balanced.
Second substring
"abaccddb"
Frequencies:
| Character | Count |
|---|---|
| a | 2 |
| b | 2 |
| c | 2 |
| d | 2 |
Balanced.
Thus:
2 substrings
which is optimal.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n^2) | Every starting index expands through all ending indices |
| Space | O(n) | DP array dominates extra memory usage |
The outer loop runs n times, and the inner loop also runs at most n times, producing O(n^2) total substring expansions.
The frequency array has constant size 26, so it does not affect asymptotic complexity.
The DP array requires O(n) memory.
Test Cases
sol = Solution()
assert sol.minimumSubstringsInPartition("fabccddg") == 3
# Example from problem statement
assert sol.minimumSubstringsInPartition("abababaccddb") == 2
# Example from problem statement
assert sol.minimumSubstringsInPartition("a") == 1
# Single character
assert sol.minimumSubstringsInPartition("aaaa") == 1
# Entire string balanced with one repeated character
assert sol.minimumSubstringsInPartition("abab") == 1
# Two characters with equal frequency
assert sol.minimumSubstringsInPartition("abc") == 1
# All distinct characters
assert sol.minimumSubstringsInPartition("abcc") == 2
# Needs partition because frequencies differ
assert sol.minimumSubstringsInPartition("abcd") == 1
# Every character appears once
assert sol.minimumSubstringsInPartition("aaabbbccc") == 1
# Multiple characters with same frequency
assert sol.minimumSubstringsInPartition("aabbbc") == 2
# Requires splitting
assert sol.minimumSubstringsInPartition("zzzzzz") == 1
# One repeated character only
assert sol.minimumSubstringsInPartition("abcabcabc") == 1
# Uniform repeated pattern
assert sol.minimumSubstringsInPartition("aabbccd") == 2
# One extra character forces partition
assert sol.minimumSubstringsInPartition("abcdefghijklmnopqrstuvwxyz") == 1
# Maximum distinct letters
assert sol.minimumSubstringsInPartition("abacbc") == 1
# Frequencies all equal at end
Test Summary
| Test | Why |
|---|---|
"fabccddg" |
Validates provided example |
"abababaccddb" |
Validates larger balanced grouping |
"a" |
Smallest possible input |
"aaaa" |
Single repeated character |
"abab" |
Equal frequency two-character case |
"abc" |
All distinct characters |
"abcc" |
Requires partitioning |
"abcd" |
Distinct characters balanced |
"aaabbbccc" |
Larger equal-frequency case |
"aabbbc" |
Mixed frequencies |
"zzzzzz" |
One-character alphabet |
"abcabcabc" |
Repeating balanced structure |
"aabbccd" |
Near-balanced input |
"abcdefghijklmnopqrstuvwxyz" |
Maximum distinct letters |
"abacbc" |
Frequencies equal after reordering |
Edge Cases
One important edge case is a string containing only one character, such as "a". A naive implementation might incorrectly assume at least two characters are needed for balance. In reality, any single-character substring is balanced because all appearing characters have equal frequency. The algorithm handles this naturally because the substring length is 1, distinct = 1, and max_freq = 1.
Another important edge case is when the entire string is already balanced, such as "aaabbbccc". Some implementations may unnecessarily split the string into smaller pieces if they greedily partition early. The dynamic programming approach avoids this issue because it evaluates every possible balanced substring and chooses the minimum number of partitions globally.
A third important edge case occurs when frequencies become uneven late in the substring, such as "abcc". The prefix "abc" is balanced, but extending to "abcc" breaks the condition. Incremental frequency updates must therefore correctly reevaluate balance after every character addition. The implementation handles this by recalculating the condition:
max_freq * distinct == length
after every extension.
A fourth subtle edge case is strings with only one distinct character, such as "zzzzzz". Every prefix is balanced because there is only one frequency involved. The implementation correctly recognizes all such substrings as balanced since:
max_freq = substring_length
distinct = 1
which always satisfies the condition.