LeetCode 3144 - Minimum Substring Partition of Equal Character Frequency

In this problem, we are given a lowercase English string s, and we must split it into one or more contiguous substrings such that every substring is balanced. A substring is considered balanced when every distinct character inside it appears the same number of times.

LeetCode Problem 3144

Difficulty: 🟡 Medium
Topics: Hash Table, String, Dynamic Programming, Counting

Solution

Problem Understanding

In this problem, we are given a lowercase English string s, and we must split it into one or more contiguous substrings such that every substring is balanced.

A substring is considered balanced when every distinct character inside it appears the same number of times. The actual frequency does not matter, only that all characters occurring in the substring share an identical count.

For example:

  • "abab" is balanced because both 'a' and 'b' appear 2 times.
  • "abc" is balanced because all three characters appear once.
  • "ccdd" is balanced because 'c' and 'd' each appear twice.
  • "abcc" is not balanced because frequencies are {a:1, b:1, c:2}.

The task is to return the minimum number of balanced substrings needed to partition the entire string.

The input size is relatively small:

  • 1 <= s.length <= 1000
  • Only lowercase English letters appear

The length limit of 1000 is extremely important. It tells us that:

  • Exponential brute force partitioning is too slow.
  • Quadratic or cubic dynamic programming solutions are feasible.
  • Since there are only 26 lowercase letters, frequency counting operations can often be treated as constant time.

Several edge cases are important:

  • A single character is always balanced.
  • The entire string itself may already be balanced.
  • Some strings may require every character to stand alone.
  • Repeated frequency updates must be handled carefully while expanding substrings.
  • We must avoid recomputing character counts repeatedly for every substring.

Approaches

Brute Force Approach

A naive solution would try every possible partition of the string.

At each position, we could:

  1. Choose every possible ending index.
  2. Check whether the substring is balanced.
  3. Recursively solve the remaining suffix.
  4. Take the minimum number of partitions.

This works because every valid partition configuration is explored.

However, the number of possible partitions grows exponentially. A string of length n has 2^(n-1) possible partition patterns. Even though checking whether a substring is balanced only takes limited work, the overall complexity becomes far too large for n = 1000.

Key Insight

The important observation is that this is naturally a dynamic programming problem.

Suppose we define:

  • dp[i] = minimum number of balanced substrings needed to partition s[i:]

Then:

  • For every ending position j >= i
  • If s[i:j+1] is balanced
  • Then we can transition:
dp[i] = min(dp[i], 1 + dp[j+1])

The remaining challenge is efficiently determining whether a substring is balanced.

A substring is balanced if:

  • Every character that appears has the same frequency.

Since there are only 26 lowercase letters, we can incrementally maintain frequencies while extending the substring from i to j.

For each extension:

  • Update the frequency of the new character.
  • Track the maximum frequency.
  • Track how many distinct characters exist.

If:

max_frequency * distinct_characters == substring_length

then every appearing character must have identical frequency.

This gives us an efficient O(n^2) solution.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n * n) O(n) Explores all partition combinations recursively
Optimal Dynamic Programming O(n^2) O(n) Uses DP with incremental frequency counting

Algorithm Walkthrough

  1. Define a DP array dp of size n + 1.

dp[i] represents the minimum number of balanced substrings needed to partition the suffix starting at index i. 2. Initialize the base case.

When we reach the end of the string, no partitions are needed:

dp[n] = 0
  1. Process indices from right to left.

We compute answers for smaller suffixes first so later states are already available. 4. For each starting index i, create a frequency array of size 26.

This array tracks character frequencies while expanding the substring. 5. Expand the substring one character at a time.

For every ending index j from i to n-1:

  • Add s[j] to the frequency array.

  • Update:

  • max_freq

  • distinct_chars

  1. Check whether the current substring is balanced.

Let:

  • length = j - i + 1

The substring is balanced if:

max_freq * distinct_chars == length

Why does this work?

  • Suppose the largest frequency is k
  • If there are d distinct characters
  • And total length equals k * d
  • Then every appearing character must have frequency exactly k
  1. If the substring is balanced, update the DP state.
dp[i] = min(dp[i], 1 + dp[j+1])
  1. After processing all substrings starting at i, continue to the previous index.
  2. Return dp[0].

This represents the minimum partitions for the entire string.

Why it works

The algorithm works because dynamic programming guarantees that every suffix is solved optimally before it is reused.

For each position i, we examine every possible balanced substring starting there. Every valid partition must begin with one of these substrings, so taking the minimum over all choices guarantees the optimal answer.

The balance condition is correct because if the substring length equals:

max_frequency * number_of_distinct_characters

then every appearing character must share the same frequency.

Python Solution

class Solution:
    def minimumSubstringsInPartition(self, s: str) -> int:
        n = len(s)

        dp = [float("inf")] * (n + 1)
        dp[n] = 0

        for i in range(n - 1, -1, -1):
            freq = [0] * 26

            max_freq = 0
            distinct = 0

            for j in range(i, n):
                idx = ord(s[j]) - ord('a')

                if freq[idx] == 0:
                    distinct += 1

                freq[idx] += 1
                max_freq = max(max_freq, freq[idx])

                length = j - i + 1

                if max_freq * distinct == length:
                    dp[i] = min(dp[i], 1 + dp[j + 1])

        return dp[0]

The solution starts by creating a DP array where dp[i] stores the minimum partitions needed for the suffix beginning at index i.

We iterate backward because each state depends on future states such as dp[j+1].

For every starting index, we dynamically grow substrings using the inner loop. Instead of recomputing frequencies from scratch for every substring, we incrementally update a fixed-size frequency array.

The variables max_freq and distinct allow us to test the balanced condition in constant time.

Whenever a balanced substring is found, we update the current DP state using the optimal answer already computed for the remaining suffix.

Finally, dp[0] contains the answer for the full string.

Go Solution

func minimumSubstringsInPartition(s string) int {
	n := len(s)

	dp := make([]int, n+1)

	const INF = int(1e9)

	for i := 0; i < n; i++ {
		dp[i] = INF
	}

	dp[n] = 0

	for i := n - 1; i >= 0; i-- {
		freq := make([]int, 26)

		maxFreq := 0
		distinct := 0

		for j := i; j < n; j++ {
			idx := int(s[j] - 'a')

			if freq[idx] == 0 {
				distinct++
			}

			freq[idx]++

			if freq[idx] > maxFreq {
				maxFreq = freq[idx]
			}

			length := j - i + 1

			if maxFreq*distinct == length {
				if 1+dp[j+1] < dp[i] {
					dp[i] = 1 + dp[j+1]
				}
			}
		}
	}

	return dp[0]
}

The Go implementation follows the same logic as the Python solution.

A large constant INF is used instead of Python's float("inf").

The frequency array is implemented using a slice of length 26. Since Go strings are byte indexed and the input only contains lowercase English letters, direct indexing with s[j] - 'a' is safe and efficient.

No special handling for integer overflow is needed because the maximum answer is at most 1000.

Worked Examples

Example 1

Input:

s = "fabccddg"

We compute DP from right to left.

Processing index 7

Substring exploration:

Substring Frequencies Balanced dp Update
"g" {g:1} Yes dp[7] = 1

Processing index 6

Substring Frequencies Balanced
"d" {d:1} Yes
"dg" {d:1,g:1} Yes

Best result:

dp[6] = 1

because "dg" is balanced.

Processing index 4

Substring Frequencies Balanced
"c" {c:1} Yes
"cc" {c:2} Yes
"ccd" {c:2,d:1} No
"ccdd" {c:2,d:2} Yes

The partition:

"ccdd" + "g"

gives:

1 + dp[8] = 2

Final Result

Optimal partition:

("fabc", "cd", "dg")

or

("fab", "ccdd", "g")

Answer:

3

Example 2

Input:

s = "abababaccddb"

Consider the partition:

("abab", "abaccddb")

First substring

"abab"

Frequencies:

Character Count
a 2
b 2

Balanced.

Second substring

"abaccddb"

Frequencies:

Character Count
a 2
b 2
c 2
d 2

Balanced.

Thus:

2 substrings

which is optimal.

Complexity Analysis

Measure Complexity Explanation
Time O(n^2) Every starting index expands through all ending indices
Space O(n) DP array dominates extra memory usage

The outer loop runs n times, and the inner loop also runs at most n times, producing O(n^2) total substring expansions.

The frequency array has constant size 26, so it does not affect asymptotic complexity.

The DP array requires O(n) memory.

Test Cases

sol = Solution()

assert sol.minimumSubstringsInPartition("fabccddg") == 3
# Example from problem statement

assert sol.minimumSubstringsInPartition("abababaccddb") == 2
# Example from problem statement

assert sol.minimumSubstringsInPartition("a") == 1
# Single character

assert sol.minimumSubstringsInPartition("aaaa") == 1
# Entire string balanced with one repeated character

assert sol.minimumSubstringsInPartition("abab") == 1
# Two characters with equal frequency

assert sol.minimumSubstringsInPartition("abc") == 1
# All distinct characters

assert sol.minimumSubstringsInPartition("abcc") == 2
# Needs partition because frequencies differ

assert sol.minimumSubstringsInPartition("abcd") == 1
# Every character appears once

assert sol.minimumSubstringsInPartition("aaabbbccc") == 1
# Multiple characters with same frequency

assert sol.minimumSubstringsInPartition("aabbbc") == 2
# Requires splitting

assert sol.minimumSubstringsInPartition("zzzzzz") == 1
# One repeated character only

assert sol.minimumSubstringsInPartition("abcabcabc") == 1
# Uniform repeated pattern

assert sol.minimumSubstringsInPartition("aabbccd") == 2
# One extra character forces partition

assert sol.minimumSubstringsInPartition("abcdefghijklmnopqrstuvwxyz") == 1
# Maximum distinct letters

assert sol.minimumSubstringsInPartition("abacbc") == 1
# Frequencies all equal at end

Test Summary

Test Why
"fabccddg" Validates provided example
"abababaccddb" Validates larger balanced grouping
"a" Smallest possible input
"aaaa" Single repeated character
"abab" Equal frequency two-character case
"abc" All distinct characters
"abcc" Requires partitioning
"abcd" Distinct characters balanced
"aaabbbccc" Larger equal-frequency case
"aabbbc" Mixed frequencies
"zzzzzz" One-character alphabet
"abcabcabc" Repeating balanced structure
"aabbccd" Near-balanced input
"abcdefghijklmnopqrstuvwxyz" Maximum distinct letters
"abacbc" Frequencies equal after reordering

Edge Cases

One important edge case is a string containing only one character, such as "a". A naive implementation might incorrectly assume at least two characters are needed for balance. In reality, any single-character substring is balanced because all appearing characters have equal frequency. The algorithm handles this naturally because the substring length is 1, distinct = 1, and max_freq = 1.

Another important edge case is when the entire string is already balanced, such as "aaabbbccc". Some implementations may unnecessarily split the string into smaller pieces if they greedily partition early. The dynamic programming approach avoids this issue because it evaluates every possible balanced substring and chooses the minimum number of partitions globally.

A third important edge case occurs when frequencies become uneven late in the substring, such as "abcc". The prefix "abc" is balanced, but extending to "abcc" breaks the condition. Incremental frequency updates must therefore correctly reevaluate balance after every character addition. The implementation handles this by recalculating the condition:

max_freq * distinct == length

after every extension.

A fourth subtle edge case is strings with only one distinct character, such as "zzzzzz". Every prefix is balanced because there is only one frequency involved. The implementation correctly recognizes all such substrings as balanced since:

max_freq = substring_length
distinct = 1

which always satisfies the condition.