LeetCode 3445 - Maximum Difference Between Even and Odd Frequency II
We are given a string s consisting only of digits '0' through '4', and an integer k. For any substring subs whose length is at least k, we may choose two characters: - Character a must appear an odd number of times in the substring.
Difficulty: 🔴 Hard
Topics: String, Sliding Window, Enumeration, Prefix Sum
Solution
LeetCode 3445 - Maximum Difference Between Even and Odd Frequency II
Problem Understanding
We are given a string s consisting only of digits '0' through '4', and an integer k.
For any substring subs whose length is at least k, we may choose two characters:
- Character
amust appear an odd number of times in the substring. - Character
bmust appear a non-zero even number of times in the substring.
For that substring, the value contributed is:
$$freq[a] - freq[b]$$
Our goal is to find the maximum possible value over all valid substrings and all valid choices of a and b.
A crucial detail is that the substring may contain more than two distinct characters. We only care about the frequencies of the selected characters a and b.
What the Input Represents
The string length can be as large as:
$$3 \times 10^4$$
Since there are only five possible characters (0 to 4), the alphabet size is extremely small.
The output is a single integer representing the largest achievable difference between:
- an odd frequency count of one digit, and
- a non-zero even frequency count of another digit.
Constraints and Their Implications
The length limit of 30000 immediately rules out any algorithm that examines all substrings explicitly.
The number of possible substrings is:
$$O(n^2)$$
which is approximately 9 × 10^8 when n = 30000.
That is far too large.
The small alphabet size is the key observation. Since there are only five digits, there are only:
$$5 \times 4 = 20$$
ordered pairs (a, b) with a != b.
This suggests enumerating digit pairs and solving an optimized subproblem for each pair.
Important Edge Cases
A substring may contain zero occurrences of b. Such a substring is invalid because b must have a non-zero even frequency.
The optimal answer may be negative. Example 1 demonstrates this.
The entire string may be the only valid substring satisfying the parity requirements.
Many substrings can satisfy the length requirement but fail the parity requirements.
Since the alphabet size is only five, frequencies can be tracked efficiently using prefix counts.
The problem guarantees that at least one substring contains both an odd frequency character and an even frequency character, so a valid answer always exists.
Approaches
Brute Force
The most direct solution is to enumerate every substring.
For each substring:
- Compute frequencies of all digits.
- Check every ordered pair
(a, b). - Verify:
freq[a]is odd,freq[b]is positive and even.
- Update the answer with
freq[a] - freq[b].
Using prefix frequencies, each substring can be evaluated in constant time for frequencies, but there are still O(n²) substrings.
Therefore the total complexity remains:
$$O(n^2)$$
which is far too slow for n = 30000.
Key Insight
Suppose we fix an ordered pair (a, b).
For a substring [l, r], define:
$$A = freq_a(l,r)$$
$$B = freq_b(l,r)$$
We want:
$$A - B$$
with:
AoddBeven and positive
Using prefix counts:
$$A = preA[r] - preA[l]$$
$$B = preB[r] - preB[l]$$
Therefore:
$$A - B = (preA[r]-preB[r]) - (preA[l]-preB[l])$$
This resembles a maximum subarray style optimization.
The parity constraints depend only on:
$$preA[r] \bmod 2$$
$$preB[r] \bmod 2$$
and
$$preA[l] \bmod 2$$
$$preB[l] \bmod 2$$
because subtraction parity is determined entirely by prefix parities.
For each right endpoint, we need the smallest previously valid value of:
$$preA[l]-preB[l]$$
among prefixes having the required parity pattern.
A sliding window is used to enforce:
- substring length at least
k freq_b > 0
This converts the problem into maintaining minimum prefix values for four parity states.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Enumerates all substrings |
| Optimal | O(20n) = O(n) | O(n) | Enumerate digit pairs and maintain parity minima |
Algorithm Walkthrough
For each ordered pair of distinct digits (a, b):
- Build prefix counts for digit
aand digitb. - Let:
$$diff[i] = preA[i] - preB[i]$$
for every prefix position. 3. Maintain four minimum values:
best[pa][pb]
where:
pais parity ofpreApbis parity ofpreB
Each entry stores the smallest diff seen among eligible left boundaries.
4. Sweep the string from left to right using position r.
5. Before processing position r, add all left boundaries that satisfy:
- substring length at least
k - substring contains at least one
b
The second condition can be rewritten as:
$$preB[r] - preB[l] > 0$$
which is equivalent to:
$$preB[l] < preB[r]$$ 6. Every eligible left boundary contributes its parity state:
$$(preA[l]\bmod2,; preB[l]\bmod2)$$
and updates the corresponding minimum diff.
7. For the current right endpoint, determine the parity required for the left boundary:
Since freq_a must be odd,
$$(preA[r]-preA[l])\bmod2 = 1$$
Therefore:
$$parity(preA[l]) = parity(preA[r]) \oplus 1$$
Since freq_b must be even,
$$parity(preB[l]) = parity(preB[r])$$
8. Query the corresponding parity state from best.
9. If a valid minimum exists, compute:
$$diff[r] - best[state]$$
and update the answer. 10. Repeat for all 20 ordered digit pairs.
Why it Works
For a fixed pair (a,b), every valid substring can be represented by two prefix positions. The value of the substring depends only on the difference between two prefix expressions:
$$(preA-preB)$$
The parity constraints translate into parity requirements on the chosen prefix pair. By maintaining the minimum eligible prefix value for each parity combination, we always obtain the largest possible difference for every right endpoint. Enumerating all ordered digit pairs guarantees that every possible choice of (a,b) is considered, so the global maximum is found.
Python Solution
from typing import List
class Solution:
def maxDifference(self, s: str, k: int) -> int:
n = len(s)
ans = -10**9
digits = "01234"
for a in digits:
for b in digits:
if a == b:
continue
pre_a = [0] * (n + 1)
pre_b = [0] * (n + 1)
for i, ch in enumerate(s):
pre_a[i + 1] = pre_a[i] + (ch == a)
pre_b[i + 1] = pre_b[i] + (ch == b)
best = [[float("inf")] * 2 for _ in range(2)]
left = 0
for r in range(k, n + 1):
while left <= r - k and pre_b[left] < pre_b[r]:
pa = pre_a[left] & 1
pb = pre_b[left] & 1
best[pa][pb] = min(
best[pa][pb],
pre_a[left] - pre_b[left]
)
left += 1
need_pa = (pre_a[r] & 1) ^ 1
need_pb = pre_b[r] & 1
candidate = best[need_pa][need_pb]
if candidate != float("inf"):
ans = max(
ans,
(pre_a[r] - pre_b[r]) - candidate
)
return ans
Implementation Explanation
The outer two loops enumerate every ordered digit pair (a, b). Since there are only five possible digits, this results in only twenty pairs.
For each pair, prefix counts are built. pre_a[i] stores how many times digit a appears in the first i characters, and similarly for pre_b.
The best array stores the minimum value of:
pre_a[l] - pre_b[l]
for each parity combination of (pre_a[l] % 2, pre_b[l] % 2).
As the right endpoint advances, the while loop inserts newly eligible left boundaries. Eligibility requires both:
- length at least
k - at least one occurrence of digit
b
The parity conditions determine which state we must query. If a valid state exists, the resulting substring value is computed and used to update the global answer.
Go Solution
func maxDifference(s string, k int) int {
n := len(s)
ans := -1000000000
digits := "01234"
for _, a := range digits {
for _, b := range digits {
if a == b {
continue
}
preA := make([]int, n+1)
preB := make([]int, n+1)
for i := 0; i < n; i++ {
preA[i+1] = preA[i]
preB[i+1] = preB[i]
if rune(s[i]) == a {
preA[i+1]++
}
if rune(s[i]) == b {
preB[i+1]++
}
}
const INF = int(1e9)
best := [2][2]int{
{INF, INF},
{INF, INF},
}
left := 0
for r := k; r <= n; r++ {
for left <= r-k && preB[left] < preB[r] {
pa := preA[left] & 1
pb := preB[left] & 1
diff := preA[left] - preB[left]
if diff < best[pa][pb] {
best[pa][pb] = diff
}
left++
}
needPA := (preA[r] & 1) ^ 1
needPB := preB[r] & 1
candidate := best[needPA][needPB]
if candidate != INF {
value := (preA[r] - preB[r]) - candidate
if value > ans {
ans = value
}
}
}
}
}
return ans
}
Go-Specific Notes
The logic is identical to the Python implementation.
A fixed-size best[2][2] array is used instead of a nested list. The sentinel value INF represents an uninitialized state. Integer overflow is not a concern because frequencies never exceed 30000.
Worked Examples
Example 1
Input:
s = "12233"
k = 4
Consider pair:
a = '1'
b = '3'
Prefix counts:
| i | preA | preB |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 4 | 1 | 1 |
| 5 | 1 | 2 |
At r = 5:
freq('1') = 1
freq('3') = 2
Parity requirements:
1 is odd
2 is even and non-zero
Difference:
1 - 2 = -1
No larger valid value exists.
Answer:
-1
Example 2
Input:
s = "1122211"
k = 3
Optimal substring:
"11222"
Frequencies:
| Digit | Count |
|---|---|
| 1 | 2 |
| 2 | 3 |
Choose:
a = '2'
b = '1'
Then:
3 - 2 = 1
Answer:
1
Example 3
Input:
s = "110"
k = 3
Only substring:
"110"
Frequencies:
| Digit | Count |
|---|---|
| 1 | 2 |
| 0 | 1 |
Choose:
a = '0'
b = '1'
Difference:
1 - 2 = -1
Answer:
-1
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(20n) = O(n) | Twenty ordered digit pairs, linear scan for each |
| Space | O(n) | Prefix arrays for one digit pair |
The alphabet contains only five symbols, so the number of ordered pairs is fixed at twenty. For each pair, every index enters and leaves the sliding structure at most once, producing a linear scan. Therefore the total running time is linear in the length of the string.
Test Cases
sol = Solution()
assert sol.maxDifference("12233", 4) == -1 # example 1
assert sol.maxDifference("1122211", 3) == 1 # example 2
assert sol.maxDifference("110", 3) == -1 # example 3
assert sol.maxDifference("000", 1) == -1 # smallest repeated digit case
assert sol.maxDifference("0011", 4) == -1 # entire string only
assert sol.maxDifference("22211", 3) == 1 # odd 2s, even 1s
assert sol.maxDifference("11122", 3) == 1 # symmetric variant
assert sol.maxDifference("01234", 3) >= -1 # all digits distinct
assert sol.maxDifference("2222200", 3) == 3 # large positive difference
assert sol.maxDifference("444440000", 3) == 3 # another positive case
assert sol.maxDifference("1212121212", 2) >= 0 # alternating pattern
assert sol.maxDifference("000111222333444", 5) >= -1 # larger mixed input
Test Summary
| Test | Why |
|---|---|
"12233", 4 |
Official example 1 |
"1122211", 3 |
Official example 2 |
"110", 3 |
Official example 3 |
"000", 1 |
Single digit repeated |
"0011", 4 |
Whole string must be used |
"22211", 3 |
Positive answer with odd/even frequencies |
"11122", 3 |
Reversed frequency distribution |
"01234", 3 |
All digits unique |
"2222200", 3 |
Large positive difference |
"444440000", 3 |
Multiple repeated digits |
"1212121212", 2 |
Alternating pattern |
"000111222333444", 5 |
Larger mixed-frequency input |
Edge Cases
Substrings With No Occurrence of b
A substring where freq[b] = 0 is invalid even though zero is mathematically even. This is a common source of mistakes. The implementation explicitly enforces:
$$freq_b > 0$$
through the condition:
pre_b[left] < pre_b[r]
which guarantees at least one occurrence of b inside the substring.
Negative Answers
The maximum valid difference is not guaranteed to be positive. Example 1 and Example 3 both produce -1. An implementation that initializes the answer to 0 would incorrectly return a non-existent positive result. The solution initializes the answer to a very small value so negative answers are preserved.
Entire String Is the Only Valid Substring
When k = n, only the whole string can be considered. Some algorithms accidentally assume multiple windows exist or fail when the sliding structure receives very few candidates. The prefix-based formulation naturally handles this case because the scan still evaluates the single valid endpoint.
Parity Matching
A very subtle bug is using the wrong parity relationship. For freq[a] to be odd, the parities of the two prefixes must differ. For freq[b] to be even, the parities must match. The implementation derives the required left parity directly from the current right parity:
need_pa = parity(preA[r]) ^ 1
need_pb = parity(preB[r])
which guarantees that every accepted substring satisfies the required parity conditions.