LeetCode 3445 - Maximum Difference Between Even and Odd Frequency II

We are given a string s consisting only of digits '0' through '4', and an integer k. For any substring subs whose length is at least k, we may choose two characters: - Character a must appear an odd number of times in the substring.

LeetCode Problem 3445

Difficulty: 🔴 Hard
Topics: String, Sliding Window, Enumeration, Prefix Sum

Solution

LeetCode 3445 - Maximum Difference Between Even and Odd Frequency II

Problem Understanding

We are given a string s consisting only of digits '0' through '4', and an integer k.

For any substring subs whose length is at least k, we may choose two characters:

  • Character a must appear an odd number of times in the substring.
  • Character b must appear a non-zero even number of times in the substring.

For that substring, the value contributed is:

$$freq[a] - freq[b]$$

Our goal is to find the maximum possible value over all valid substrings and all valid choices of a and b.

A crucial detail is that the substring may contain more than two distinct characters. We only care about the frequencies of the selected characters a and b.

What the Input Represents

The string length can be as large as:

$$3 \times 10^4$$

Since there are only five possible characters (0 to 4), the alphabet size is extremely small.

The output is a single integer representing the largest achievable difference between:

  • an odd frequency count of one digit, and
  • a non-zero even frequency count of another digit.

Constraints and Their Implications

The length limit of 30000 immediately rules out any algorithm that examines all substrings explicitly.

The number of possible substrings is:

$$O(n^2)$$

which is approximately 9 × 10^8 when n = 30000.

That is far too large.

The small alphabet size is the key observation. Since there are only five digits, there are only:

$$5 \times 4 = 20$$

ordered pairs (a, b) with a != b.

This suggests enumerating digit pairs and solving an optimized subproblem for each pair.

Important Edge Cases

A substring may contain zero occurrences of b. Such a substring is invalid because b must have a non-zero even frequency.

The optimal answer may be negative. Example 1 demonstrates this.

The entire string may be the only valid substring satisfying the parity requirements.

Many substrings can satisfy the length requirement but fail the parity requirements.

Since the alphabet size is only five, frequencies can be tracked efficiently using prefix counts.

The problem guarantees that at least one substring contains both an odd frequency character and an even frequency character, so a valid answer always exists.

Approaches

Brute Force

The most direct solution is to enumerate every substring.

For each substring:

  1. Compute frequencies of all digits.
  2. Check every ordered pair (a, b).
  3. Verify:
  • freq[a] is odd,
  • freq[b] is positive and even.
  1. Update the answer with freq[a] - freq[b].

Using prefix frequencies, each substring can be evaluated in constant time for frequencies, but there are still O(n²) substrings.

Therefore the total complexity remains:

$$O(n^2)$$

which is far too slow for n = 30000.

Key Insight

Suppose we fix an ordered pair (a, b).

For a substring [l, r], define:

$$A = freq_a(l,r)$$

$$B = freq_b(l,r)$$

We want:

$$A - B$$

with:

  • A odd
  • B even and positive

Using prefix counts:

$$A = preA[r] - preA[l]$$

$$B = preB[r] - preB[l]$$

Therefore:

$$A - B = (preA[r]-preB[r]) - (preA[l]-preB[l])$$

This resembles a maximum subarray style optimization.

The parity constraints depend only on:

$$preA[r] \bmod 2$$

$$preB[r] \bmod 2$$

and

$$preA[l] \bmod 2$$

$$preB[l] \bmod 2$$

because subtraction parity is determined entirely by prefix parities.

For each right endpoint, we need the smallest previously valid value of:

$$preA[l]-preB[l]$$

among prefixes having the required parity pattern.

A sliding window is used to enforce:

  • substring length at least k
  • freq_b > 0

This converts the problem into maintaining minimum prefix values for four parity states.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Enumerates all substrings
Optimal O(20n) = O(n) O(n) Enumerate digit pairs and maintain parity minima

Algorithm Walkthrough

For each ordered pair of distinct digits (a, b):

  1. Build prefix counts for digit a and digit b.
  2. Let:

$$diff[i] = preA[i] - preB[i]$$

for every prefix position. 3. Maintain four minimum values:

best[pa][pb]

where:

  • pa is parity of preA
  • pb is parity of preB

Each entry stores the smallest diff seen among eligible left boundaries. 4. Sweep the string from left to right using position r. 5. Before processing position r, add all left boundaries that satisfy:

  • substring length at least k
  • substring contains at least one b

The second condition can be rewritten as:

$$preB[r] - preB[l] > 0$$

which is equivalent to:

$$preB[l] < preB[r]$$ 6. Every eligible left boundary contributes its parity state:

$$(preA[l]\bmod2,; preB[l]\bmod2)$$

and updates the corresponding minimum diff. 7. For the current right endpoint, determine the parity required for the left boundary:

Since freq_a must be odd,

$$(preA[r]-preA[l])\bmod2 = 1$$

Therefore:

$$parity(preA[l]) = parity(preA[r]) \oplus 1$$

Since freq_b must be even,

$$parity(preB[l]) = parity(preB[r])$$ 8. Query the corresponding parity state from best. 9. If a valid minimum exists, compute:

$$diff[r] - best[state]$$

and update the answer. 10. Repeat for all 20 ordered digit pairs.

Why it Works

For a fixed pair (a,b), every valid substring can be represented by two prefix positions. The value of the substring depends only on the difference between two prefix expressions:

$$(preA-preB)$$

The parity constraints translate into parity requirements on the chosen prefix pair. By maintaining the minimum eligible prefix value for each parity combination, we always obtain the largest possible difference for every right endpoint. Enumerating all ordered digit pairs guarantees that every possible choice of (a,b) is considered, so the global maximum is found.

Python Solution

from typing import List

class Solution:
    def maxDifference(self, s: str, k: int) -> int:
        n = len(s)
        ans = -10**9

        digits = "01234"

        for a in digits:
            for b in digits:
                if a == b:
                    continue

                pre_a = [0] * (n + 1)
                pre_b = [0] * (n + 1)

                for i, ch in enumerate(s):
                    pre_a[i + 1] = pre_a[i] + (ch == a)
                    pre_b[i + 1] = pre_b[i] + (ch == b)

                best = [[float("inf")] * 2 for _ in range(2)]

                left = 0

                for r in range(k, n + 1):
                    while left <= r - k and pre_b[left] < pre_b[r]:
                        pa = pre_a[left] & 1
                        pb = pre_b[left] & 1

                        best[pa][pb] = min(
                            best[pa][pb],
                            pre_a[left] - pre_b[left]
                        )
                        left += 1

                    need_pa = (pre_a[r] & 1) ^ 1
                    need_pb = pre_b[r] & 1

                    candidate = best[need_pa][need_pb]

                    if candidate != float("inf"):
                        ans = max(
                            ans,
                            (pre_a[r] - pre_b[r]) - candidate
                        )

        return ans

Implementation Explanation

The outer two loops enumerate every ordered digit pair (a, b). Since there are only five possible digits, this results in only twenty pairs.

For each pair, prefix counts are built. pre_a[i] stores how many times digit a appears in the first i characters, and similarly for pre_b.

The best array stores the minimum value of:

pre_a[l] - pre_b[l]

for each parity combination of (pre_a[l] % 2, pre_b[l] % 2).

As the right endpoint advances, the while loop inserts newly eligible left boundaries. Eligibility requires both:

  • length at least k
  • at least one occurrence of digit b

The parity conditions determine which state we must query. If a valid state exists, the resulting substring value is computed and used to update the global answer.

Go Solution

func maxDifference(s string, k int) int {
	n := len(s)
	ans := -1000000000
	digits := "01234"

	for _, a := range digits {
		for _, b := range digits {
			if a == b {
				continue
			}

			preA := make([]int, n+1)
			preB := make([]int, n+1)

			for i := 0; i < n; i++ {
				preA[i+1] = preA[i]
				preB[i+1] = preB[i]

				if rune(s[i]) == a {
					preA[i+1]++
				}
				if rune(s[i]) == b {
					preB[i+1]++
				}
			}

			const INF = int(1e9)

			best := [2][2]int{
				{INF, INF},
				{INF, INF},
			}

			left := 0

			for r := k; r <= n; r++ {
				for left <= r-k && preB[left] < preB[r] {
					pa := preA[left] & 1
					pb := preB[left] & 1

					diff := preA[left] - preB[left]
					if diff < best[pa][pb] {
						best[pa][pb] = diff
					}

					left++
				}

				needPA := (preA[r] & 1) ^ 1
				needPB := preB[r] & 1

				candidate := best[needPA][needPB]

				if candidate != INF {
					value := (preA[r] - preB[r]) - candidate
					if value > ans {
						ans = value
					}
				}
			}
		}
	}

	return ans
}

Go-Specific Notes

The logic is identical to the Python implementation.

A fixed-size best[2][2] array is used instead of a nested list. The sentinel value INF represents an uninitialized state. Integer overflow is not a concern because frequencies never exceed 30000.

Worked Examples

Example 1

Input:

s = "12233"
k = 4

Consider pair:

a = '1'
b = '3'

Prefix counts:

i preA preB
0 0 0
1 1 0
2 1 0
3 1 0
4 1 1
5 1 2

At r = 5:

freq('1') = 1
freq('3') = 2

Parity requirements:

1 is odd
2 is even and non-zero

Difference:

1 - 2 = -1

No larger valid value exists.

Answer:

-1

Example 2

Input:

s = "1122211"
k = 3

Optimal substring:

"11222"

Frequencies:

Digit Count
1 2
2 3

Choose:

a = '2'
b = '1'

Then:

3 - 2 = 1

Answer:

1

Example 3

Input:

s = "110"
k = 3

Only substring:

"110"

Frequencies:

Digit Count
1 2
0 1

Choose:

a = '0'
b = '1'

Difference:

1 - 2 = -1

Answer:

-1

Complexity Analysis

Measure Complexity Explanation
Time O(20n) = O(n) Twenty ordered digit pairs, linear scan for each
Space O(n) Prefix arrays for one digit pair

The alphabet contains only five symbols, so the number of ordered pairs is fixed at twenty. For each pair, every index enters and leaves the sliding structure at most once, producing a linear scan. Therefore the total running time is linear in the length of the string.

Test Cases

sol = Solution()

assert sol.maxDifference("12233", 4) == -1      # example 1
assert sol.maxDifference("1122211", 3) == 1     # example 2
assert sol.maxDifference("110", 3) == -1        # example 3

assert sol.maxDifference("000", 1) == -1        # smallest repeated digit case
assert sol.maxDifference("0011", 4) == -1       # entire string only

assert sol.maxDifference("22211", 3) == 1       # odd 2s, even 1s
assert sol.maxDifference("11122", 3) == 1       # symmetric variant

assert sol.maxDifference("01234", 3) >= -1      # all digits distinct

assert sol.maxDifference("2222200", 3) == 3     # large positive difference
assert sol.maxDifference("444440000", 3) == 3   # another positive case

assert sol.maxDifference("1212121212", 2) >= 0  # alternating pattern
assert sol.maxDifference("000111222333444", 5) >= -1  # larger mixed input

Test Summary

Test Why
"12233", 4 Official example 1
"1122211", 3 Official example 2
"110", 3 Official example 3
"000", 1 Single digit repeated
"0011", 4 Whole string must be used
"22211", 3 Positive answer with odd/even frequencies
"11122", 3 Reversed frequency distribution
"01234", 3 All digits unique
"2222200", 3 Large positive difference
"444440000", 3 Multiple repeated digits
"1212121212", 2 Alternating pattern
"000111222333444", 5 Larger mixed-frequency input

Edge Cases

Substrings With No Occurrence of b

A substring where freq[b] = 0 is invalid even though zero is mathematically even. This is a common source of mistakes. The implementation explicitly enforces:

$$freq_b > 0$$

through the condition:

pre_b[left] < pre_b[r]

which guarantees at least one occurrence of b inside the substring.

Negative Answers

The maximum valid difference is not guaranteed to be positive. Example 1 and Example 3 both produce -1. An implementation that initializes the answer to 0 would incorrectly return a non-existent positive result. The solution initializes the answer to a very small value so negative answers are preserved.

Entire String Is the Only Valid Substring

When k = n, only the whole string can be considered. Some algorithms accidentally assume multiple windows exist or fail when the sliding structure receives very few candidates. The prefix-based formulation naturally handles this case because the scan still evaluates the single valid endpoint.

Parity Matching

A very subtle bug is using the wrong parity relationship. For freq[a] to be odd, the parities of the two prefixes must differ. For freq[b] to be even, the parities must match. The implementation derives the required left parity directly from the current right parity:

need_pa = parity(preA[r]) ^ 1
need_pb = parity(preB[r])

which guarantees that every accepted substring satisfies the required parity conditions.