LeetCode 2067 - Number of Equal Count Substrings

The problem asks us to count the number of substrings in a string s such that every unique character in the substring occurs exactly count times. A substring is a contiguous segment of the string, so we are only considering consecutive characters.

LeetCode Problem 2067

Difficulty: 🟡 Medium
Topics: Hash Table, String, Sliding Window, Counting

Solution

Problem Understanding

The problem asks us to count the number of substrings in a string s such that every unique character in the substring occurs exactly count times. A substring is a contiguous segment of the string, so we are only considering consecutive characters.

Inputs are:

  • s: a string of lowercase English letters, length up to 3 * 10^4.
  • count: an integer from 1 to 3 * 10^4, representing the exact number of times each letter in a substring must appear.

The output is a single integer, the number of substrings satisfying the condition.

Key insights about the problem:

  • Substrings can overlap; the same character may appear in multiple valid substrings.
  • If count is greater than the total occurrences of any character, that character cannot contribute to a valid substring.
  • Maximum substring length is determined by count times the number of unique characters in that substring.
  • Edge cases include very short strings, very large count, and strings where no character occurs count times.

Potential traps:

  • Brute-force enumeration of all substrings will be too slow for s of length up to 3 * 10^4.
  • Must carefully manage frequency counts when expanding or shrinking substrings.

Approaches

Brute Force

The brute-force approach would generate all possible substrings of s and, for each substring, count the occurrences of each character. If all unique characters occur exactly count times, increment the answer.

While correct, this is too slow. Generating all substrings has O(n²) complexity, and counting character frequencies in each substring adds an O(n) factor, resulting in O(n³) time, which is infeasible for n up to 30,000.

Optimal Approach

The key insight is that valid substrings must have lengths that are multiples of count. Let the number of unique characters in a substring be k. Then the substring length must be exactly k * count. This observation allows us to enumerate the number of unique characters k from 1 to 26 and use a sliding window of length k * count to efficiently check whether the substring meets the criteria.

The sliding window keeps a frequency count of letters in the window and checks if all present letters have frequency exactly count. By moving the window across the string, we can efficiently count valid substrings.

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(26) Enumerate all substrings, count frequencies
Optimal O(26 * n) = O(n) O(26) = O(1) Sliding window for each possible unique character count

Algorithm Walkthrough

  1. Loop over possible numbers of unique characters k from 1 to 26. This represents the number of distinct letters in a valid substring.
  2. Compute the required window length window_len = k * count. Skip if window_len > len(s) since no substring can be longer than the string.
  3. Use a sliding window of length window_len to traverse s. Maintain a hash map or array of size 26 to count letter frequencies in the current window.
  4. Also maintain a count of letters that appear exactly count times in the window.
  5. For each window, check if the number of letters with exactly count matches k. If yes, increment the answer.
  6. Move the window one character at a time: add the new character to the count map and remove the old character. Update the "exact count" tracker accordingly.
  7. Continue until all windows for the current k are processed, then move to the next k.

Why it works:

By iterating over the possible numbers of unique letters, we guarantee that all potential substring lengths are considered. The sliding window ensures that each substring is checked in O(1) per shift using the frequency map and exact-count tracker, giving an efficient solution.

Python Solution

class Solution:
    def equalCountSubstrings(self, s: str, count: int) -> int:
        result = 0
        n = len(s)
        for unique_chars in range(1, 27):
            window_len = unique_chars * count
            if window_len > n:
                break
            freq = [0] * 26
            exact_count = 0
            for i in range(window_len):
                idx = ord(s[i]) - ord('a')
                freq[idx] += 1
                if freq[idx] == count:
                    exact_count += 1
                elif freq[idx] == count + 1:
                    exact_count -= 1
            if exact_count == unique_chars:
                result += 1
            for i in range(window_len, n):
                idx_add = ord(s[i]) - ord('a')
                idx_remove = ord(s[i - window_len]) - ord('a')
                
                freq[idx_add] += 1
                if freq[idx_add] == count:
                    exact_count += 1
                elif freq[idx_add] == count + 1:
                    exact_count -= 1
                
                freq[idx_remove] -= 1
                if freq[idx_remove] == count:
                    exact_count += 1
                elif freq[idx_remove] == count - 1:
                    exact_count -= 1
                
                if exact_count == unique_chars:
                    result += 1
        return result

The code iterates over possible unique character counts and slides a window of length unique_chars * count. The frequency array freq tracks the occurrences of each character, and exact_count tracks how many characters have frequency exactly count. This allows checking the condition in O(1) per window shift.

Go Solution

func equalCountSubstrings(s string, count int) int {
    result := 0
    n := len(s)
    for uniqueChars := 1; uniqueChars <= 26; uniqueChars++ {
        windowLen := uniqueChars * count
        if windowLen > n {
            break
        }
        freq := [26]int{}
        exactCount := 0
        for i := 0; i < windowLen; i++ {
            idx := s[i] - 'a'
            freq[idx]++
            if freq[idx] == count {
                exactCount++
            } else if freq[idx] == count+1 {
                exactCount--
            }
        }
        if exactCount == uniqueChars {
            result++
        }
        for i := windowLen; i < n; i++ {
            idxAdd := s[i] - 'a'
            idxRemove := s[i-windowLen] - 'a'
            
            freq[idxAdd]++
            if freq[idxAdd] == count {
                exactCount++
            } else if freq[idxAdd] == count+1 {
                exactCount--
            }
            
            freq[idxRemove]--
            if freq[idxRemove] == count {
                exactCount++
            } else if freq[idxRemove] == count-1 {
                exactCount--
            }
            
            if exactCount == uniqueChars {
                result++
            }
        }
    }
    return result
}

Go-specific notes: We use a fixed-size array [26]int instead of a map for frequency tracking. Arithmetic on byte values ensures correct indexing. No nil handling is needed because slices are never nil.

Worked Examples

Example 1: s = "aaabcbbcc", count = 3

Step Window Frequencies Exact Count Result
unique_chars = 1 "aaa" a=3 1 1
unique_chars = 2 "bcbbcc" b=3, c=3 2 2
unique_chars = 3 "aaabcbbcc" a=3, b=3, c=3 3 3

Final answer: 3

Example 2: s = "abcd", count = 2

All substrings too short to satisfy count = 2. Result: 0

Example 3: s = "a", count = 5

Single character, cannot reach count. Result: 0

Complexity Analysis

Measure Complexity Explanation
Time O(26 * n) = O(n) Loop over 26 possible unique character counts, slide window over n-length string
Space O(26) = O(1) Fixed-size frequency array

The algorithm is efficient because the 26 factor is constant, and each window operation is O(1).

Test Cases

sol = Solution()

# Provided examples
assert sol.equalCountSubstrings("aaabcbbcc", 3) == 3  # multiple valid substrings
assert sol.equalCountSubstrings("abcd", 2) == 0       # no substring satisfies count
assert sol.equalCountSubstrings("a", 5) == 0          # count larger than string length

# Edge cases
assert sol.equalCountSubstrings("aaa", 1) == 3        # every single character is a valid substring
assert sol.equalCountSubstrings("abcabcabc", 3) == 1  # only full string matches
assert sol.equalCountSubstrings("aabbcc", 2) == 1     # full string only
assert sol.equalCountSubstrings("abc", 1) == 6