LeetCode 2168 - Unique Substrings With Equal Digit Frequency

The problem gives us a string s consisting only of digits from '0' to '9'. We must count how many distinct substrings satisfy a special condition: Every digit that appears in the substring must appear the same number of times. A substring is a contiguous portion of the string.

LeetCode Problem 2168

Difficulty: 🟡 Medium
Topics: Hash Table, String, Rolling Hash, Counting, Hash Function

Solution

Problem Understanding

The problem gives us a string s consisting only of digits from '0' to '9'. We must count how many distinct substrings satisfy a special condition:

Every digit that appears in the substring must appear the same number of times.

A substring is a contiguous portion of the string. The substring may contain one digit or several different digits. The important requirement is that all digits present in that substring must have identical frequencies.

For example, consider the substring "1212":

  • '1' appears 2 times
  • '2' appears 2 times

Since both digits occur equally often, the substring is valid.

Now consider "122":

  • '1' appears 1 time
  • '2' appears 2 times

The frequencies are not equal, so this substring is invalid.

Another important detail is uniqueness. Even if the same valid substring appears multiple times in the string, it should only be counted once.

For example, in "1212":

  • "12" appears twice
  • It is counted only once

The constraints are relatively small:

  • 1 <= s.length <= 1000
  • s contains only digits

Since the string length is at most 1000, an O(n^2) solution is usually acceptable, while O(n^3) may become too slow. The fact that there are only 10 possible digits is extremely important and enables efficient frequency checking.

Several edge cases are important:

  • A single character substring is always valid because only one digit appears.
  • Repeated substrings must only be counted once.
  • Substrings with many repeated digits may produce duplicate valid patterns.
  • Strings containing all identical digits must still work correctly.
  • Strings with many different digits require careful frequency comparison.

Approaches

Brute Force Approach

The brute force solution generates every possible substring and checks whether all appearing digits have the same frequency.

For each substring:

  1. Count the occurrences of each digit.
  2. Extract the non-zero frequencies.
  3. Verify that all non-zero frequencies are equal.
  4. Store the substring in a set if valid.

There are O(n^2) substrings. Checking each substring naively requires scanning the substring again to compute frequencies, which costs O(n) time.

This leads to an overall complexity of O(n^3).

Although n = 1000 is not enormous, O(n^3) can become too slow because there are roughly one million substrings and each may require another scan.

Optimal Approach

The key observation is that there are only 10 possible digits.

Instead of recomputing frequencies from scratch for every substring, we can expand substrings incrementally.

For each starting index:

  1. Maintain a frequency array of size 10.
  2. Extend the substring one character at a time.
  3. Update the frequency of the newly added digit.
  4. Check whether all non-zero frequencies are equal.

Since checking 10 digits is constant time, each substring can be validated efficiently.

To avoid counting duplicates multiple times, we store valid substrings in a hash set.

Instead of storing the actual substring content directly, we use a rolling hash. This avoids repeatedly creating substring objects and improves efficiency.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(n²) Recomputes frequencies for every substring
Optimal O(n²) O(n²) Incremental counting with rolling hash and constant-size frequency checks

Algorithm Walkthrough

Step 1: Initialize a set for unique valid substrings

We use a hash set to ensure duplicate substrings are counted only once.

Instead of storing raw substring strings, we store rolling hash values.

Step 2: Iterate over every starting position

For each index start:

  • Create a fresh frequency array of size 10
  • Initialize rolling hash variables

This allows us to build substrings incrementally.

Step 3: Extend the substring character by character

For every ending index end starting from start:

  • Convert the digit character into an integer
  • Increment its frequency
  • Update the rolling hash

The rolling hash lets us uniquely represent the substring efficiently.

Step 4: Check whether frequencies are equal

We scan the frequency array:

  • Ignore zero frequencies
  • Record the first non-zero frequency
  • Ensure all other non-zero frequencies match it

If they do, the substring is valid.

Step 5: Add valid substrings to the set

When a substring satisfies the condition:

  • Insert its rolling hash into the set

Since sets automatically remove duplicates, repeated substrings are counted only once.

Step 6: Return the set size

The final answer is simply the number of unique valid substrings stored in the set.

Why it works

The algorithm examines every possible substring exactly once. For each substring, it maintains exact digit frequencies incrementally, so the equality check is always accurate.

A substring is added only if all non-zero digit counts are equal. Because the set stores unique representations of substrings, duplicates are automatically removed.

Therefore, every valid unique substring is counted exactly once.

Python Solution

class Solution:
    def equalDigitFrequency(self, s: str) -> int:
        unique_hashes = set()

        base = 11
        mod = 10**9 + 7

        n = len(s)

        for start in range(n):
            freq = [0] * 10
            rolling_hash = 0

            for end in range(start, n):
                digit = int(s[end])

                freq[digit] += 1

                rolling_hash = (
                    rolling_hash * base + digit + 1
                ) % mod

                target = 0
                valid = True

                for count in freq:
                    if count == 0:
                        continue

                    if target == 0:
                        target = count
                    elif count != target:
                        valid = False
                        break

                if valid:
                    unique_hashes.add(rolling_hash)

        return len(unique_hashes)

The implementation uses two nested loops to enumerate all substrings.

The outer loop fixes the starting position. For each start index, we initialize:

  • A frequency array for digits
  • A rolling hash value

The inner loop progressively extends the substring by one character at a time. This incremental construction is important because it avoids recomputing frequencies from scratch.

Each time a new digit is added:

freq[digit] += 1

The rolling hash is updated using polynomial hashing:

rolling_hash = (
    rolling_hash * base + digit + 1
) % mod

Adding 1 avoids ambiguity involving leading zeros.

Next, the code checks whether all non-zero frequencies are identical. The first non-zero count becomes the reference frequency. Every other non-zero count must match it.

If the substring is valid, its hash is inserted into the set. Since sets only keep unique values, repeated substrings are automatically deduplicated.

Finally, the number of unique valid substrings is returned.

Go Solution

package main

func equalDigitFrequency(s string) int {
    uniqueHashes := make(map[int]bool)

    const base = 11
    const mod = 1000000007

    n := len(s)

    for start := 0; start < n; start++ {
        freq := make([]int, 10)
        rollingHash := 0

        for end := start; end < n; end++ {
            digit := int(s[end] - '0')

            freq[digit]++

            rollingHash = (rollingHash*base + digit + 1) % mod

            target := 0
            valid := true

            for _, count := range freq {
                if count == 0 {
                    continue
                }

                if target == 0 {
                    target = count
                } else if count != target {
                    valid = false
                    break
                }
            }

            if valid {
                uniqueHashes[rollingHash] = true
            }
        }
    }

    return len(uniqueHashes)
}

The Go implementation follows the same algorithmic structure as the Python version.

A map[int]bool is used instead of a Python set. Frequency counting uses a slice of length 10.

Go strings are byte indexed, so digit conversion is performed with:

digit := int(s[end] - '0')

Integer overflow is not an issue because all hash calculations are performed modulo 1e9+7.

Worked Examples

Example 1

Input: s = "1212"

We enumerate substrings starting from each index.

Start = 0

Substring Frequencies Valid Added
"1" {1:1} Yes Yes
"12" {1:1,2:1} Yes Yes
"121" {1:2,2:1} No No
"1212" {1:2,2:2} Yes Yes

Start = 1

Substring Frequencies Valid Added
"2" {2:1} Yes Yes
"21" {2:1,1:1} Yes Yes
"212" {2:2,1:1} No No

Start = 2

Substring Frequencies Valid Added
"1" duplicate Yes Already counted
"12" duplicate Yes Already counted

Start = 3

Substring Frequencies Valid Added
"2" duplicate Yes Already counted

Unique valid substrings:

"1", "2", "12", "21", "1212"

Answer:

5

Example 2

Input: s = "12321"

Start = 0

Substring Frequencies Valid
"1" {1:1} Yes
"12" {1:1,2:1} Yes
"123" {1:1,2:1,3:1} Yes
"1232" {1:1,2:2,3:1} No
"12321" {1:2,2:2,3:1} No

Start = 1

Substring Frequencies Valid
"2" Yes
"23" Yes
"232" No
"2321" No

Start = 2

Substring Frequencies Valid
"3" Yes
"32" Yes
"321" Yes

Start = 3

Substring Frequencies Valid
"2" duplicate
"21" Yes

Start = 4

Substring Frequencies Valid
"1" duplicate

Unique valid substrings:

"1", "2", "3", "12", "23", "32", "21", "123", "321"

Answer:

9

Complexity Analysis

Measure Complexity Explanation
Time O(n²) Every substring is processed once, frequency check is constant time
Space O(n²) Hash set may store many unique substrings

There are O(n²) substrings in total. For each substring, we update frequencies in constant time and scan a fixed-size frequency array of length 10. Since 10 is constant, each validation step is effectively O(1).

The set can contain up to O(n²) unique substrings in the worst case.

Test Cases

sol = Solution()

assert sol.equalDigitFrequency("1212") == 5  # basic repeated pattern
assert sol.equalDigitFrequency("12321") == 9  # multiple unique valid substrings

assert sol.equalDigitFrequency("1") == 1  # single character
assert sol.equalDigitFrequency("1111") == 4  # only repeated single digit substrings
assert sol.equalDigitFrequency("12") == 3  # "1", "2", "12"

assert sol.equalDigitFrequency("000") == 3  # substrings with zeros
assert sol.equalDigitFrequency("1010") == 5  # alternating digits

assert sol.equalDigitFrequency("1234") == 10  # all substrings valid
assert sol.equalDigitFrequency("1221") == 5  # symmetric frequencies

assert sol.equalDigitFrequency("999999") == 6  # repeated same digit
assert sol.equalDigitFrequency("121212") == 8  # many duplicate substrings

assert sol.equalDigitFrequency("9876543210") == 55  # all unique digits

Test Summary

Test Why
"1212" Verifies duplicate substring handling
"12321" Checks mixed valid and invalid substrings
"1" Smallest possible input
"1111" Only one digit repeated
"12" Simple two-digit case
"000" Ensures zero digits work correctly
"1010" Alternating pattern validation
"1234" Every substring is valid
"1221" Frequency mismatch detection
"999999" Large repetition of one digit
"121212" Many repeated valid substrings
"9876543210" Maximum digit diversity

Edge Cases

One important edge case is a string containing only one unique digit, such as "111111". Every substring consists entirely of the same digit, so every substring is valid. However, many substrings are duplicates. A buggy implementation might count every occurrence separately instead of enforcing uniqueness. The hash set prevents this issue because identical substrings produce the same stored representation.

Another important case involves substrings where frequencies become unequal after initially matching. For example, "121" starts as valid for "1" and "12", but becomes invalid once the second '1' appears. Incremental frequency tracking handles this correctly because validation is performed after every extension of the substring.

A third edge case is handling zeros correctly. Since digits range from '0' to '9', the implementation must treat zero like any other digit. The rolling hash uses digit + 1 rather than digit directly to avoid ambiguity involving leading zeros in the hash construction. This ensures substrings like "0" and "00" are represented distinctly.

Another subtle case is duplicate valid substrings appearing at different positions. In "1212", the substring "12" appears twice. Without deduplication, the algorithm would incorrectly count it twice. The set guarantees each valid substring contributes only once to the final answer.