LeetCode 2168 - Unique Substrings With Equal Digit Frequency
The problem gives us a string s consisting only of digits from '0' to '9'. We must count how many distinct substrings satisfy a special condition: Every digit that appears in the substring must appear the same number of times. A substring is a contiguous portion of the string.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Rolling Hash, Counting, Hash Function
Solution
Problem Understanding
The problem gives us a string s consisting only of digits from '0' to '9'. We must count how many distinct substrings satisfy a special condition:
Every digit that appears in the substring must appear the same number of times.
A substring is a contiguous portion of the string. The substring may contain one digit or several different digits. The important requirement is that all digits present in that substring must have identical frequencies.
For example, consider the substring "1212":
'1'appears 2 times'2'appears 2 times
Since both digits occur equally often, the substring is valid.
Now consider "122":
'1'appears 1 time'2'appears 2 times
The frequencies are not equal, so this substring is invalid.
Another important detail is uniqueness. Even if the same valid substring appears multiple times in the string, it should only be counted once.
For example, in "1212":
"12"appears twice- It is counted only once
The constraints are relatively small:
1 <= s.length <= 1000scontains only digits
Since the string length is at most 1000, an O(n^2) solution is usually acceptable, while O(n^3) may become too slow. The fact that there are only 10 possible digits is extremely important and enables efficient frequency checking.
Several edge cases are important:
- A single character substring is always valid because only one digit appears.
- Repeated substrings must only be counted once.
- Substrings with many repeated digits may produce duplicate valid patterns.
- Strings containing all identical digits must still work correctly.
- Strings with many different digits require careful frequency comparison.
Approaches
Brute Force Approach
The brute force solution generates every possible substring and checks whether all appearing digits have the same frequency.
For each substring:
- Count the occurrences of each digit.
- Extract the non-zero frequencies.
- Verify that all non-zero frequencies are equal.
- Store the substring in a set if valid.
There are O(n^2) substrings. Checking each substring naively requires scanning the substring again to compute frequencies, which costs O(n) time.
This leads to an overall complexity of O(n^3).
Although n = 1000 is not enormous, O(n^3) can become too slow because there are roughly one million substrings and each may require another scan.
Optimal Approach
The key observation is that there are only 10 possible digits.
Instead of recomputing frequencies from scratch for every substring, we can expand substrings incrementally.
For each starting index:
- Maintain a frequency array of size 10.
- Extend the substring one character at a time.
- Update the frequency of the newly added digit.
- Check whether all non-zero frequencies are equal.
Since checking 10 digits is constant time, each substring can be validated efficiently.
To avoid counting duplicates multiple times, we store valid substrings in a hash set.
Instead of storing the actual substring content directly, we use a rolling hash. This avoids repeatedly creating substring objects and improves efficiency.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n³) | O(n²) | Recomputes frequencies for every substring |
| Optimal | O(n²) | O(n²) | Incremental counting with rolling hash and constant-size frequency checks |
Algorithm Walkthrough
Step 1: Initialize a set for unique valid substrings
We use a hash set to ensure duplicate substrings are counted only once.
Instead of storing raw substring strings, we store rolling hash values.
Step 2: Iterate over every starting position
For each index start:
- Create a fresh frequency array of size 10
- Initialize rolling hash variables
This allows us to build substrings incrementally.
Step 3: Extend the substring character by character
For every ending index end starting from start:
- Convert the digit character into an integer
- Increment its frequency
- Update the rolling hash
The rolling hash lets us uniquely represent the substring efficiently.
Step 4: Check whether frequencies are equal
We scan the frequency array:
- Ignore zero frequencies
- Record the first non-zero frequency
- Ensure all other non-zero frequencies match it
If they do, the substring is valid.
Step 5: Add valid substrings to the set
When a substring satisfies the condition:
- Insert its rolling hash into the set
Since sets automatically remove duplicates, repeated substrings are counted only once.
Step 6: Return the set size
The final answer is simply the number of unique valid substrings stored in the set.
Why it works
The algorithm examines every possible substring exactly once. For each substring, it maintains exact digit frequencies incrementally, so the equality check is always accurate.
A substring is added only if all non-zero digit counts are equal. Because the set stores unique representations of substrings, duplicates are automatically removed.
Therefore, every valid unique substring is counted exactly once.
Python Solution
class Solution:
def equalDigitFrequency(self, s: str) -> int:
unique_hashes = set()
base = 11
mod = 10**9 + 7
n = len(s)
for start in range(n):
freq = [0] * 10
rolling_hash = 0
for end in range(start, n):
digit = int(s[end])
freq[digit] += 1
rolling_hash = (
rolling_hash * base + digit + 1
) % mod
target = 0
valid = True
for count in freq:
if count == 0:
continue
if target == 0:
target = count
elif count != target:
valid = False
break
if valid:
unique_hashes.add(rolling_hash)
return len(unique_hashes)
The implementation uses two nested loops to enumerate all substrings.
The outer loop fixes the starting position. For each start index, we initialize:
- A frequency array for digits
- A rolling hash value
The inner loop progressively extends the substring by one character at a time. This incremental construction is important because it avoids recomputing frequencies from scratch.
Each time a new digit is added:
freq[digit] += 1
The rolling hash is updated using polynomial hashing:
rolling_hash = (
rolling_hash * base + digit + 1
) % mod
Adding 1 avoids ambiguity involving leading zeros.
Next, the code checks whether all non-zero frequencies are identical. The first non-zero count becomes the reference frequency. Every other non-zero count must match it.
If the substring is valid, its hash is inserted into the set. Since sets only keep unique values, repeated substrings are automatically deduplicated.
Finally, the number of unique valid substrings is returned.
Go Solution
package main
func equalDigitFrequency(s string) int {
uniqueHashes := make(map[int]bool)
const base = 11
const mod = 1000000007
n := len(s)
for start := 0; start < n; start++ {
freq := make([]int, 10)
rollingHash := 0
for end := start; end < n; end++ {
digit := int(s[end] - '0')
freq[digit]++
rollingHash = (rollingHash*base + digit + 1) % mod
target := 0
valid := true
for _, count := range freq {
if count == 0 {
continue
}
if target == 0 {
target = count
} else if count != target {
valid = false
break
}
}
if valid {
uniqueHashes[rollingHash] = true
}
}
}
return len(uniqueHashes)
}
The Go implementation follows the same algorithmic structure as the Python version.
A map[int]bool is used instead of a Python set. Frequency counting uses a slice of length 10.
Go strings are byte indexed, so digit conversion is performed with:
digit := int(s[end] - '0')
Integer overflow is not an issue because all hash calculations are performed modulo 1e9+7.
Worked Examples
Example 1
Input: s = "1212"
We enumerate substrings starting from each index.
Start = 0
| Substring | Frequencies | Valid | Added |
|---|---|---|---|
"1" |
{1:1} |
Yes | Yes |
"12" |
{1:1,2:1} |
Yes | Yes |
"121" |
{1:2,2:1} |
No | No |
"1212" |
{1:2,2:2} |
Yes | Yes |
Start = 1
| Substring | Frequencies | Valid | Added |
|---|---|---|---|
"2" |
{2:1} |
Yes | Yes |
"21" |
{2:1,1:1} |
Yes | Yes |
"212" |
{2:2,1:1} |
No | No |
Start = 2
| Substring | Frequencies | Valid | Added |
|---|---|---|---|
"1" |
duplicate | Yes | Already counted |
"12" |
duplicate | Yes | Already counted |
Start = 3
| Substring | Frequencies | Valid | Added |
|---|---|---|---|
"2" |
duplicate | Yes | Already counted |
Unique valid substrings:
"1", "2", "12", "21", "1212"
Answer:
5
Example 2
Input: s = "12321"
Start = 0
| Substring | Frequencies | Valid |
|---|---|---|
"1" |
{1:1} |
Yes |
"12" |
{1:1,2:1} |
Yes |
"123" |
{1:1,2:1,3:1} |
Yes |
"1232" |
{1:1,2:2,3:1} |
No |
"12321" |
{1:2,2:2,3:1} |
No |
Start = 1
| Substring | Frequencies | Valid |
|---|---|---|
"2" |
Yes | |
"23" |
Yes | |
"232" |
No | |
"2321" |
No |
Start = 2
| Substring | Frequencies | Valid |
|---|---|---|
"3" |
Yes | |
"32" |
Yes | |
"321" |
Yes |
Start = 3
| Substring | Frequencies | Valid |
|---|---|---|
"2" |
duplicate | |
"21" |
Yes |
Start = 4
| Substring | Frequencies | Valid |
|---|---|---|
"1" |
duplicate |
Unique valid substrings:
"1", "2", "3", "12", "23", "32", "21", "123", "321"
Answer:
9
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n²) | Every substring is processed once, frequency check is constant time |
| Space | O(n²) | Hash set may store many unique substrings |
There are O(n²) substrings in total. For each substring, we update frequencies in constant time and scan a fixed-size frequency array of length 10. Since 10 is constant, each validation step is effectively O(1).
The set can contain up to O(n²) unique substrings in the worst case.
Test Cases
sol = Solution()
assert sol.equalDigitFrequency("1212") == 5 # basic repeated pattern
assert sol.equalDigitFrequency("12321") == 9 # multiple unique valid substrings
assert sol.equalDigitFrequency("1") == 1 # single character
assert sol.equalDigitFrequency("1111") == 4 # only repeated single digit substrings
assert sol.equalDigitFrequency("12") == 3 # "1", "2", "12"
assert sol.equalDigitFrequency("000") == 3 # substrings with zeros
assert sol.equalDigitFrequency("1010") == 5 # alternating digits
assert sol.equalDigitFrequency("1234") == 10 # all substrings valid
assert sol.equalDigitFrequency("1221") == 5 # symmetric frequencies
assert sol.equalDigitFrequency("999999") == 6 # repeated same digit
assert sol.equalDigitFrequency("121212") == 8 # many duplicate substrings
assert sol.equalDigitFrequency("9876543210") == 55 # all unique digits
Test Summary
| Test | Why |
|---|---|
"1212" |
Verifies duplicate substring handling |
"12321" |
Checks mixed valid and invalid substrings |
"1" |
Smallest possible input |
"1111" |
Only one digit repeated |
"12" |
Simple two-digit case |
"000" |
Ensures zero digits work correctly |
"1010" |
Alternating pattern validation |
"1234" |
Every substring is valid |
"1221" |
Frequency mismatch detection |
"999999" |
Large repetition of one digit |
"121212" |
Many repeated valid substrings |
"9876543210" |
Maximum digit diversity |
Edge Cases
One important edge case is a string containing only one unique digit, such as "111111". Every substring consists entirely of the same digit, so every substring is valid. However, many substrings are duplicates. A buggy implementation might count every occurrence separately instead of enforcing uniqueness. The hash set prevents this issue because identical substrings produce the same stored representation.
Another important case involves substrings where frequencies become unequal after initially matching. For example, "121" starts as valid for "1" and "12", but becomes invalid once the second '1' appears. Incremental frequency tracking handles this correctly because validation is performed after every extension of the substring.
A third edge case is handling zeros correctly. Since digits range from '0' to '9', the implementation must treat zero like any other digit. The rolling hash uses digit + 1 rather than digit directly to avoid ambiguity involving leading zeros in the hash construction. This ensures substrings like "0" and "00" are represented distinctly.
Another subtle case is duplicate valid substrings appearing at different positions. In "1212", the substring "12" appears twice. Without deduplication, the algorithm would incorrectly count it twice. The set guarantees each valid substring contributes only once to the final answer.