LeetCode 3859 - Count Subarrays With K Distinct Integers
This problem asks us to count subarrays of a given integer array nums such that each subarray contains exactly k distinct integers, and each distinct integer in the subarray appears at least m times.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Sliding Window, Counting
Solution
Problem Understanding
This problem asks us to count subarrays of a given integer array nums such that each subarray contains exactly k distinct integers, and each distinct integer in the subarray appears at least m times. In other words, for a subarray to be valid, it must satisfy two constraints: a cardinality constraint (exactly k distinct numbers) and a frequency constraint (each distinct number appears ≥ m times).
The input nums is an array of integers, with a length up to 10^5. Each element is also up to 10^5. The integers k and m are positive and bounded by the array length. The output is a single integer representing the number of valid subarrays.
Important edge cases include arrays where k > length of nums or m is larger than the frequency of any element, as these make valid subarrays impossible. Another subtle case is when nums contains repeated elements, because a naive approach might miss subarrays that overlap or fail to count them multiple times if needed.
Approaches
The brute-force approach would involve generating all possible subarrays and counting the distinct integers and their frequencies for each subarray. While this guarantees correctness, it is infeasible because the number of subarrays of an array of length n is O(n^2), and checking the frequency for each subarray adds another O(n) in the worst case. This leads to an O(n^3) algorithm, which will not work for n up to 10^5.
The key insight for an optimal approach is that we can use a sliding window with a frequency hashmap to efficiently maintain the count of distinct integers and their frequencies in the current window. By carefully expanding and contracting the window, we can count subarrays satisfying exactly k distinct numbers, all appearing at least m times, without generating every subarray explicitly. The problem can be reduced to counting windows where the frequency condition is satisfied, leveraging two pointers and a hashmap to track counts.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^3) | O(n) | Generate all subarrays, count distinct numbers and frequencies for each |
| Optimal Sliding Window | O(n) amortized | O(n) | Use two pointers and hashmap to maintain counts; count valid subarrays efficiently |
Algorithm Walkthrough
- Initialize two pointers
leftandrightto represent the sliding window. Create a hashmapfreqto store the count of each integer in the current window, and a variabledistinctValidto track how many integers currently satisfy the≥ mfrequency requirement. - Move the
rightpointer through the array, addingnums[right]tofreq. If adding this element makes its frequency exactlym, incrementdistinctValidbecause this integer now satisfies the frequency requirement. - While the window contains more than
kdistinct integers, incrementleftto shrink the window. If removingnums[left]drops its frequency belowm, decrementdistinctValid. - Whenever the window has exactly
kdistinct integers and all of them appear at leastmtimes (distinctValid == k), count the number of valid subarrays ending atright. All subarrays starting fromleftup to the currentleftBoundare valid because increasingleftfurther would violate the distinct count or frequency condition. - Continue expanding
rightand adjustingleftuntil the end of the array. Sum up the valid subarrays found.
Why it works: The sliding window maintains the invariant that all elements inside satisfy the minimum frequency condition and that distinct counts are tracked. By counting subarrays dynamically as the window expands, we avoid recomputing frequencies for overlapping windows, ensuring correctness and efficiency.
Python Solution
class Solution:
def countSubarrays(self, nums: list[int], k: int, m: int) -> int:
from collections import defaultdict
freq = defaultdict(int)
left = 0
count = 0
distinctValid = 0
n = len(nums)
for right in range(n):
freq[nums[right]] += 1
if freq[nums[right]] == m:
distinctValid += 1
while len(freq) > k:
if freq[nums[left]] == m:
distinctValid -= 1
freq[nums[left]] -= 1
if freq[nums[left]] == 0:
del freq[nums[left]]
left += 1
# Check if current window is valid
if len(freq) == k and distinctValid == k:
tempLeft = left
while tempLeft <= right and all(v >= m for v in freq.values()):
count += 1
freq[nums[tempLeft]] -= 1
if freq[nums[tempLeft]] == m - 1:
distinctValid -= 1
if freq[nums[tempLeft]] == 0:
del freq[nums[tempLeft]]
tempLeft += 1
# Restore freq and distinctValid
for i in range(left, tempLeft):
freq[nums[i]] += 1
if freq[nums[i]] == m:
distinctValid += 1
return count
The Python implementation maintains a sliding window with two pointers. freq tracks frequencies, and distinctValid ensures each distinct integer meets the minimum frequency. When a valid window is found, the inner loop counts all subarrays by shrinking from the left while still satisfying the frequency requirement, then restores the state for the next iteration.
Go Solution
func countSubarrays(nums []int, k int, m int) int64 {
freq := make(map[int]int)
var count int64
left := 0
distinctValid := 0
for right := 0; right < len(nums); right++ {
freq[nums[right]]++
if freq[nums[right]] == m {
distinctValid++
}
for len(freq) > k {
if freq[nums[left]] == m {
distinctValid--
}
freq[nums[left]]--
if freq[nums[left]] == 0 {
delete(freq, nums[left])
}
left++
}
if len(freq) == k && distinctValid == k {
tempLeft := left
for tempLeft <= right {
valid := true
for _, v := range freq {
if v < m {
valid = false
break
}
}
if !valid {
break
}
count++
if freq[nums[tempLeft]] == m {
distinctValid--
}
freq[nums[tempLeft]]--
if freq[nums[tempLeft]] == 0 {
delete(freq, nums[tempLeft])
}
tempLeft++
}
for i := left; i < tempLeft; i++ {
freq[nums[i]]++
if freq[nums[i]] == m {
distinctValid++
}
}
}
}
return count
}
The Go version mirrors the Python logic. Go requires explicit deletion from the map and careful handling of integer types to avoid overflow. Slice indexing is straightforward and maps replace Python's defaultdict.
Worked Examples
Example 1: nums = [1,2,1,2,2], k = 2, m = 2
| Right | Window | freq | distinctValid | Valid Subarrays Added |
|---|---|---|---|---|
| 0 | [1] | {1:1} | 0 | 0 |
| 1 | [1,2] | {1:1,2:1} | 0 | 0 |
| 2 | [1,2,1] | {1:2,2:1} | 1 | 0 |
| 3 | [1,2,1,2] | {1:2,2:2} | 2 | +1 |
| 4 | [1,2,1,2,2] | {1:2,2:3} | 2 | +1 |
Total count = 2
Example 2: nums = [3,1,2,4], k = 2, m = 1
| Right | Window | freq | distinctValid | Valid Subarrays Added |
|---|---|---|---|---|
| 0 | [3] | {3:1} | 1 | 0 |
| 1 | [3,1] | {3:1,1:1} | 2 | +1 |
| 2 | [1,2] | {1:1,2:1} | 2 | +1 |
| 3 | [2,4] | {2:1,4:1} | 2 | +1 |
Total count = 3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) amortized | Each element is added and removed at most twice from the sliding window and frequency map |
| Space | O(n) | Hashmap stores counts of elements in the current window; at worst all elements are distinct |
The amortized time is O(n)