LeetCode 992 - Subarrays with K Different Integers
The problem asks us to count how many contiguous subarrays contain exactly k distinct integers. We are given: - An integer array nums - An integer k A subarray is any continuous portion of the array.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Sliding Window, Counting
Solution
Problem Understanding
The problem asks us to count how many contiguous subarrays contain exactly k distinct integers.
We are given:
- An integer array
nums - An integer
k
A subarray is any continuous portion of the array. For each possible subarray, we need to determine whether it contains exactly k unique values. If it does, it contributes 1 to the answer.
For example, in:
nums = [1,2,1,2,3], k = 2
The subarray [1,2,1] is valid because it contains exactly two distinct integers, 1 and 2.
However:
[1,2,1,2,3]is invalid because it contains three distinct integers[1]is invalid because it contains only one distinct integer
The constraints are important:
nums.lengthcan be as large as2 * 10^4- A brute force solution that checks every subarray individually would be too slow if it requires recomputing distinct counts repeatedly
Since the number of possible subarrays is O(n²), we need something significantly better than quadratic time.
Several edge cases are important:
- Arrays where all elements are identical
- Arrays where every element is unique
k = 1k = nums.length- Very small arrays
- Situations where expanding the window frequently exceeds
kdistinct values
These cases can expose bugs in window shrinking logic or frequency counting.
Approaches
Brute Force Approach
The most direct solution is to generate every possible subarray and count how many distinct integers it contains.
For every starting index i, we extend the subarray to every ending index j. While expanding, we maintain a set or frequency map to track distinct elements.
Whenever the number of distinct elements becomes exactly k, we increment the answer.
This approach is correct because it explicitly checks every subarray. Nothing is missed, and every valid subarray is counted.
However, the time complexity is too large. There are O(n²) subarrays, and although maintaining frequencies can make each extension efficient, the total runtime still becomes quadratic.
With n = 20000, an O(n²) solution is not practical.
Optimal Approach, Sliding Window with At Most K Distinct
The key insight is that counting subarrays with exactly k distinct integers directly is difficult, but counting subarrays with at most k distinct integers is much easier.
We use the identity:
exactly(k) = atMost(k) - atMost(k - 1)
Why does this work?
atMost(k)counts all subarrays containing up tokdistinct integersatMost(k - 1)counts all subarrays containing fewer thankdistinct integers- Their difference leaves only the subarrays containing exactly
kdistinct integers
The sliding window technique works efficiently for the atMost(k) calculation because:
- We expand the right pointer
- We shrink the left pointer whenever distinct values exceed
k - At each position, we can count how many valid subarrays end at the current right index
This transforms the problem into a linear-time solution.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Enumerates every subarray |
| Optimal | O(n) | O(n) | Sliding window using atMost(k) trick |
Algorithm Walkthrough
We first define a helper function:
countAtMost(k)
This function returns the number of subarrays containing at most k distinct integers.
Then:
answer = countAtMost(k) - countAtMost(k - 1)
Step-by-step process
- Initialize a sliding window.
We maintain:
leftpointerrightpointer- frequency map
- number of distinct integers
The window always represents a valid subarray with at most k distinct integers.
2. Expand the window by moving right.
For each new element:
- Add it to the frequency map
- If its frequency becomes
1, we discovered a new distinct integer
- If distinct integers exceed
k, shrink the window.
While distinct count is greater than k:
- Remove
nums[left] - Decrease its frequency
- If frequency becomes
0, remove that distinct integer - Move
leftforward
- Count valid subarrays ending at
right.
Once the window is valid again:
window length = right - left + 1
Every subarray ending at right and starting anywhere between left and right is valid.
Therefore:
result += right - left + 1
- Repeat until the entire array is processed.
- Compute:
exactly(k) = atMost(k) - atMost(k - 1)
Why it works
The sliding window always maintains the invariant that the current window contains at most k distinct integers.
For every position right, all subarrays ending at right and starting between left and right are valid. Since the window is minimal after shrinking, every shorter suffix is also valid.
Subtracting atMost(k - 1) removes all subarrays with fewer than k distinct integers, leaving only those with exactly k.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
def count_at_most(max_distinct: int) -> int:
left = 0
result = 0
frequency = defaultdict(int)
distinct_count = 0
for right in range(len(nums)):
value = nums[right]
if frequency[value] == 0:
distinct_count += 1
frequency[value] += 1
while distinct_count > max_distinct:
left_value = nums[left]
frequency[left_value] -= 1
if frequency[left_value] == 0:
distinct_count -= 1
left += 1
result += right - left + 1
return result
return count_at_most(k) - count_at_most(k - 1)
The implementation follows the exact algorithm described earlier.
The helper function count_at_most() performs the sliding window logic. The frequency map tracks how many times each integer appears inside the current window.
Whenever a new distinct integer enters the window, distinct_count increases. If the window becomes invalid, meaning distinct integers exceed the allowed limit, we repeatedly shrink the left boundary until the constraint is restored.
The key counting step is:
result += right - left + 1
This works because every suffix of the valid window ending at right is also valid.
Finally, the main method applies the formula:
count_at_most(k) - count_at_most(k - 1)
to isolate subarrays containing exactly k distinct integers.
Go Solution
func subarraysWithKDistinct(nums []int, k int) int {
countAtMost := func(maxDistinct int) int {
left := 0
result := 0
frequency := make(map[int]int)
distinctCount := 0
for right := 0; right < len(nums); right++ {
value := nums[right]
if frequency[value] == 0 {
distinctCount++
}
frequency[value]++
for distinctCount > maxDistinct {
leftValue := nums[left]
frequency[leftValue]--
if frequency[leftValue] == 0 {
distinctCount--
}
left++
}
result += right - left + 1
}
return result
}
return countAtMost(k) - countAtMost(k-1)
}
The Go implementation mirrors the Python version closely.
Instead of Python's defaultdict, Go uses a standard map:
frequency := make(map[int]int)
Go maps return the zero value for missing keys, which makes frequency tracking straightforward.
No special handling for nil or empty slices is required because the constraints guarantee at least one element. Integer overflow is also not an issue because the maximum number of subarrays is within Go's standard integer range for the given constraints.
Worked Examples
Example 1
nums = [1,2,1,2,3]
k = 2
We compute:
exactly(2) = atMost(2) - atMost(1)
Computing atMost(2)
| Right | Value | Window | Distinct | Added Count | Total |
|---|---|---|---|---|---|
| 0 | 1 | [1] | 1 | 1 | 1 |
| 1 | 2 | [1,2] | 2 | 2 | 3 |
| 2 | 1 | [1,2,1] | 2 | 3 | 6 |
| 3 | 2 | [1,2,1,2] | 2 | 4 | 10 |
| 4 | 3 | shrink to [2,3] | 2 | 2 | 12 |
Result:
atMost(2) = 12
Computing atMost(1)
| Right | Value | Window | Distinct | Added Count | Total |
|---|---|---|---|---|---|
| 0 | 1 | [1] | 1 | 1 | 1 |
| 1 | 2 | [2] | 1 | 1 | 2 |
| 2 | 1 | [1] | 1 | 1 | 3 |
| 3 | 2 | [2] | 1 | 1 | 4 |
| 4 | 3 | [3] | 1 | 1 | 5 |
Result:
atMost(1) = 5
Final answer:
12 - 5 = 7
Example 2
nums = [1,2,1,3,4]
k = 3
We compute:
exactly(3) = atMost(3) - atMost(2)
Computing atMost(3)
| Right | Value | Window | Distinct | Added Count | Total |
|---|---|---|---|---|---|
| 0 | 1 | [1] | 1 | 1 | 1 |
| 1 | 2 | [1,2] | 2 | 2 | 3 |
| 2 | 1 | [1,2,1] | 2 | 3 | 6 |
| 3 | 3 | [1,2,1,3] | 3 | 4 | 10 |
| 4 | 4 | shrink to [1,3,4] | 3 | 3 | 13 |
Result:
atMost(3) = 13
Computing atMost(2)
| Right | Value | Window | Distinct | Added Count | Total |
|---|---|---|---|---|---|
| 0 | 1 | [1] | 1 | 1 | 1 |
| 1 | 2 | [1,2] | 2 | 2 | 3 |
| 2 | 1 | [1,2,1] | 2 | 3 | 6 |
| 3 | 3 | shrink to [1,3] | 2 | 2 | 8 |
| 4 | 4 | shrink to [3,4] | 2 | 2 | 10 |
Final answer:
13 - 10 = 3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each pointer moves at most n times |
| Space | O(n) | Frequency map may store up to n distinct integers |
The sliding window is efficient because each element enters and leaves the window at most once.
Although there is a nested while loop, the left pointer only moves forward across the entire execution. This means the total number of left pointer movements is bounded by n.
The frequency map stores counts for distinct integers currently inside the window. In the worst case, all elements are unique, requiring O(n) extra space.
Test Cases
from typing import List
class Solution:
def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
from collections import defaultdict
def count_at_most(max_distinct: int) -> int:
left = 0
result = 0
frequency = defaultdict(int)
distinct_count = 0
for right in range(len(nums)):
value = nums[right]
if frequency[value] == 0:
distinct_count += 1
frequency[value] += 1
while distinct_count > max_distinct:
left_value = nums[left]
frequency[left_value] -= 1
if frequency[left_value] == 0:
distinct_count -= 1
left += 1
result += right - left + 1
return result
return count_at_most(k) - count_at_most(k - 1)
solution = Solution()
assert solution.subarraysWithKDistinct([1,2,1,2,3], 2) == 7 # provided example 1
assert solution.subarraysWithKDistinct([1,2,1,3,4], 3) == 3 # provided example 2
assert solution.subarraysWithKDistinct([1,1,1,1], 1) == 10 # all subarrays valid
assert solution.subarraysWithKDistinct([1,2,3,4], 4) == 1 # entire array only
assert solution.subarraysWithKDistinct([1,2,3,4], 1) == 4 # only single-element subarrays
assert solution.subarraysWithKDistinct([1], 1) == 1 # minimum input size
assert solution.subarraysWithKDistinct([1,2,1,2,1], 2) == 10 # many overlapping windows
assert solution.subarraysWithKDistinct([1,2,3], 2) == 2 # simple unique array
assert solution.subarraysWithKDistinct([2,2,2,2], 1) == 10 # repeated single value
assert solution.subarraysWithKDistinct([1,2,1,3,4,2,3], 3) == 6 # mixed transitions
| Test | Why |
|---|---|
[1,2,1,2,3], k=2 |
Verifies provided example |
[1,2,1,3,4], k=3 |
Verifies provided example |
[1,1,1,1], k=1 |
Tests repeated identical elements |
[1,2,3,4], k=4 |
Tests maximum distinct count |
[1,2,3,4], k=1 |
Tests only singleton windows |
[1], k=1 |
Minimum valid input |
[1,2,1,2,1], k=2 |
Tests overlapping valid windows |
[1,2,3], k=2 |
Tests small unique array |
[2,2,2,2], k=1 |
Ensures duplicate handling works |
[1,2,1,3,4,2,3], k=3 |
Complex sliding window behavior |
Edge Cases
One important edge case occurs when all elements are identical.
For example:
nums = [1,1,1,1]
k = 1
Every possible subarray is valid because each contains exactly one distinct integer. A buggy implementation might incorrectly shrink the window or mishandle frequency counts when duplicates appear repeatedly. The sliding window handles this naturally because the distinct count never exceeds 1.
Another important case is when every element is unique.
For example:
nums = [1,2,3,4]
k = 4
Only the entire array is valid. This tests whether the implementation correctly handles the largest possible distinct count. The frequency map grows to size 4, and the algorithm correctly identifies exactly one valid subarray.
A third tricky case occurs when k = 1.
In this situation, only subarrays containing repeated copies of the same number are valid. Many implementations fail here because:
exactly(1) = atMost(1) - atMost(0)
requires the helper function to correctly handle max_distinct = 0. The shrinking logic in this implementation safely reduces the window until it contains zero distinct integers, ensuring correctness.
Another subtle edge case involves frequent window expansion and contraction.
For example:
nums = [1,2,1,3,4]
k = 2
The window repeatedly exceeds the distinct limit and must shrink several times. Incorrect pointer movement or frequency updates can easily produce off-by-one errors. This implementation maintains a strict invariant that the window always contains at most k distinct integers after shrinking, which guarantees accurate counting.