LeetCode 992 - Subarrays with K Different Integers

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Sliding Window, Counting

Solution

Problem Understanding

The problem asks us to count how many contiguous subarrays contain exactly k distinct integers.

We are given:

An integer array nums
An integer k

A subarray is any continuous portion of the array. For each possible subarray, we need to determine whether it contains exactly k unique values. If it does, it contributes 1 to the answer.

For example, in:

nums = [1,2,1,2,3], k = 2

The subarray [1,2,1] is valid because it contains exactly two distinct integers, 1 and 2.

However:

[1,2,1,2,3] is invalid because it contains three distinct integers
[1] is invalid because it contains only one distinct integer

The constraints are important:

nums.length can be as large as 2 * 10^4
A brute force solution that checks every subarray individually would be too slow if it requires recomputing distinct counts repeatedly

Since the number of possible subarrays is O(n²), we need something significantly better than quadratic time.

Several edge cases are important:

Arrays where all elements are identical
Arrays where every element is unique
k = 1
k = nums.length
Very small arrays
Situations where expanding the window frequently exceeds k distinct values

These cases can expose bugs in window shrinking logic or frequency counting.

Approaches

Brute Force Approach

The most direct solution is to generate every possible subarray and count how many distinct integers it contains.

For every starting index i, we extend the subarray to every ending index j. While expanding, we maintain a set or frequency map to track distinct elements.

Whenever the number of distinct elements becomes exactly k, we increment the answer.

This approach is correct because it explicitly checks every subarray. Nothing is missed, and every valid subarray is counted.

However, the time complexity is too large. There are O(n²) subarrays, and although maintaining frequencies can make each extension efficient, the total runtime still becomes quadratic.

With n = 20000, an O(n²) solution is not practical.

Optimal Approach, Sliding Window with At Most K Distinct

The key insight is that counting subarrays with exactly k distinct integers directly is difficult, but counting subarrays with at most k distinct integers is much easier.

We use the identity:

exactly(k) = atMost(k) - atMost(k - 1)

Why does this work?

atMost(k) counts all subarrays containing up to k distinct integers
atMost(k - 1) counts all subarrays containing fewer than k distinct integers
Their difference leaves only the subarrays containing exactly k distinct integers

The sliding window technique works efficiently for the atMost(k) calculation because:

We expand the right pointer
We shrink the left pointer whenever distinct values exceed k
At each position, we can count how many valid subarrays end at the current right index

This transforms the problem into a linear-time solution.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(n)	Enumerates every subarray
Optimal	O(n)	O(n)	Sliding window using atMost(k) trick

Algorithm Walkthrough

We first define a helper function:

countAtMost(k)

This function returns the number of subarrays containing at most k distinct integers.

Then:

answer = countAtMost(k) - countAtMost(k - 1)

Step-by-step process

Initialize a sliding window.

We maintain:

left pointer
right pointer
frequency map
number of distinct integers

The window always represents a valid subarray with at most k distinct integers. 2. Expand the window by moving right.

For each new element:

Add it to the frequency map
If its frequency becomes 1, we discovered a new distinct integer

If distinct integers exceed k, shrink the window.

While distinct count is greater than k:

Remove nums[left]
Decrease its frequency
If frequency becomes 0, remove that distinct integer
Move left forward

Count valid subarrays ending at right.

Once the window is valid again:

window length = right - left + 1

Every subarray ending at right and starting anywhere between left and right is valid.

Therefore:

result += right - left + 1

Repeat until the entire array is processed.
Compute:

exactly(k) = atMost(k) - atMost(k - 1)

Why it works

The sliding window always maintains the invariant that the current window contains at most k distinct integers.

For every position right, all subarrays ending at right and starting between left and right are valid. Since the window is minimal after shrinking, every shorter suffix is also valid.

Subtracting atMost(k - 1) removes all subarrays with fewer than k distinct integers, leaving only those with exactly k.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
        
        def count_at_most(max_distinct: int) -> int:
            left = 0
            result = 0
            
            frequency = defaultdict(int)
            distinct_count = 0
            
            for right in range(len(nums)):
                value = nums[right]
                
                if frequency[value] == 0:
                    distinct_count += 1
                
                frequency[value] += 1
                
                while distinct_count > max_distinct:
                    left_value = nums[left]
                    
                    frequency[left_value] -= 1
                    
                    if frequency[left_value] == 0:
                        distinct_count -= 1
                    
                    left += 1
                
                result += right - left + 1
            
            return result
        
        return count_at_most(k) - count_at_most(k - 1)

The implementation follows the exact algorithm described earlier.

The helper function count_at_most() performs the sliding window logic. The frequency map tracks how many times each integer appears inside the current window.

Whenever a new distinct integer enters the window, distinct_count increases. If the window becomes invalid, meaning distinct integers exceed the allowed limit, we repeatedly shrink the left boundary until the constraint is restored.

The key counting step is:

result += right - left + 1

This works because every suffix of the valid window ending at right is also valid.

Finally, the main method applies the formula:

count_at_most(k) - count_at_most(k - 1)

to isolate subarrays containing exactly k distinct integers.

Go Solution

func subarraysWithKDistinct(nums []int, k int) int {

	countAtMost := func(maxDistinct int) int {
		left := 0
		result := 0

		frequency := make(map[int]int)
		distinctCount := 0

		for right := 0; right < len(nums); right++ {
			value := nums[right]

			if frequency[value] == 0 {
				distinctCount++
			}

			frequency[value]++

			for distinctCount > maxDistinct {
				leftValue := nums[left]

				frequency[leftValue]--

				if frequency[leftValue] == 0 {
					distinctCount--
				}

				left++
			}

			result += right - left + 1
		}

		return result
	}

	return countAtMost(k) - countAtMost(k-1)
}

The Go implementation mirrors the Python version closely.

Instead of Python's defaultdict, Go uses a standard map:

frequency := make(map[int]int)

Go maps return the zero value for missing keys, which makes frequency tracking straightforward.

No special handling for nil or empty slices is required because the constraints guarantee at least one element. Integer overflow is also not an issue because the maximum number of subarrays is within Go's standard integer range for the given constraints.

Worked Examples

Example 1

nums = [1,2,1,2,3]
k = 2

We compute:

exactly(2) = atMost(2) - atMost(1)

Computing atMost(2)

Right	Value	Window	Distinct	Added Count	Total
0	1	[1]	1	1	1
1	2	[1,2]	2	2	3
2	1	[1,2,1]	2	3	6
3	2	[1,2,1,2]	2	4	10
4	3	shrink to [2,3]	2	2	12

Result:

atMost(2) = 12

Computing atMost(1)

Right	Value	Window	Distinct	Added Count	Total
0	1	[1]	1	1	1
1	2	[2]	1	1	2
2	1	[1]	1	1	3
3	2	[2]	1	1	4
4	3	[3]	1	1	5

Result:

atMost(1) = 5

Final answer:

12 - 5 = 7

Example 2

nums = [1,2,1,3,4]
k = 3

We compute:

exactly(3) = atMost(3) - atMost(2)

Computing atMost(3)

Right	Value	Window	Distinct	Added Count	Total
0	1	[1]	1	1	1
1	2	[1,2]	2	2	3
2	1	[1,2,1]	2	3	6
3	3	[1,2,1,3]	3	4	10
4	4	shrink to [1,3,4]	3	3	13

Result:

atMost(3) = 13

Computing atMost(2)

Right	Value	Window	Distinct	Added Count	Total
0	1	[1]	1	1	1
1	2	[1,2]	2	2	3
2	1	[1,2,1]	2	3	6
3	3	shrink to [1,3]	2	2	8
4	4	shrink to [3,4]	2	2	10

Final answer:

13 - 10 = 3

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each pointer moves at most n times
Space	O(n)	Frequency map may store up to n distinct integers

The sliding window is efficient because each element enters and leaves the window at most once.

Although there is a nested while loop, the left pointer only moves forward across the entire execution. This means the total number of left pointer movements is bounded by n.

The frequency map stores counts for distinct integers currently inside the window. In the worst case, all elements are unique, requiring O(n) extra space.

Test Cases

from typing import List

class Solution:
    def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
        
        from collections import defaultdict
        
        def count_at_most(max_distinct: int) -> int:
            left = 0
            result = 0
            
            frequency = defaultdict(int)
            distinct_count = 0
            
            for right in range(len(nums)):
                value = nums[right]
                
                if frequency[value] == 0:
                    distinct_count += 1
                
                frequency[value] += 1
                
                while distinct_count > max_distinct:
                    left_value = nums[left]
                    
                    frequency[left_value] -= 1
                    
                    if frequency[left_value] == 0:
                        distinct_count -= 1
                    
                    left += 1
                
                result += right - left + 1
            
            return result
        
        return count_at_most(k) - count_at_most(k - 1)

solution = Solution()

assert solution.subarraysWithKDistinct([1,2,1,2,3], 2) == 7  # provided example 1

assert solution.subarraysWithKDistinct([1,2,1,3,4], 3) == 3  # provided example 2

assert solution.subarraysWithKDistinct([1,1,1,1], 1) == 10  # all subarrays valid

assert solution.subarraysWithKDistinct([1,2,3,4], 4) == 1  # entire array only

assert solution.subarraysWithKDistinct([1,2,3,4], 1) == 4  # only single-element subarrays

assert solution.subarraysWithKDistinct([1], 1) == 1  # minimum input size

assert solution.subarraysWithKDistinct([1,2,1,2,1], 2) == 10  # many overlapping windows

assert solution.subarraysWithKDistinct([1,2,3], 2) == 2  # simple unique array

assert solution.subarraysWithKDistinct([2,2,2,2], 1) == 10  # repeated single value

assert solution.subarraysWithKDistinct([1,2,1,3,4,2,3], 3) == 6  # mixed transitions

Test	Why
`[1,2,1,2,3], k=2`	Verifies provided example
`[1,2,1,3,4], k=3`	Verifies provided example
`[1,1,1,1], k=1`	Tests repeated identical elements
`[1,2,3,4], k=4`	Tests maximum distinct count
`[1,2,3,4], k=1`	Tests only singleton windows
`[1], k=1`	Minimum valid input
`[1,2,1,2,1], k=2`	Tests overlapping valid windows
`[1,2,3], k=2`	Tests small unique array
`[2,2,2,2], k=1`	Ensures duplicate handling works
`[1,2,1,3,4,2,3], k=3`	Complex sliding window behavior

Edge Cases

One important edge case occurs when all elements are identical.

For example:

nums = [1,1,1,1]
k = 1

Every possible subarray is valid because each contains exactly one distinct integer. A buggy implementation might incorrectly shrink the window or mishandle frequency counts when duplicates appear repeatedly. The sliding window handles this naturally because the distinct count never exceeds 1.

Another important case is when every element is unique.

For example:

nums = [1,2,3,4]
k = 4

Only the entire array is valid. This tests whether the implementation correctly handles the largest possible distinct count. The frequency map grows to size 4, and the algorithm correctly identifies exactly one valid subarray.

A third tricky case occurs when k = 1.

In this situation, only subarrays containing repeated copies of the same number are valid. Many implementations fail here because:

exactly(1) = atMost(1) - atMost(0)

requires the helper function to correctly handle max_distinct = 0. The shrinking logic in this implementation safely reduces the window until it contains zero distinct integers, ensuring correctness.

Another subtle edge case involves frequent window expansion and contraction.

For example:

nums = [1,2,1,3,4]
k = 2

The window repeatedly exceeds the distinct limit and must shrink several times. Incorrect pointer movement or frequency updates can easily produce off-by-one errors. This implementation maintains a strict invariant that the window always contains at most k distinct integers after shrinking, which guarantees accurate counting.