LeetCode 3795 - Minimum Subarray Length With Distinct Sum At Least K
This problem asks us to find the shortest contiguous subarray whose distinct-value sum is at least k. The important detail is that we do not sum all elements in the subarray.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, Sliding Window
Solution
LeetCode 3795: Minimum Subarray Length With Distinct Sum At Least K
Problem Understanding
This problem asks us to find the shortest contiguous subarray whose distinct-value sum is at least k.
The important detail is that we do not sum all elements in the subarray. Instead, each distinct value contributes to the sum exactly once, regardless of how many times it appears.
For example, consider the subarray [2, 2, 3, 2].
The distinct values are {2, 3}, so the distinct-value sum is:
2 + 3 = 5
The repeated occurrences of 2 do not increase the sum.
The input consists of:
nums, an array of positive integers.k, a target value.
We must return the minimum length of a contiguous subarray whose distinct-value sum is at least k. If no such subarray exists, we return -1.
The constraints are large:
nums.lengthcan be as large as100,000.- Each value can be up to
100,000. kcan be up to1,000,000,000.
These constraints immediately rule out quadratic solutions. Any approach that examines all subarrays will be too slow.
Several edge cases are worth noting:
- A single element may already satisfy the requirement.
- The entire array may still fail to reach
k. - Many duplicate values may appear, which means the distinct-value sum changes only when a value first enters or completely leaves the current window.
- Since all numbers are positive, increasing the window can only increase or preserve the distinct-value sum, never decrease it. This observation is crucial for the optimal solution.
Approaches
Brute Force
A straightforward solution is to examine every possible subarray.
For each starting index, we extend the ending index one position at a time. While extending, we maintain a set of distinct values and compute the distinct-value sum.
Whenever the distinct-value sum becomes at least k, we update the answer.
This approach is correct because every subarray is checked. However, there are O(n²) subarrays, making it far too slow for n = 100,000.
Key Insight
The key observation is that all numbers are positive.
Suppose we maintain a sliding window. The distinct-value sum of the window is the sum of values that currently appear at least once inside the window.
When we expand the right boundary:
- If a value appears for the first time in the window, the distinct-value sum increases by that value.
- Otherwise, the distinct-value sum stays unchanged.
When we move the left boundary:
- If a value still exists elsewhere in the window, the distinct-value sum stays unchanged.
- If its frequency becomes zero, that value is no longer present, so we subtract it from the distinct-value sum.
Because all values are positive, the distinct-value sum behaves monotonically with respect to window expansion. This allows the classic sliding-window pattern:
- Expand right until the condition becomes valid.
- Shrink left as much as possible while keeping the condition valid.
- Record the minimum length encountered.
This gives a linear-time solution.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Checks every subarray |
| Optimal Sliding Window | O(n) | O(n) | Each pointer moves at most n times |
Algorithm Walkthrough
- Create a frequency map that stores how many times each value appears inside the current window.
- Maintain two pointers,
leftandright, representing the current sliding window. - Maintain a variable
distinct_sum, which stores the sum of all distinct values currently present in the window. - Expand the window by moving
rightfrom left to right through the array. - When adding
nums[right]:
- If its frequency is currently zero, this value is entering the window for the first time, so add its value to
distinct_sum. - Increase its frequency.
- After adding the new element, check whether
distinct_sum >= k. - While the condition remains true:
- Update the answer using the current window length.
- Remove the leftmost element from the window.
- Decrease its frequency.
- If its frequency becomes zero, subtract that value from
distinct_sumbecause it is no longer present in the window. - Move
leftforward.
- Continue until all elements have been processed.
- If no valid window was found, return
-1. Otherwise, return the minimum length recorded.
Why it Works
The sliding window always represents a valid contiguous subarray. The frequency map accurately tracks which values are currently present. Therefore, distinct_sum always equals the sum of distinct values inside the window.
Whenever distinct_sum >= k, the current window satisfies the requirement. Shrinking from the left finds the smallest valid window ending at the current right boundary. Since every possible minimal valid window is considered exactly once, the smallest valid length overall is found.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def minLength(self, nums: List[int], k: int) -> int:
freq = defaultdict(int)
left = 0
distinct_sum = 0
answer = float("inf")
for right, value in enumerate(nums):
if freq[value] == 0:
distinct_sum += value
freq[value] += 1
while distinct_sum >= k:
answer = min(answer, right - left + 1)
left_value = nums[left]
freq[left_value] -= 1
if freq[left_value] == 0:
distinct_sum -= left_value
left += 1
return -1 if answer == float("inf") else answer
The implementation follows the sliding-window strategy directly.
The frequency map tracks how many copies of each value exist in the current window. Whenever a value's frequency changes from 0 to 1, that value becomes newly present and is added to distinct_sum.
As the right pointer expands the window, the distinct-value sum is updated incrementally. Once the window satisfies the requirement, the inner loop repeatedly removes elements from the left side. This shrinking process continues until the window would no longer satisfy the condition.
The minimum valid length encountered during this process is stored in answer.
Go Solution
func minLength(nums []int, k int) int {
freq := make(map[int]int)
left := 0
distinctSum := 0
ans := len(nums) + 1
for right, value := range nums {
if freq[value] == 0 {
distinctSum += value
}
freq[value]++
for distinctSum >= k {
length := right - left + 1
if length < ans {
ans = length
}
leftValue := nums[left]
freq[leftValue]--
if freq[leftValue] == 0 {
distinctSum -= leftValue
}
left++
}
}
if ans == len(nums)+1 {
return -1
}
return ans
}
The Go implementation is nearly identical to the Python version. A map[int]int is used for frequencies. The answer is initialized to len(nums) + 1 instead of infinity. Since the maximum possible valid length is len(nums), this sentinel value conveniently indicates that no valid window was found.
The problem constraints guarantee that the distinct-value sum never exceeds 100000 * 100000 = 10^10, which fits comfortably within Go's 64-bit integer environments. Under LeetCode's standard Go runtime, int is sufficient.
Worked Examples
Example 1
Input:
nums = [2,2,3,1]
k = 4
| Step | Right Value | Window | Frequency Map | Distinct Sum | Action |
|---|---|---|---|---|---|
| 1 | 2 | [2] | {2:1} | 2 | Not valid |
| 2 | 2 | [2,2] | {2:2} | 2 | Not valid |
| 3 | 3 | [2,2,3] | {2:2,3:1} | 5 | Valid |
| 4 | Shrink | [2,3] | {2:1,3:1} | 5 | Length = 2 |
| 5 | Shrink | [3] | {3:1} | 3 | Stop shrinking |
| 6 | 1 | [3,1] | {3:1,1:1} | 4 | Valid |
| 7 | Shrink | [1] | {1:1} | 1 | Length = 2 remains |
Answer = 2
Example 2
Input:
nums = [3,2,3,4]
k = 5
| Step | Right Value | Window | Distinct Sum |
|---|---|---|---|
| 1 | 3 | [3] | 3 |
| 2 | 2 | [3,2] | 5 |
| 3 | Valid | [3,2] | 5 |
| 4 | Record length 2 | ||
| 5 | Remove 3 | [2] | 2 |
| 6 | Add 3 | [2,3] | 5 |
| 7 | Record length 2 | ||
| 8 | Add 4 | [3,4] eventually | 7 |
Minimum length remains 2.
Example 3
Input:
nums = [5,5,4]
k = 5
| Step | Right Value | Window | Distinct Sum | Best |
|---|---|---|---|---|
| 1 | 5 | [5] | 5 | 1 |
| 2 | Remove 5 | [] | 0 | 1 |
| 3 | 5 | [5] | 5 | 1 |
| 4 | 4 | [5,4] | 9 | 1 |
Answer = 1.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each element enters and leaves the window at most once |
| Space | O(n) | Frequency map may store all distinct values |
The left and right pointers each move monotonically from left to right. Neither pointer ever moves backward. As a result, every element is processed at most twice, once when entering the window and once when leaving it. This yields linear time complexity. The frequency map stores one entry per distinct value currently present, requiring up to O(n) space in the worst case.
Test Cases
sol = Solution()
assert sol.minLength([2, 2, 3, 1], 4) == 2 # Example 1
assert sol.minLength([3, 2, 3, 4], 5) == 2 # Example 2
assert sol.minLength([5, 5, 4], 5) == 1 # Example 3
assert sol.minLength([1], 1) == 1 # Single element valid
assert sol.minLength([1], 2) == -1 # Single element invalid
assert sol.minLength([2, 2, 2, 2], 2) == 1 # All duplicates
assert sol.minLength([2, 2, 2, 2], 3) == -1 # Distinct sum never reaches target
assert sol.minLength([1, 2, 3], 6) == 3 # Entire array required
assert sol.minLength([1, 2, 3], 7) == -1 # Impossible target
assert sol.minLength([1, 2, 1, 2, 3], 5) == 3 # Duplicates inside window
assert sol.minLength([10, 1, 1, 1], 10) == 1 # Single large value
assert sol.minLength([1, 2, 3, 4, 5], 9) == 2 # Window [4,5]
assert sol.minLength([100000] * 1000, 100000) == 1 # Large repeated values
assert sol.minLength([1, 2, 3, 2, 1, 4], 7) == 2 # Window [3,4]
Test Summary
| Test | Why |
|---|---|
[2,2,3,1], 4 |
Example 1 |
[3,2,3,4], 5 |
Example 2 |
[5,5,4], 5 |
Example 3 |
[1], 1 |
Smallest valid input |
[1], 2 |
Smallest impossible input |
[2,2,2,2], 2 |
Duplicate values only |
[2,2,2,2], 3 |
Distinct sum never sufficient |
[1,2,3], 6 |
Entire array needed |
[1,2,3], 7 |
No solution exists |
[1,2,1,2,3], 5 |
Repeated values inside window |
[10,1,1,1], 10 |
Single element answer |
[1,2,3,4,5], 9 |
Minimal window appears near end |
[100000]*1000, 100000 |
Large duplicate stress test |
[1,2,3,2,1,4], 7 |
Multiple shrinking operations |
Edge Cases
A Single Element Already Meets the Target
A common mistake is assuming the answer must involve multiple elements. For example, nums = [5,5,4] and k = 5. The first element alone satisfies the requirement. The sliding window immediately detects this because the distinct-value sum becomes 5 after adding the first element, and the answer is updated to 1.
All Elements Are Duplicates
Consider nums = [2,2,2,2] and k = 3. A naive implementation that sums all elements would incorrectly conclude that many subarrays satisfy the condition. However, the distinct-value sum is always just 2, because only one distinct value exists. The frequency map ensures duplicates are counted exactly once, producing the correct answer of -1.
No Valid Subarray Exists
Suppose nums = [1,2,3] and k = 7. The maximum possible distinct-value sum is 1 + 2 + 3 = 6, which is still below the target. The algorithm never enters the shrinking loop because the condition is never satisfied. The answer remains at its sentinel value, causing the function to correctly return -1.
Values Leaving the Window Completely
Consider a window containing [2,2,3]. When the leftmost 2 is removed, the distinct-value sum should remain unchanged because another 2 is still present. Only when the final occurrence of 2 leaves the window should 2 be subtracted from the distinct-value sum. The frequency map precisely tracks this transition, preventing incorrect updates to the distinct-value sum.