LeetCode 3636 - Threshold Majority Queries
The problem asks us to process multiple queries on a given array nums where each query specifies a subarray defined by [li, ri] and a threshold. For each query, we need to find an element in the subarray that appears at least threshold times.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Binary Search, Divide and Conquer, Counting, Prefix Sum
Solution
Problem Understanding
The problem asks us to process multiple queries on a given array nums where each query specifies a subarray defined by [li, ri] and a threshold. For each query, we need to find an element in the subarray that appears at least threshold times. If multiple elements meet this criterion, we select the one with the highest frequency, and if there is a tie in frequency, we choose the smallest value. If no element meets the threshold, we return -1 for that query.
The input array can have up to 10,000 elements, and there can be up to 50,000 queries. This scale implies that a naive solution that counts elements for each query separately would be too slow. Each query could be as large as the entire array, and the element values can be large (up to 10^9), so direct frequency counting without careful design may be inefficient. The constraints guarantee valid query ranges and threshold values, but the wide range of element values and the number of queries are the main challenge.
Important edge cases include subarrays where no element meets the threshold, subarrays where multiple elements meet the threshold with equal frequencies, and queries on single-element subarrays.
Approaches
A brute-force approach would iterate over each query, slice the subarray, count frequencies using a hash map, and then select the element that meets the threshold with the highest frequency. This approach is straightforward and correct but inefficient. Specifically, slicing and counting for each query results in O(Q * N) time complexity in the worst case (where Q is the number of queries and N is the array length). Given the constraints (N ≤ 10^4, Q ≤ 5*10^4), this approach is too slow.
The optimal solution uses a combination of position mapping and binary search for efficient frequency counting within any subarray. By preprocessing the positions of each element in nums, we can quickly determine how many times an element appears within any query range using two binary searches. This avoids iterating through the subarray repeatedly. Additionally, since only elements that appear frequently in the entire array can satisfy high thresholds, we can optimize by checking only candidates that occur sufficiently often globally.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(Q * N) | O(N) | Count each subarray separately |
| Optimal | O(Q * log N * K) | O(N + K) | Preprocess positions, binary search for counts in subarrays, K is number of candidate elements per query |
Algorithm Walkthrough
- Preprocess the array by building a dictionary mapping each unique element to a sorted list of its indices in
nums. This allows fast lookup of an element's occurrences within any subarray. - For each query
[li, ri, threshold], iterate over the unique elements innums(or maintain a candidate list of elements that appear often enough to meet the threshold). - For each candidate element, use binary search on its list of indices to count how many times it appears between indices
liandri. The count is determined by finding the first index not less thanliand the first index greater thanri, then subtracting the positions. - Keep track of the element that meets the threshold and has the highest frequency. In case of a tie in frequency, select the smaller element.
- If no element satisfies the threshold, append -1 to the results. Otherwise, append the chosen element.
Why it works: Preprocessing positions guarantees O(log N) access to frequency counts in any subarray. By considering only elements that have enough occurrences in the subarray, we ensure correctness and efficiency. Using binary search avoids scanning the subarray explicitly.
Python Solution
from typing import List
from collections import defaultdict
import bisect
class Solution:
def subarrayMajority(self, nums: List[int], queries: List[List[int]]) -> List[int]:
positions = defaultdict(list)
for idx, num in enumerate(nums):
positions[num].append(idx)
result = []
for li, ri, threshold in queries:
candidate = -1
max_freq = 0
for num, pos_list in positions.items():
left = bisect.bisect_left(pos_list, li)
right = bisect.bisect_right(pos_list, ri)
freq = right - left
if freq >= threshold:
if freq > max_freq or (freq == max_freq and num < candidate):
max_freq = freq
candidate = num
result.append(candidate)
return result
The implementation first constructs a dictionary of positions for each element. For each query, we calculate the frequency of each element in the subarray using binary search. We track the element with the highest frequency that meets the threshold and handle ties by choosing the smaller element.
Go Solution
package main
import (
"sort"
)
func subarrayMajority(nums []int, queries [][]int) []int {
positions := make(map[int][]int)
for i, num := range nums {
positions[num] = append(positions[num], i)
}
result := make([]int, 0, len(queries))
for _, q := range queries {
li, ri, threshold := q[0], q[1], q[2]
candidate := -1
maxFreq := 0
for num, posList := range positions {
left := sort.Search(len(posList), func(i int) bool { return posList[i] >= li })
right := sort.Search(len(posList), func(i int) bool { return posList[i] > ri })
freq := right - left
if freq >= threshold {
if freq > maxFreq || (freq == maxFreq && num < candidate) {
maxFreq = freq
candidate = num
}
}
}
result = append(result, candidate)
}
return result
}
The Go version mirrors the Python implementation, using sort.Search for binary searches and a map to store positions. Care is taken to append results efficiently.
Worked Examples
Example 1
nums = [1,1,2,2,1,1], queries = [[0,5,4],[0,3,3],[2,3,2]]
| Query | Candidate Frequencies | Result |
|---|---|---|
| [0,5,4] | 1 → 4, 2 → 2 | 1 |
| [0,3,3] | 1 → 2, 2 → 2 | -1 |
| [2,3,2] | 2 → 2 | 2 |
Example 2
nums = [3,2,3,2,3,2,3], queries = [[0,6,4],[1,5,2],[2,4,1],[3,3,1]]
| Query | Candidate Frequencies | Result |
|---|---|---|
| [0,6,4] | 3 → 4, 2 → 3 | 3 |
| [1,5,2] | 2 → 3, 3 → 2 | 2 |
| [2,4,1] | 3 → 2, 2 → 1 | 3 |
| [3,3,1] | 2 → 1 | 2 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(Q * K * log N) | Q queries, K candidate elements, log N for binary search per element |
| Space | O(N + K) | Store positions of elements and output array |
The complexity is reasonable given constraints. The number of candidate elements K is often much smaller than N, especially when thresholds are high.
Test Cases
# Basic provided examples
assert Solution().subarrayMajority([1,1,2,2,1,1], [[0,5,4],[0,3,3],[2,3,2]]) == [1,-1,2]
assert Solution().subarrayMajority([3,2,3,2,3,2,3], [[0,6,4],[1,5,2],[2,4,1],[3,3,1]]) == [3,2,3,2]
# Edge case: single element array
assert Solution().subarrayMajority([1], [[0,0,1]]) == [1]
# Edge case: no element meets threshold
assert Solution().subarrayMajority([1,2,3], [[0,2,4]]) == [-1]
# Edge case: tie in frequency, smallest element selected
assert Solution().subarrayMajority([1,2,2,1], [[0,3,2]]) == [1]
# Large values
assert Solution().subarrayMajority([10**9, 10**9, 1], [[0,2,2]]) == [1000000000]
| Test | Why |
|---|---|
| Provided examples | Validate main functionality |
| Single element | Handles minimal array size |
| No element meets threshold | Validates correct -1 output |
| Tie in frequency | Ensures smallest element selected |
| Large values | Confirms correctness with max integer values |
Edge Cases
A key edge case is a query on a single-element subarray. Here, the threshold can only be satisfied by that element itself. Another edge case is when no element meets the threshold; the implementation correctly returns -1. A third