LeetCode 2080 - Range Frequency Queries

LeetCode Problem 2080

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Binary Search, Design, Segment Tree

Solution

Problem Understanding

The problem asks us to design a data structure that can efficiently answer frequency queries on subarrays.

We are given an integer array arr, and we need to support repeated queries of the form:

Given left, right, and value
Return how many times value appears in the subarray arr[left...right]

The important detail is that the data structure is built once, but queried many times. The constraints allow up to 10^5 elements in the array and up to 10^5 queries, which immediately tells us that a naive per-query scan is too slow.

For example, suppose:

arr = [12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56]

If we query:

query(1, 2, 4)

we examine the subarray:

[33, 4]

The value 4 appears once, so the answer is 1.

The constraints reveal several important properties:

The array is static, meaning it never changes after construction.
We only need to answer read-only frequency queries.
Query performance matters much more than update performance because there are no updates.
Values are bounded by 10^4, but indices go up to 10^5.

These observations strongly suggest preprocessing the array so queries become fast.

Several edge cases are important:

The queried value may not exist anywhere in the array.
The queried range may contain exactly one element.
All elements may be identical.
The queried value may occur many times outside the query range but not inside it.
The query range may span the entire array.

A correct solution must handle all of these efficiently.

Approaches

Brute Force Approach

The most straightforward solution is to scan the subarray for every query.

For each query:

Iterate from index left to index right
Count how many elements equal value
Return the count

This approach is correct because it directly computes the frequency by examining every relevant element.

However, it is far too slow for the given constraints.

If the array length is 10^5 and we also have 10^5 queries, then in the worst case each query scans nearly the entire array.

That leads to:

O(n * q) = O(10^10)

operations, which is not feasible.

Optimal Approach

The key observation is that we do not need to repeatedly scan the array.

Instead, we can preprocess the array and store, for each distinct value, the sorted list of indices where that value appears.

For example:

arr = [12, 33, 4, 56, 22, 2, 34, 33]

We can build:

12 -> [0]
33 -> [1, 7]
4  -> [2]
56 -> [3]
22 -> [4]
2  -> [5]
34 -> [6]

Now consider the query:

query(1, 7, 33)

We already know all positions of 33:

[1, 7]

We simply need to count how many indices fall within [1, 7].

Because the index lists are sorted, we can use binary search:

Find the first index >= left
Find the first index > right
Their difference gives the count

This transforms each query from linear time into logarithmic time.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n) per query	O(1)	Scans the entire query range every time
Optimal	O(log k) per query	O(n)	Stores sorted index lists and uses binary search

Here, k is the number of occurrences of the queried value.

Algorithm Walkthrough

Step 1: Build a Mapping from Values to Indices

During initialization, iterate through the array.

For each element:

Use the value as a key in a hash map
Append the current index to that value's list

For example:

arr = [1, 3, 1, 2, 1]

Produces:

1 -> [0, 2, 4]
3 -> [1]
2 -> [3]

The lists are automatically sorted because we process the array from left to right.

Step 2: Process a Query

Suppose we receive:

query(left, right, value)

First, check whether value exists in the map.

If it does not exist, return 0.

Otherwise, retrieve the sorted index list for that value.

Example:

value = 1
indices = [0, 2, 4]

Step 3: Find the Left Boundary

Use binary search to locate the first position in indices where the index is greater than or equal to left.

This gives the beginning of the valid range.

Step 4: Find the Right Boundary

Use another binary search to locate the first position in indices where the index is greater than right.

This gives the position immediately after the valid range.

Step 5: Compute the Frequency

The number of valid occurrences is:

right_position - left_position

Because all indices between those two positions lie inside the query range.

Why it works

For every value, we store all occurrence indices in sorted order.

Binary search correctly identifies:

The first occurrence inside the query range
The first occurrence after the query range

Every index between those two boundaries must belong to the interval [left, right], and every valid occurrence is included exactly once.

Therefore, the difference between the two binary search positions equals the exact frequency.

Python Solution

from collections import defaultdict
from bisect import bisect_left, bisect_right
from typing import List

class RangeFreqQuery:

    def __init__(self, arr: List[int]):
        self.positions = defaultdict(list)

        for index, value in enumerate(arr):
            self.positions[value].append(index)

    def query(self, left: int, right: int, value: int) -> int:
        if value not in self.positions:
            return 0

        indices = self.positions[value]

        left_pos = bisect_left(indices, left)
        right_pos = bisect_right(indices, right)

        return right_pos - left_pos

# Your RangeFreqQuery object will be instantiated and called as such:
# obj = RangeFreqQuery(arr)
# param_1 = obj.query(left,right,value)

The implementation begins by constructing a hash map named positions.

Each key is a value from the array, and its corresponding value is a list of all indices where that number appears.

Because we iterate through the array in increasing index order, every list remains sorted automatically.

During queries, we first check whether the requested value exists in the map. If it does not, the frequency is immediately 0.

Otherwise, we retrieve the sorted index list and use:

bisect_left to find the first index greater than or equal to left
bisect_right to find the first index greater than right

The difference between these positions gives the number of occurrences inside the interval.

This works because binary search efficiently counts how many sorted indices fall within a range.

Go Solution

package main

import "sort"

type RangeFreqQuery struct {
	positions map[int][]int
}

func Constructor(arr []int) RangeFreqQuery {
	rfq := RangeFreqQuery{
		positions: make(map[int][]int),
	}

	for index, value := range arr {
		rfq.positions[value] = append(rfq.positions[value], index)
	}

	return rfq
}

func (this *RangeFreqQuery) Query(left int, right int, value int) int {
	indices, exists := this.positions[value]

	if !exists {
		return 0
	}

	leftPos := sort.SearchInts(indices, left)
	rightPos := sort.Search(len(indices), func(i int) bool {
		return indices[i] > right
	})

	return rightPos - leftPos
}

/**
 * Your RangeFreqQuery object will be instantiated and called as such:
 * obj := Constructor(arr);
 * param_1 := obj.Query(left,right,value);
 */

The Go solution follows the same algorithmic structure as the Python version.

Instead of Python's bisect module, Go uses:

sort.SearchInts for the lower bound
sort.Search with a custom condition for the upper bound

Go maps naturally support slices as values, making it easy to store occurrence indices.

Because the array indices are appended in order, the slices remain sorted automatically, which allows binary search to work correctly.

Worked Examples

Example 1

arr = [12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56]

Preprocessing Phase

Index	Value	Map State
0	12	12 -> [0]
1	33	33 -> [1]
2	4	4 -> [2]
3	56	56 -> [3]
4	22	22 -> [4]
5	2	2 -> [5]
6	34	34 -> [6]
7	33	33 -> [1, 7]
8	22	22 -> [4, 8]
9	12	12 -> [0, 9]
10	34	34 -> [6, 10]
11	56	56 -> [3, 11]

Final map:

12 -> [0, 9]
33 -> [1, 7]
4  -> [2]
56 -> [3, 11]
22 -> [4, 8]
2  -> [5]
34 -> [6, 10]

Query 1

query(1, 2, 4)

Retrieve:

indices = [2]

Binary searches:

Operation	Result
first index >= 1	0
first index > 2	1

Frequency:

1 - 0 = 1

Answer:

Query 2

query(0, 11, 33)

Retrieve:

indices = [1, 7]

Binary searches:

Operation	Result
first index >= 0	0
first index > 11	2

Frequency:

2 - 0 = 2

Answer:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(log k) per query	Binary searches on occurrence list
Space	O(n)	Every array index is stored once

The preprocessing phase takes O(n) time because we scan the array once.

Each query performs two binary searches on the occurrence list of the queried value.

If the value appears k times, binary search takes O(log k) time.

The total space complexity is O(n) because every array index is stored exactly once across all lists.

Test Cases

from collections import defaultdict
from bisect import bisect_left, bisect_right

class RangeFreqQuery:

    def __init__(self, arr):
        self.positions = defaultdict(list)

        for index, value in enumerate(arr):
            self.positions[value].append(index)

    def query(self, left, right, value):
        if value not in self.positions:
            return 0

        indices = self.positions[value]

        return (
            bisect_right(indices, right)
            - bisect_left(indices, left)
        )

# Provided example
rfq = RangeFreqQuery([12, 33, 4, 56, 22, 2, 34, 33, 22, 12, 34, 56])
assert rfq.query(1, 2, 4) == 1  # single occurrence in small range
assert rfq.query(0, 11, 33) == 2  # multiple occurrences in full range

# Single element array
rfq = RangeFreqQuery([5])
assert rfq.query(0, 0, 5) == 1  # exact match
assert rfq.query(0, 0, 1) == 0  # missing value

# All elements identical
rfq = RangeFreqQuery([7, 7, 7, 7, 7])
assert rfq.query(0, 4, 7) == 5  # entire array
assert rfq.query(1, 3, 7) == 3  # middle subarray

# Value exists outside range but not inside
rfq = RangeFreqQuery([1, 2, 3, 1, 2, 3])
assert rfq.query(1, 2, 1) == 0  # occurrences outside interval only

# Query at boundaries
rfq = RangeFreqQuery([4, 5, 4, 5, 4])
assert rfq.query(0, 0, 4) == 1  # left boundary
assert rfq.query(4, 4, 4) == 1  # right boundary

# Large repeated blocks
rfq = RangeFreqQuery([1] * 1000 + [2] * 1000)
assert rfq.query(0, 999, 1) == 1000  # first block
assert rfq.query(1000, 1999, 2) == 1000  # second block
assert rfq.query(500, 1500, 2) == 501  # partial overlap

# Missing value entirely
rfq = RangeFreqQuery([1, 2, 3])
assert rfq.query(0, 2, 10) == 0  # value absent from array

Test	Why
Provided example	Validates standard functionality
Single element array	Tests smallest valid input
All elements identical	Tests large frequency counts
Value outside range	Ensures range filtering works correctly
Boundary queries	Verifies inclusive index handling
Large repeated blocks	Tests performance-oriented scenarios
Missing value entirely	Confirms correct handling of absent keys

Edge Cases

Querying a Value That Does Not Exist

A common source of bugs is attempting to binary search an index list for a value that never appears in the array.

For example:

arr = [1, 2, 3]
query(0, 2, 10)

If the implementation blindly accesses the map, it may cause errors or incorrect behavior.

The solution handles this safely by checking:

if value not in self.positions:
    return 0

This guarantees correctness for absent values.

Single Element Ranges

Ranges where left == right can expose off-by-one mistakes.

Example:

arr = [5]
query(0, 0, 5)

The correct answer is 1.

Using bisect_left and bisect_right carefully preserves inclusive boundaries, ensuring that a single valid index is counted correctly.

Values Appearing Outside but Not Inside the Range

Another subtle case occurs when the value exists in the array but not inside the requested interval.

Example:

arr = [1, 2, 3, 1]
query(1, 2, 1)

The value 1 exists globally but not inside [1, 2].

The binary searches correctly isolate only indices within the requested range, producing 0 rather than counting all occurrences globally.