LeetCode 710 - Random Pick with Blacklist
The problem asks us to design a data structure that can repeatedly return a random integer from the range [0, n - 1], while excluding all integers that appear in a blacklist. Every valid number must have exactly the same probability of being chosen.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Math, Binary Search, Sorting, Randomized
Solution
Problem Understanding
The problem asks us to design a data structure that can repeatedly return a random integer from the range [0, n - 1], while excluding all integers that appear in a blacklist. Every valid number must have exactly the same probability of being chosen.
The challenge is not simply filtering numbers. The key requirement is efficiency. The value of n can be as large as 10^9, which means we cannot explicitly build or store the entire range of valid numbers. At the same time, the blacklist can contain up to 10^5 values, which is relatively manageable.
A naive implementation might repeatedly generate random numbers until it finds one that is not blacklisted. While this works conceptually, it becomes inefficient when the blacklist is large. Imagine a case where almost every number is blacklisted. The algorithm could spend many retries before finding a valid value.
The problem specifically mentions minimizing calls to the built-in random function. This strongly suggests that the ideal solution should require exactly one random call per pick() operation.
The input consists of:
n, the upper bound of the rangeblacklist, a list of unique integers that cannot be returned
The expected output of pick() is a uniformly random valid number.
The constraints tell us several important things:
nis extremely large, so array-based solutions over the entire range are impossibleblacklist.lengthis much smaller thann- Up to
2 * 10^4calls topick()may occur, so each call should ideally beO(1)
There are several edge cases that matter:
- The blacklist may be empty
- The blacklist may contain almost all numbers
- Blacklisted numbers may cluster at the beginning or end of the range
nmay be extremely large even though the blacklist is small
The problem guarantees that blacklist values are unique and within bounds, which simplifies preprocessing.
Approaches
Brute Force Approach
The most straightforward idea is to repeatedly generate random integers in the range [0, n - 1] until we find one that is not blacklisted.
To support fast blacklist lookup, we can store all blacklisted numbers in a hash set. During pick(), we generate a random number and check whether it exists in the set. If it does, we try again.
This approach is correct because every generated number is uniformly random, and we only reject invalid choices. Eventually, we return a valid number.
However, this solution can become extremely inefficient when the blacklist is large. Suppose only one valid number exists. In that case, almost every random call fails, leading to many retries. The expected runtime becomes poor, and the number of random calls becomes excessive.
Optimal Approach
The key insight is that we do not actually need to generate numbers from the entire range [0, n - 1].
Suppose the blacklist contains b numbers. Then there are exactly n - b valid numbers. If we could somehow map the valid numbers into a compact range, we could generate random values only from that smaller space.
Define:
size = n - len(blacklist)
We generate random integers only in [0, size - 1].
Now consider two cases:
- If a number in this range is not blacklisted, we can return it directly.
- If it is blacklisted, we remap it to some valid number in the upper part of the range.
The upper range [size, n - 1] contains enough unused valid numbers to replace all blacklisted values that appear in the lower range.
This transforms the problem into a hash map remapping problem.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | pick(): O(expected retries) |
O(b) |
Repeatedly retries until a valid number is found |
| Optimal | O(b) preprocessing, O(1) pick |
O(b) |
Uses remapping to guarantee one random call |
Algorithm Walkthrough
Step 1: Compute the Valid Range Size
We first compute:
size = n - len(blacklist)
This tells us how many valid numbers exist overall.
Instead of generating numbers from [0, n - 1], we will only generate numbers from [0, size - 1].
This guarantees exactly one random call per pick().
Step 2: Store Blacklisted Numbers in a Set
We place all blacklist values into a hash set.
This allows constant-time lookup when determining whether a number is blacklisted.
Step 3: Identify Blacklisted Numbers in the Lower Range
Some blacklisted numbers may already lie in the upper range [size, n - 1].
Those do not matter because we never randomly generate them.
The only problematic values are blacklisted numbers less than size, because those may be randomly selected.
These values must be remapped.
Step 4: Find Replacement Values
We scan the upper portion of the range, starting from size.
Whenever we encounter a number that is not blacklisted, we use it as a replacement.
For every blacklisted value b < size, we create:
mapping[b] = replacement_value
This guarantees that every invalid lower-range value maps to a valid upper-range value.
Step 5: Generate Random Numbers
During pick():
- Generate a random integer
xin[0, size - 1] - If
xis remapped, returnmapping[x] - Otherwise return
x
This produces a perfectly uniform distribution over all valid numbers.
Why it works
The algorithm works because the lower range [0, size - 1] contains exactly as many positions as there are valid numbers overall.
Every blacklisted value in this lower range is replaced by a unique valid value from the upper range. Therefore:
- Every random index corresponds to exactly one valid output
- No valid number appears more than once
- Every valid number has equal probability
This creates a one-to-one correspondence between random choices and valid outputs.
Python Solution
import random
from typing import List
class Solution:
def __init__(self, n: int, blacklist: List[int]):
self.size = n - len(blacklist)
blacklist_set = set(blacklist)
self.mapping = {}
candidate = self.size
for blocked in blacklist:
if blocked < self.size:
while candidate in blacklist_set:
candidate += 1
self.mapping[blocked] = candidate
candidate += 1
def pick(self) -> int:
value = random.randint(0, self.size - 1)
if value in self.mapping:
return self.mapping[value]
return value
# Your Solution object will be instantiated and called as such:
# obj = Solution(n, blacklist)
# param_1 = obj.pick()
The implementation begins by computing the number of valid integers, stored in self.size. This value determines the range used during random generation.
The blacklist is converted into a set for efficient membership testing. This is important because we repeatedly check whether candidate replacement values are blacklisted.
The mapping dictionary stores remappings for blacklisted numbers in the lower range. Only numbers smaller than self.size require remapping because those are the only numbers that can be randomly selected.
The variable candidate scans the upper range. Whenever a non-blacklisted number is found, it becomes the replacement target for a lower-range blacklisted value.
The pick() method performs exactly one random call. If the generated number exists in the remapping table, the mapped value is returned. Otherwise, the number itself is already valid.
Go Solution
package main
import (
"math/rand"
)
type Solution struct {
size int
mapping map[int]int
}
func Constructor(n int, blacklist []int) Solution {
size := n - len(blacklist)
blackSet := make(map[int]bool)
for _, num := range blacklist {
blackSet[num] = true
}
mapping := make(map[int]int)
candidate := size
for _, blocked := range blacklist {
if blocked < size {
for blackSet[candidate] {
candidate++
}
mapping[blocked] = candidate
candidate++
}
}
return Solution{
size: size,
mapping: mapping,
}
}
func (this *Solution) Pick() int {
value := rand.Intn(this.size)
if mapped, exists := this.mapping[value]; exists {
return mapped
}
return value
}
/**
* Your Solution object will be instantiated and called as such:
* obj := Constructor(n, blacklist);
* param_1 := obj.Pick();
*/
The Go implementation follows the same logic as the Python version, but uses Go maps instead of Python dictionaries and sets.
Go does not have a built-in set type, so map[int]bool is used to represent the blacklist set.
Random generation uses rand.Intn(this.size), which produces integers in [0, size).
Because n is at most 10^9, standard Go int values are sufficient on LeetCode environments.
Worked Examples
Example 1
Input:
n = 7
blacklist = [2, 3, 5]
Step 1: Compute Size
size = 7 - 3 = 4
We only generate random numbers in:
[0, 3]
Step 2: Blacklist Set
| Value |
|---|
| 2 |
| 3 |
| 5 |
Step 3: Identify Problematic Lower Values
The lower range is [0, 3].
Blacklisted values in this range:
| Value | Needs Mapping |
|---|---|
| 2 | Yes |
| 3 | Yes |
Step 4: Find Replacement Values
Upper range starts at 4.
| Candidate | Blacklisted? | Used? |
|---|---|---|
| 4 | No | Map to 2 |
| 5 | Yes | Skip |
| 6 | No | Map to 3 |
Final mapping:
| Blacklisted Lower Value | Replacement |
|---|---|
| 2 | 4 |
| 3 | 6 |
Step 5: Random Picks
| Random Value | Returned Value |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 4 |
| 3 | 6 |
Every valid number {0,1,4,6} appears exactly once.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(b) preprocessing, O(1) pick |
Each blacklist value is processed once |
| Space | O(b) |
Hash set and remapping table |
The preprocessing phase iterates through blacklist values and potentially scans upper-range candidates. However, each candidate is visited at most once, so the total preprocessing remains linear in the blacklist size.
The pick() operation performs one random generation and at most one hash map lookup, giving constant-time performance.
Test Cases
import collections
# Example case
obj = Solution(7, [2, 3, 5])
for _ in range(100):
assert obj.pick() in {0, 1, 4, 6} # example validity
# Empty blacklist
obj = Solution(5, [])
for _ in range(100):
assert obj.pick() in {0, 1, 2, 3, 4} # all values allowed
# Single valid number
obj = Solution(4, [0, 1, 2])
for _ in range(20):
assert obj.pick() == 3 # only one possible answer
# Blacklist only upper values
obj = Solution(6, [4, 5])
for _ in range(100):
assert obj.pick() in {0, 1, 2, 3} # no remapping needed
# Blacklist only lower values
obj = Solution(6, [0, 1])
for _ in range(100):
assert obj.pick() in {2, 3, 4, 5} # remapping required
# Large n with small blacklist
obj = Solution(10**9, [5, 10, 15])
for _ in range(100):
value = obj.pick()
assert value not in {5, 10, 15} # large range support
# Distribution sanity check
obj = Solution(4, [2])
counter = collections.Counter()
for _ in range(6000):
counter[obj.pick()] += 1
for value in [0, 1, 3]:
assert abs(counter[value] - 2000) < 500 # roughly uniform
| Test | Why |
|---|---|
| Example input | Verifies correctness against problem statement |
| Empty blacklist | Ensures all numbers remain selectable |
| Single valid number | Tests extreme blacklist density |
| Upper-range blacklist only | Verifies unnecessary remapping is avoided |
| Lower-range blacklist only | Verifies remapping logic |
Large n |
Ensures scalability without huge memory use |
| Distribution sanity check | Confirms approximate uniform randomness |
Edge Cases
One important edge case occurs when the blacklist is empty. In this situation, every number in [0, n - 1] is valid. A buggy implementation might still attempt remapping logic unnecessarily. The current implementation handles this naturally because the mapping table remains empty, and every random value is returned directly.
Another critical case occurs when almost every number is blacklisted. For example, n = 4 and blacklist = [0,1,2]. A rejection-sampling approach would become extremely inefficient because most random values fail. The remapping strategy avoids this entirely by reducing the random range to size 1, guaranteeing immediate success.
A third subtle edge case occurs when blacklisted values already lie entirely in the upper range. For example, n = 6 and blacklist = [4,5]. Since random generation only happens in [0,3], these blacklisted values never appear anyway. The implementation correctly skips remapping for such values, avoiding unnecessary work.
Another potential source of bugs is duplicate mappings. If replacement values are not carefully selected, two blacklisted lower-range numbers could map to the same valid value, breaking uniformity. The implementation avoids this by advancing the candidate pointer only forward and never reusing a replacement value.
Finally, very large values of n could tempt developers into building enormous arrays. Since n may reach 10^9, such solutions would fail memory constraints immediately. This implementation stores only blacklist-related information, keeping memory proportional to the blacklist size rather than the full range.