LeetCode 710 - Random Pick with Blacklist

The problem asks us to design a data structure that can repeatedly return a random integer from the range [0, n - 1], while excluding all integers that appear in a blacklist. Every valid number must have exactly the same probability of being chosen.

LeetCode Problem 710

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Math, Binary Search, Sorting, Randomized

Solution

Problem Understanding

The problem asks us to design a data structure that can repeatedly return a random integer from the range [0, n - 1], while excluding all integers that appear in a blacklist. Every valid number must have exactly the same probability of being chosen.

The challenge is not simply filtering numbers. The key requirement is efficiency. The value of n can be as large as 10^9, which means we cannot explicitly build or store the entire range of valid numbers. At the same time, the blacklist can contain up to 10^5 values, which is relatively manageable.

A naive implementation might repeatedly generate random numbers until it finds one that is not blacklisted. While this works conceptually, it becomes inefficient when the blacklist is large. Imagine a case where almost every number is blacklisted. The algorithm could spend many retries before finding a valid value.

The problem specifically mentions minimizing calls to the built-in random function. This strongly suggests that the ideal solution should require exactly one random call per pick() operation.

The input consists of:

  • n, the upper bound of the range
  • blacklist, a list of unique integers that cannot be returned

The expected output of pick() is a uniformly random valid number.

The constraints tell us several important things:

  • n is extremely large, so array-based solutions over the entire range are impossible
  • blacklist.length is much smaller than n
  • Up to 2 * 10^4 calls to pick() may occur, so each call should ideally be O(1)

There are several edge cases that matter:

  • The blacklist may be empty
  • The blacklist may contain almost all numbers
  • Blacklisted numbers may cluster at the beginning or end of the range
  • n may be extremely large even though the blacklist is small

The problem guarantees that blacklist values are unique and within bounds, which simplifies preprocessing.

Approaches

Brute Force Approach

The most straightforward idea is to repeatedly generate random integers in the range [0, n - 1] until we find one that is not blacklisted.

To support fast blacklist lookup, we can store all blacklisted numbers in a hash set. During pick(), we generate a random number and check whether it exists in the set. If it does, we try again.

This approach is correct because every generated number is uniformly random, and we only reject invalid choices. Eventually, we return a valid number.

However, this solution can become extremely inefficient when the blacklist is large. Suppose only one valid number exists. In that case, almost every random call fails, leading to many retries. The expected runtime becomes poor, and the number of random calls becomes excessive.

Optimal Approach

The key insight is that we do not actually need to generate numbers from the entire range [0, n - 1].

Suppose the blacklist contains b numbers. Then there are exactly n - b valid numbers. If we could somehow map the valid numbers into a compact range, we could generate random values only from that smaller space.

Define:

size = n - len(blacklist)

We generate random integers only in [0, size - 1].

Now consider two cases:

  • If a number in this range is not blacklisted, we can return it directly.
  • If it is blacklisted, we remap it to some valid number in the upper part of the range.

The upper range [size, n - 1] contains enough unused valid numbers to replace all blacklisted values that appear in the lower range.

This transforms the problem into a hash map remapping problem.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force pick(): O(expected retries) O(b) Repeatedly retries until a valid number is found
Optimal O(b) preprocessing, O(1) pick O(b) Uses remapping to guarantee one random call

Algorithm Walkthrough

Step 1: Compute the Valid Range Size

We first compute:

size = n - len(blacklist)

This tells us how many valid numbers exist overall.

Instead of generating numbers from [0, n - 1], we will only generate numbers from [0, size - 1].

This guarantees exactly one random call per pick().

Step 2: Store Blacklisted Numbers in a Set

We place all blacklist values into a hash set.

This allows constant-time lookup when determining whether a number is blacklisted.

Step 3: Identify Blacklisted Numbers in the Lower Range

Some blacklisted numbers may already lie in the upper range [size, n - 1].

Those do not matter because we never randomly generate them.

The only problematic values are blacklisted numbers less than size, because those may be randomly selected.

These values must be remapped.

Step 4: Find Replacement Values

We scan the upper portion of the range, starting from size.

Whenever we encounter a number that is not blacklisted, we use it as a replacement.

For every blacklisted value b < size, we create:

mapping[b] = replacement_value

This guarantees that every invalid lower-range value maps to a valid upper-range value.

Step 5: Generate Random Numbers

During pick():

  1. Generate a random integer x in [0, size - 1]
  2. If x is remapped, return mapping[x]
  3. Otherwise return x

This produces a perfectly uniform distribution over all valid numbers.

Why it works

The algorithm works because the lower range [0, size - 1] contains exactly as many positions as there are valid numbers overall.

Every blacklisted value in this lower range is replaced by a unique valid value from the upper range. Therefore:

  • Every random index corresponds to exactly one valid output
  • No valid number appears more than once
  • Every valid number has equal probability

This creates a one-to-one correspondence between random choices and valid outputs.

Python Solution

import random
from typing import List

class Solution:

    def __init__(self, n: int, blacklist: List[int]):
        self.size = n - len(blacklist)
        blacklist_set = set(blacklist)

        self.mapping = {}

        candidate = self.size

        for blocked in blacklist:
            if blocked < self.size:
                while candidate in blacklist_set:
                    candidate += 1

                self.mapping[blocked] = candidate
                candidate += 1

    def pick(self) -> int:
        value = random.randint(0, self.size - 1)

        if value in self.mapping:
            return self.mapping[value]

        return value

# Your Solution object will be instantiated and called as such:
# obj = Solution(n, blacklist)
# param_1 = obj.pick()

The implementation begins by computing the number of valid integers, stored in self.size. This value determines the range used during random generation.

The blacklist is converted into a set for efficient membership testing. This is important because we repeatedly check whether candidate replacement values are blacklisted.

The mapping dictionary stores remappings for blacklisted numbers in the lower range. Only numbers smaller than self.size require remapping because those are the only numbers that can be randomly selected.

The variable candidate scans the upper range. Whenever a non-blacklisted number is found, it becomes the replacement target for a lower-range blacklisted value.

The pick() method performs exactly one random call. If the generated number exists in the remapping table, the mapped value is returned. Otherwise, the number itself is already valid.

Go Solution

package main

import (
	"math/rand"
)

type Solution struct {
	size    int
	mapping map[int]int
}

func Constructor(n int, blacklist []int) Solution {
	size := n - len(blacklist)

	blackSet := make(map[int]bool)
	for _, num := range blacklist {
		blackSet[num] = true
	}

	mapping := make(map[int]int)

	candidate := size

	for _, blocked := range blacklist {
		if blocked < size {
			for blackSet[candidate] {
				candidate++
			}

			mapping[blocked] = candidate
			candidate++
		}
	}

	return Solution{
		size:    size,
		mapping: mapping,
	}
}

func (this *Solution) Pick() int {
	value := rand.Intn(this.size)

	if mapped, exists := this.mapping[value]; exists {
		return mapped
	}

	return value
}

/**
 * Your Solution object will be instantiated and called as such:
 * obj := Constructor(n, blacklist);
 * param_1 := obj.Pick();
 */

The Go implementation follows the same logic as the Python version, but uses Go maps instead of Python dictionaries and sets.

Go does not have a built-in set type, so map[int]bool is used to represent the blacklist set.

Random generation uses rand.Intn(this.size), which produces integers in [0, size).

Because n is at most 10^9, standard Go int values are sufficient on LeetCode environments.

Worked Examples

Example 1

Input:

n = 7
blacklist = [2, 3, 5]

Step 1: Compute Size

size = 7 - 3 = 4

We only generate random numbers in:

[0, 3]

Step 2: Blacklist Set

Value
2
3
5

Step 3: Identify Problematic Lower Values

The lower range is [0, 3].

Blacklisted values in this range:

Value Needs Mapping
2 Yes
3 Yes

Step 4: Find Replacement Values

Upper range starts at 4.

Candidate Blacklisted? Used?
4 No Map to 2
5 Yes Skip
6 No Map to 3

Final mapping:

Blacklisted Lower Value Replacement
2 4
3 6

Step 5: Random Picks

Random Value Returned Value
0 0
1 1
2 4
3 6

Every valid number {0,1,4,6} appears exactly once.

Complexity Analysis

Measure Complexity Explanation
Time O(b) preprocessing, O(1) pick Each blacklist value is processed once
Space O(b) Hash set and remapping table

The preprocessing phase iterates through blacklist values and potentially scans upper-range candidates. However, each candidate is visited at most once, so the total preprocessing remains linear in the blacklist size.

The pick() operation performs one random generation and at most one hash map lookup, giving constant-time performance.

Test Cases

import collections

# Example case
obj = Solution(7, [2, 3, 5])
for _ in range(100):
    assert obj.pick() in {0, 1, 4, 6}  # example validity

# Empty blacklist
obj = Solution(5, [])
for _ in range(100):
    assert obj.pick() in {0, 1, 2, 3, 4}  # all values allowed

# Single valid number
obj = Solution(4, [0, 1, 2])
for _ in range(20):
    assert obj.pick() == 3  # only one possible answer

# Blacklist only upper values
obj = Solution(6, [4, 5])
for _ in range(100):
    assert obj.pick() in {0, 1, 2, 3}  # no remapping needed

# Blacklist only lower values
obj = Solution(6, [0, 1])
for _ in range(100):
    assert obj.pick() in {2, 3, 4, 5}  # remapping required

# Large n with small blacklist
obj = Solution(10**9, [5, 10, 15])
for _ in range(100):
    value = obj.pick()
    assert value not in {5, 10, 15}  # large range support

# Distribution sanity check
obj = Solution(4, [2])
counter = collections.Counter()

for _ in range(6000):
    counter[obj.pick()] += 1

for value in [0, 1, 3]:
    assert abs(counter[value] - 2000) < 500  # roughly uniform
Test Why
Example input Verifies correctness against problem statement
Empty blacklist Ensures all numbers remain selectable
Single valid number Tests extreme blacklist density
Upper-range blacklist only Verifies unnecessary remapping is avoided
Lower-range blacklist only Verifies remapping logic
Large n Ensures scalability without huge memory use
Distribution sanity check Confirms approximate uniform randomness

Edge Cases

One important edge case occurs when the blacklist is empty. In this situation, every number in [0, n - 1] is valid. A buggy implementation might still attempt remapping logic unnecessarily. The current implementation handles this naturally because the mapping table remains empty, and every random value is returned directly.

Another critical case occurs when almost every number is blacklisted. For example, n = 4 and blacklist = [0,1,2]. A rejection-sampling approach would become extremely inefficient because most random values fail. The remapping strategy avoids this entirely by reducing the random range to size 1, guaranteeing immediate success.

A third subtle edge case occurs when blacklisted values already lie entirely in the upper range. For example, n = 6 and blacklist = [4,5]. Since random generation only happens in [0,3], these blacklisted values never appear anyway. The implementation correctly skips remapping for such values, avoiding unnecessary work.

Another potential source of bugs is duplicate mappings. If replacement values are not carefully selected, two blacklisted lower-range numbers could map to the same valid value, breaking uniformity. The implementation avoids this by advancing the candidate pointer only forward and never reusing a replacement value.

Finally, very large values of n could tempt developers into building enormous arrays. Since n may reach 10^9, such solutions would fail memory constraints immediately. This implementation stores only blacklist-related information, keeping memory proportional to the blacklist size rather than the full range.