LeetCode 1613 - Find the Missing IDs

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

The problem gives us a database table named Customers, where each row contains a unique customer_id and a corresponding customer_name.

Our task is to identify all missing customer IDs between 1 and the maximum customer_id currently present in the table.

In other words, suppose the largest customer ID in the table is max_id. We must examine every integer in the inclusive range [1, max_id] and determine which IDs do not appear in the Customers table.

For example, if the existing customer IDs are:

1, 4, 5

then the maximum ID is 5, so the full expected range is:

1, 2, 3, 4, 5

Among these values, 2 and 3 are missing, so they should be returned.

The result must contain a single column named ids, sorted in ascending order.

A very important constraint is that the maximum customer_id does not exceed 100. This tells us the search space is extremely small. Even simple approaches are efficient enough here. However, we still want to write a clean and scalable SQL solution.

Some important edge cases include:

The table may already contain every ID from 1 to max_id. In that case, the result should be empty.
The smallest ID might not start at 1. For example, if the table contains only IDs 3 and 4, then IDs 1 and 2 are considered missing and must be returned.
The table may contain only one customer. If that customer has ID 1, there are no missing IDs. If the customer has ID 5, then IDs 1, 2, 3, 4 are missing.
Since IDs are unique, we do not need to handle duplicates.

Approaches

Brute Force Approach

The brute-force idea is straightforward. First, determine the maximum customer ID in the table. Then generate every number from 1 through that maximum value. For each generated number, scan the Customers table to see whether that ID exists.

If the ID does not exist, include it in the result.

This approach is correct because it explicitly checks every possible valid ID in the required range.

However, the inefficient part is repeatedly scanning the table for every candidate ID. If the table were large, this would become expensive because each lookup could require a full table scan.

Even though the constraints here are tiny, brute force is still conceptually inefficient.

Optimal Approach

The better approach is to generate all numbers from 1 to max(customer_id) once, then use a set-based SQL operation such as LEFT JOIN or NOT EXISTS to efficiently identify missing values.

The key insight is that we are not searching for patterns or performing complicated computations. We simply need the mathematical difference between:

the complete range of valid IDs
the IDs already present in the table

Since the maximum ID is at most 100, we can conveniently generate numbers using a recursive common table expression, also called a recursive CTE.

The recursive CTE produces the sequence:

1, 2, 3, ..., max_id

Then we compare this generated sequence against the Customers table. Any generated number without a matching customer record is missing.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(M × N)	O(M)	Generate all IDs and scan the table for each one
Optimal	O(M + N)	O(M)	Generate IDs once and use set-based filtering

Here:

N is the number of rows in Customers
M is the maximum customer ID, at most 100

Algorithm Walkthrough

First, compute the maximum customer ID from the Customers table. This defines the upper bound of the valid range we must inspect.
Use a recursive CTE to generate every integer starting from 1 up to the maximum ID. This creates the complete sequence of expected customer IDs.
Perform a LEFT JOIN between the generated sequence and the Customers table using the customer ID as the join condition.
After the join, rows where the customer side is NULL represent IDs that do not exist in the original table.
Return these missing IDs in ascending order.

Why it works

The algorithm works because the recursive CTE generates every valid ID in the required range exactly once. The LEFT JOIN preserves all generated IDs, even those without matching customer records. Therefore, any generated ID that fails to match a customer row must be missing from the table.

Since every ID in [1, max_id] is checked, the result is complete and correct.

Python Solution

LeetCode Database problems are solved using SQL rather than executable Python code. The following section provides the SQL query in a Python-style code block for consistency with the requested format.

WITH RECURSIVE numbers AS (
    SELECT 1 AS ids
    
    UNION ALL
    
    SELECT ids + 1
    FROM numbers
    WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)

SELECT n.ids
FROM numbers n
LEFT JOIN Customers c
    ON n.ids = c.customer_id
WHERE c.customer_id IS NULL
ORDER BY n.ids;

The solution begins by defining a recursive CTE named numbers.

The base case generates the initial value 1.

The recursive step repeatedly adds 1 to the previous value until the sequence reaches the maximum customer ID present in the table.

After generating the full sequence, the query performs a LEFT JOIN against the Customers table. Existing customer IDs successfully match rows in the table, while missing IDs produce NULL values on the customer side.

The WHERE clause filters only those unmatched rows, which correspond to missing IDs.

Finally, the result is sorted in ascending order as required.

Go Solution

Database problems on LeetCode are submitted using SQL, not Go. The equivalent SQL solution is shown below inside a Go-tagged block for completeness.

WITH RECURSIVE numbers AS (
    SELECT 1 AS ids
    
    UNION ALL
    
    SELECT ids + 1
    FROM numbers
    WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)

SELECT n.ids
FROM numbers n
LEFT JOIN Customers c
    ON n.ids = c.customer_id
WHERE c.customer_id IS NULL
ORDER BY n.ids;

There are no language-specific implementation differences because the actual submission language for this problem is SQL. The logic remains identical regardless of whether the explanation is presented alongside Python or Go.

Worked Examples

Example 1

Input table:

customer_id	customer_name
1	Alice
4	Bob
5	Charlie

Step 1: Find maximum ID

max_id = 5

Step 2: Generate sequence from 1 to 5

Generated IDs
1
2
3
4
5

Step 3: Compare against Customers table

Generated ID	Exists in Customers?	Missing?
1	Yes	No
2	No	Yes
3	No	Yes
4	Yes	No
5	Yes	No

Step 4: Return missing IDs

ids
2
3

Complexity Analysis

Measure	Complexity	Explanation
Time	O(M + N)	Generate up to `M` IDs and compare against `N` customer rows
Space	O(M)	The recursive sequence stores up to `M` generated numbers

Since M ≤ 100, the solution is extremely efficient in practice.

The recursive CTE generates at most 100 rows, so both memory usage and runtime remain very small.

Test Cases

# Example case
# Missing IDs inside the range
assert find_missing_ids([1, 4, 5]) == [2, 3]

# No missing IDs
# Complete sequence already exists
assert find_missing_ids([1, 2, 3, 4]) == []

# Single customer with ID 1
# No missing values before max_id
assert find_missing_ids([1]) == []

# Single customer with larger ID
# All smaller IDs are missing
assert find_missing_ids([5]) == [1, 2, 3, 4]

# Missing IDs at the beginning
assert find_missing_ids([3, 4, 5]) == [1, 2]

# Missing IDs in the middle
assert find_missing_ids([1, 2, 5, 6]) == [3, 4]

# Alternating gaps
assert find_missing_ids([1, 3, 5, 7]) == [2, 4, 6]

# Maximum boundary example
assert find_missing_ids([100]) == list(range(1, 100))

# Dense table with one missing value
assert find_missing_ids([1, 2, 3, 5]) == [4]

# Multiple scattered gaps
assert find_missing_ids([2, 4, 7]) == [1, 3, 5, 6]

Test Summary

Test	Why
`[1, 4, 5]`	Basic example with middle gaps
`[1, 2, 3, 4]`	No missing IDs
`[1]`	Smallest valid table
`[5]`	Missing values before first present ID
`[3, 4, 5]`	Missing prefix range
`[1, 2, 5, 6]`	Consecutive missing block
`[1, 3, 5, 7]`	Alternating gaps
`[100]`	Upper boundary stress case
`[1, 2, 3, 5]`	Single missing value
`[2, 4, 7]`	Multiple scattered missing IDs

Edge Cases

Case 1: No Missing IDs

If the table already contains every ID from 1 to max_id, the result should be empty.

For example:

1, 2, 3, 4

A buggy implementation might accidentally return extra rows due to incorrect join conditions or filtering logic.

This solution handles the case correctly because only rows with NULL matches are returned. Fully matched rows are excluded.

Case 2: IDs Do Not Start at 1

The table might begin with a larger ID, such as:

3, 4, 5

In this scenario, IDs 1 and 2 are still considered missing because the required range always begins at 1.

Some incorrect solutions mistakenly start generating numbers from the minimum existing ID instead of from 1.

This implementation explicitly starts the recursive sequence at 1, guaranteeing correct behavior.

Case 3: Only One Customer Exists

If the table contains a single row:

customer_id = 5

then IDs 1, 2, 3, 4 must be returned.

A naive implementation may incorrectly assume there are no missing values when only one row exists.

The recursive generation step avoids this issue because it still constructs the full range from 1 through 5.