LeetCode 1613 - Find the Missing IDs
The problem gives us a database table named Customers, where each row contains a unique customerid and a corresponding customername. Our task is to identify all missing customer IDs between 1 and the maximum customerid currently present in the table.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
The problem gives us a database table named Customers, where each row contains a unique customer_id and a corresponding customer_name.
Our task is to identify all missing customer IDs between 1 and the maximum customer_id currently present in the table.
In other words, suppose the largest customer ID in the table is max_id. We must examine every integer in the inclusive range [1, max_id] and determine which IDs do not appear in the Customers table.
For example, if the existing customer IDs are:
1, 4, 5
then the maximum ID is 5, so the full expected range is:
1, 2, 3, 4, 5
Among these values, 2 and 3 are missing, so they should be returned.
The result must contain a single column named ids, sorted in ascending order.
A very important constraint is that the maximum customer_id does not exceed 100. This tells us the search space is extremely small. Even simple approaches are efficient enough here. However, we still want to write a clean and scalable SQL solution.
Some important edge cases include:
- The table may already contain every ID from
1tomax_id. In that case, the result should be empty. - The smallest ID might not start at
1. For example, if the table contains only IDs3and4, then IDs1and2are considered missing and must be returned. - The table may contain only one customer. If that customer has ID
1, there are no missing IDs. If the customer has ID5, then IDs1, 2, 3, 4are missing. - Since IDs are unique, we do not need to handle duplicates.
Approaches
Brute Force Approach
The brute-force idea is straightforward. First, determine the maximum customer ID in the table. Then generate every number from 1 through that maximum value. For each generated number, scan the Customers table to see whether that ID exists.
If the ID does not exist, include it in the result.
This approach is correct because it explicitly checks every possible valid ID in the required range.
However, the inefficient part is repeatedly scanning the table for every candidate ID. If the table were large, this would become expensive because each lookup could require a full table scan.
Even though the constraints here are tiny, brute force is still conceptually inefficient.
Optimal Approach
The better approach is to generate all numbers from 1 to max(customer_id) once, then use a set-based SQL operation such as LEFT JOIN or NOT EXISTS to efficiently identify missing values.
The key insight is that we are not searching for patterns or performing complicated computations. We simply need the mathematical difference between:
- the complete range of valid IDs
- the IDs already present in the table
Since the maximum ID is at most 100, we can conveniently generate numbers using a recursive common table expression, also called a recursive CTE.
The recursive CTE produces the sequence:
1, 2, 3, ..., max_id
Then we compare this generated sequence against the Customers table. Any generated number without a matching customer record is missing.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(M × N) | O(M) | Generate all IDs and scan the table for each one |
| Optimal | O(M + N) | O(M) | Generate IDs once and use set-based filtering |
Here:
Nis the number of rows inCustomersMis the maximum customer ID, at most100
Algorithm Walkthrough
- First, compute the maximum customer ID from the
Customerstable. This defines the upper bound of the valid range we must inspect. - Use a recursive CTE to generate every integer starting from
1up to the maximum ID. This creates the complete sequence of expected customer IDs. - Perform a
LEFT JOINbetween the generated sequence and theCustomerstable using the customer ID as the join condition. - After the join, rows where the customer side is
NULLrepresent IDs that do not exist in the original table. - Return these missing IDs in ascending order.
Why it works
The algorithm works because the recursive CTE generates every valid ID in the required range exactly once. The LEFT JOIN preserves all generated IDs, even those without matching customer records. Therefore, any generated ID that fails to match a customer row must be missing from the table.
Since every ID in [1, max_id] is checked, the result is complete and correct.
Python Solution
LeetCode Database problems are solved using SQL rather than executable Python code. The following section provides the SQL query in a Python-style code block for consistency with the requested format.
WITH RECURSIVE numbers AS (
SELECT 1 AS ids
UNION ALL
SELECT ids + 1
FROM numbers
WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)
SELECT n.ids
FROM numbers n
LEFT JOIN Customers c
ON n.ids = c.customer_id
WHERE c.customer_id IS NULL
ORDER BY n.ids;
The solution begins by defining a recursive CTE named numbers.
The base case generates the initial value 1.
The recursive step repeatedly adds 1 to the previous value until the sequence reaches the maximum customer ID present in the table.
After generating the full sequence, the query performs a LEFT JOIN against the Customers table. Existing customer IDs successfully match rows in the table, while missing IDs produce NULL values on the customer side.
The WHERE clause filters only those unmatched rows, which correspond to missing IDs.
Finally, the result is sorted in ascending order as required.
Go Solution
Database problems on LeetCode are submitted using SQL, not Go. The equivalent SQL solution is shown below inside a Go-tagged block for completeness.
WITH RECURSIVE numbers AS (
SELECT 1 AS ids
UNION ALL
SELECT ids + 1
FROM numbers
WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)
SELECT n.ids
FROM numbers n
LEFT JOIN Customers c
ON n.ids = c.customer_id
WHERE c.customer_id IS NULL
ORDER BY n.ids;
There are no language-specific implementation differences because the actual submission language for this problem is SQL. The logic remains identical regardless of whether the explanation is presented alongside Python or Go.
Worked Examples
Example 1
Input table:
| customer_id | customer_name |
|---|---|
| 1 | Alice |
| 4 | Bob |
| 5 | Charlie |
Step 1: Find maximum ID
max_id = 5
Step 2: Generate sequence from 1 to 5
| Generated IDs |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
Step 3: Compare against Customers table
| Generated ID | Exists in Customers? | Missing? |
|---|---|---|
| 1 | Yes | No |
| 2 | No | Yes |
| 3 | No | Yes |
| 4 | Yes | No |
| 5 | Yes | No |
Step 4: Return missing IDs
| ids |
|---|
| 2 |
| 3 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(M + N) | Generate up to M IDs and compare against N customer rows |
| Space | O(M) | The recursive sequence stores up to M generated numbers |
Since M ≤ 100, the solution is extremely efficient in practice.
The recursive CTE generates at most 100 rows, so both memory usage and runtime remain very small.
Test Cases
# Example case
# Missing IDs inside the range
assert find_missing_ids([1, 4, 5]) == [2, 3]
# No missing IDs
# Complete sequence already exists
assert find_missing_ids([1, 2, 3, 4]) == []
# Single customer with ID 1
# No missing values before max_id
assert find_missing_ids([1]) == []
# Single customer with larger ID
# All smaller IDs are missing
assert find_missing_ids([5]) == [1, 2, 3, 4]
# Missing IDs at the beginning
assert find_missing_ids([3, 4, 5]) == [1, 2]
# Missing IDs in the middle
assert find_missing_ids([1, 2, 5, 6]) == [3, 4]
# Alternating gaps
assert find_missing_ids([1, 3, 5, 7]) == [2, 4, 6]
# Maximum boundary example
assert find_missing_ids([100]) == list(range(1, 100))
# Dense table with one missing value
assert find_missing_ids([1, 2, 3, 5]) == [4]
# Multiple scattered gaps
assert find_missing_ids([2, 4, 7]) == [1, 3, 5, 6]
Test Summary
| Test | Why |
|---|---|
[1, 4, 5] |
Basic example with middle gaps |
[1, 2, 3, 4] |
No missing IDs |
[1] |
Smallest valid table |
[5] |
Missing values before first present ID |
[3, 4, 5] |
Missing prefix range |
[1, 2, 5, 6] |
Consecutive missing block |
[1, 3, 5, 7] |
Alternating gaps |
[100] |
Upper boundary stress case |
[1, 2, 3, 5] |
Single missing value |
[2, 4, 7] |
Multiple scattered missing IDs |
Edge Cases
Case 1: No Missing IDs
If the table already contains every ID from 1 to max_id, the result should be empty.
For example:
1, 2, 3, 4
A buggy implementation might accidentally return extra rows due to incorrect join conditions or filtering logic.
This solution handles the case correctly because only rows with NULL matches are returned. Fully matched rows are excluded.
Case 2: IDs Do Not Start at 1
The table might begin with a larger ID, such as:
3, 4, 5
In this scenario, IDs 1 and 2 are still considered missing because the required range always begins at 1.
Some incorrect solutions mistakenly start generating numbers from the minimum existing ID instead of from 1.
This implementation explicitly starts the recursive sequence at 1, guaranteeing correct behavior.
Case 3: Only One Customer Exists
If the table contains a single row:
customer_id = 5
then IDs 1, 2, 3, 4 must be returned.
A naive implementation may incorrectly assume there are no missing values when only one row exists.
The recursive generation step avoids this issue because it still constructs the full range from 1 through 5.