LeetCode 3358 - Books with NULL Ratings
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
This problem provides a database table named books. Each row in the table represents a single book and contains information such as the book's ID, title, author, publication year, and rating.
The important detail is that the rating column can contain NULL values. In SQL, NULL represents the absence of a value, meaning the book has not been rated yet.
The task is to retrieve all books whose rating is NULL. The output should include the following columns:
- book_id
- title
- author
- published_year
The results must also be sorted by book_id in ascending order.
The input represents a relational database table, not an in-memory data structure. Therefore, the solution is expected to be written in SQL rather than a traditional algorithmic programming language. The main challenge here is correctly handling NULL values, because SQL treats NULL differently from ordinary values.
One important detail is that checking for NULL in SQL cannot be done using the equality operator (=). A condition like:
rating = NULL
evaluates to UNKNOWN for every row in standard SQL, even when rating is NULL, so it never selects anything. Instead, SQL requires the special predicate:
rating IS NULL
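To see the difference between the two predicates in isolation, here is a minimal sketch that evaluates them as standalone expressions. It assumes SQLite, through Python's built-in sqlite3 module, as a stand-in for the database used by the judge; other engines display the results differently but follow the same standard rule.

```python
import sqlite3

# Assumption: SQLite (via Python's sqlite3 module) stands in for the judge's database.
conn = sqlite3.connect(":memory:")

# Each expression is evaluated on its own, with no table involved.
row = conn.execute(
    "SELECT NULL = NULL, 4.5 = NULL, NULL IS NULL, 4.5 IS NULL"
).fetchone()
conn.close()

# An UNKNOWN comparison result arrives in Python as None;
# SQLite reports TRUE and FALSE as the integers 1 and 0.
null_eq_null, value_eq_null, null_is_null, value_is_null = row
print(row)  # (None, None, 1, 0)
```

Both equality comparisons come back as None, which is why a WHERE clause built on them keeps no rows, while IS NULL always yields a definite true or false.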
The problem guarantees that book_id is unique, so there are no duplicate books to worry about. The dataset size is also small enough that performance is not a major concern, but it is still useful to understand the efficiency of the query.
Several edge cases are worth considering:
- All books may already have ratings, in which case the result should be empty.
- All books may have NULL ratings, in which case every row should be returned.
- The table may contain only one row.
- Ratings may contain decimal values, but that does not affect the filtering logic because we only care whether the value is NULL.
Approaches
Brute Force Approach
A brute-force style approach conceptually scans every row in the books table one by one and manually checks whether the rating field is missing.
For each row:
- If rating is NULL, include the row in the output.
- Otherwise, skip it.
After scanning all rows, sort the results by book_id.
This approach is correct because every book is examined exactly once, ensuring that no unrated books are missed.
In SQL databases, a full table scan is often exactly how this query executes internally when there is no index on the rating column.
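The row-by-row scan described above can be sketched outside the database as well. This is only an illustration of the idea, not something you would submit: the tuple layout and the function name unrated_books are assumptions made for the example, and Python's None plays the role of SQL NULL.

```python
from typing import List, Optional, Tuple

# Hypothetical row shape: (book_id, title, author, published_year, rating)
Book = Tuple[int, str, str, int, Optional[float]]


def unrated_books(rows: List[Book]) -> List[Tuple[int, str, str, int]]:
    """Scan every row once, keep those with a missing rating, then sort by book_id."""
    kept = []
    for book_id, title, author, published_year, rating in rows:
        if rating is None:  # None plays the role of SQL NULL
            kept.append((book_id, title, author, published_year))
    kept.sort(key=lambda r: r[0])  # ascending book_id
    return kept


# Rows deliberately given out of order to show that the sort step matters.
sample = [
    (3, "Pride and Prejudice", "Jane Austen", 1813, 4.8),
    (6, "Lord of the Flies", "William Golding", 1954, None),
    (2, "To Kill a Mockingbird", "Harper Lee", 1960, None),
]
result = unrated_books(sample)
print(result)
```

The loop mirrors the WHERE clause and the final sort mirrors ORDER BY, which is why the SQL version is usually described as the same work expressed declaratively.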
Optimal Approach
The optimal approach uses SQL filtering directly with the IS NULL condition.
The key observation is that SQL provides built-in support for detecting missing values. Since the problem only requires filtering rows based on whether rating is NULL, the query becomes very simple and efficient.
The solution consists of:
- Selecting the required columns.
- Filtering rows using WHERE rating IS NULL.
- Ordering the results by book_id ASC.
This is optimal because the database engine performs the filtering internally and returns only the necessary rows.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(1) | Scan every row and manually check for NULL |
| Optimal | O(n) | O(1) | Use SQL filtering with IS NULL |
Algorithm Walkthrough
1. Start by selecting the required columns from the books table.
We only need:
- book_id
- title
- author
- published_year
The rating column itself does not need to appear in the output.
2. Filter rows where the rating value is NULL.
In SQL, NULL represents an unknown or missing value. Because of SQL's three-valued logic, NULL cannot be compared using normal equality operators.
Therefore, the correct condition is:
WHERE rating IS NULL
3. Sort the filtered rows by book_id in ascending order.
This ensures the output matches the required format.
The ordering clause is:
ORDER BY book_id ASC
4. Return the final result set.
Why it works
The algorithm works because every row in the books table is evaluated against the condition rating IS NULL. Only rows with missing ratings satisfy this condition, so the result contains exactly the unrated books. The final ordering step guarantees the rows appear in ascending book_id order as required.
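One small thing worth checking is that the result exposes exactly the four requested columns and leaves rating out. The sketch below does that with Python's sqlite3 module; the column types in the CREATE TABLE statement are assumptions, since the text does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: column types are chosen for illustration only.
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
    "author TEXT, published_year INTEGER, rating REAL)"
)
cursor = conn.execute(
    "SELECT book_id, title, author, published_year "
    "FROM books WHERE rating IS NULL ORDER BY book_id ASC"
)
# cursor.description holds one 7-tuple per output column; the name is the first item.
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
conn.close()

print(columns)  # ['book_id', 'title', 'author', 'published_year']
print(rows)     # [] because the table is empty
```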
Python Solution
Because this is a database problem, the submission is an SQL query rather than Python code. The SQL solution is shown below.
SELECT
    book_id,
    title,
    author,
    published_year
FROM books
WHERE rating IS NULL
ORDER BY book_id ASC;
The query begins by selecting only the columns required in the final output. This avoids returning unnecessary data.
Next, the WHERE rating IS NULL clause filters the rows. This is the critical part of the solution because SQL requires IS NULL instead of = NULL.
Finally, the ORDER BY book_id ASC clause sorts the results in ascending order of book_id, matching the problem statement exactly.
Go Solution
LeetCode database problems use SQL regardless of the programming language selected. The same SQL query is therefore valid for Go submissions as well.
SELECT
    book_id,
    title,
    author,
    published_year
FROM books
WHERE rating IS NULL
ORDER BY book_id ASC;
There are no Go-specific implementation differences because this problem is solved entirely in SQL. The database engine handles the filtering and sorting operations internally.
Worked Examples
Example 1
Input table:
| book_id | title | author | published_year | rating |
|---|---|---|---|---|
| 1 | The Great Gatsby | F. Scott | 1925 | 4.5 |
| 2 | To Kill a Mockingbird | Harper Lee | 1960 | NULL |
| 3 | Pride and Prejudice | Jane Austen | 1813 | 4.8 |
| 4 | The Catcher in the Rye | J.D. Salinger | 1951 | NULL |
| 5 | Animal Farm | George Orwell | 1945 | 4.2 |
| 6 | Lord of the Flies | William Golding | 1954 | NULL |
The query evaluates each row:
| book_id | rating | rating IS NULL | Included? |
|---|---|---|---|
| 1 | 4.5 | False | No |
| 2 | NULL | True | Yes |
| 3 | 4.8 | False | No |
| 4 | NULL | True | Yes |
| 5 | 4.2 | False | No |
| 6 | NULL | True | Yes |
Rows 2, 4, and 6 satisfy the condition.
After ordering by book_id ASC, the final result is:
| book_id | title | author | published_year |
|---|---|---|---|
| 2 | To Kill a Mockingbird | Harper Lee | 1960 |
| 4 | The Catcher in the Rye | J.D. Salinger | 1951 |
| 6 | Lord of the Flies | William Golding | 1954 |
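Example 1 can be reproduced locally. The sketch below loads the example rows into an in-memory SQLite table and prints both the per-row value of the predicate and the final result. The column types are assumptions, and SQLite displays true and false as 1 and 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: column types are chosen for illustration only.
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
    "author TEXT, published_year INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?, ?)",
    [
        (1, "The Great Gatsby", "F. Scott", 1925, 4.5),
        (2, "To Kill a Mockingbird", "Harper Lee", 1960, None),
        (3, "Pride and Prejudice", "Jane Austen", 1813, 4.8),
        (4, "The Catcher in the Rye", "J.D. Salinger", 1951, None),
        (5, "Animal Farm", "George Orwell", 1945, 4.2),
        (6, "Lord of the Flies", "William Golding", 1954, None),
    ],
)

# Per-row value of the predicate, matching the evaluation table above.
flags = conn.execute(
    "SELECT book_id, rating IS NULL FROM books ORDER BY book_id"
).fetchall()

result = conn.execute(
    "SELECT book_id, title, author, published_year "
    "FROM books WHERE rating IS NULL ORDER BY book_id ASC"
).fetchall()
conn.close()

print(flags)  # [(1, 0), (2, 1), (3, 0), (4, 1), (5, 0), (6, 1)]
print(result)
```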
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | The database scans each row once to evaluate the condition |
| Space | O(1) | No extra auxiliary space is used beyond the result set |
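The query performs a single pass through the table to determine whether each row satisfies the IS NULL condition. Since every row may need to be checked, filtering is linear in the number of rows. The ORDER BY step adds no extra cost when the engine already reads rows in book_id order (for example, through a primary-key index); otherwise, sorting the k matching rows takes O(k log k) time.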
The query itself does not allocate additional data structures, so the auxiliary space complexity is constant.
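To check how a particular engine actually executes the query, you can ask it for its plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The schema and the index name idx_books_rating are assumptions introduced for the comparison, the exact wording of the plan varies between SQLite versions, and other databases have their own EXPLAIN syntax and output.

```python
import sqlite3

QUERY = (
    "SELECT book_id, title, author, published_year "
    "FROM books WHERE rating IS NULL ORDER BY book_id ASC"
)


def plan(conn):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


conn = sqlite3.connect(":memory:")
# Assumed schema: column types are chosen for illustration only.
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
    "author TEXT, published_year INTEGER, rating REAL)"
)
plan_without_index = plan(conn)

# Hypothetical index, not part of the problem; added only to compare plans.
conn.execute("CREATE INDEX idx_books_rating ON books (rating)")
plan_with_index = plan(conn)
conn.close()

print(plan_without_index)  # expect a full-table SCAN step
print(plan_with_index)     # the planner may choose the index here
```

Without an index on rating, the plan should contain a scan of the whole table, which corresponds to the linear pass described above. Whether the second plan uses the index is up to the planner.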
Test Cases
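import sqlite3

# Assumption: an in-memory SQLite table stands in for LeetCode's judge.
# Input rows other than the problem's example are illustrative.
QUERY = (
    "SELECT book_id, title, author, published_year "
    "FROM books WHERE rating IS NULL ORDER BY book_id ASC"
)

def run_query(rows):
    """Load rows into a fresh books table and return the query result as lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
                 "author TEXT, published_year INTEGER, rating REAL)")
    conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?, ?)", rows)
    result = [list(row) for row in conn.execute(QUERY)]
    conn.close()
    return result

# Example case from the problem statement
# Returns books with NULL ratings
assert run_query([
    (1, "The Great Gatsby", "F. Scott", 1925, 4.5),
    (2, "To Kill a Mockingbird", "Harper Lee", 1960, None),
    (3, "Pride and Prejudice", "Jane Austen", 1813, 4.8),
    (4, "The Catcher in the Rye", "J.D. Salinger", 1951, None),
    (5, "Animal Farm", "George Orwell", 1945, 4.2),
    (6, "Lord of the Flies", "William Golding", 1954, None),
]) == [
    [2, "To Kill a Mockingbird", "Harper Lee", 1960],
    [4, "The Catcher in the Rye", "J.D. Salinger", 1951],
    [6, "Lord of the Flies", "William Golding", 1954],
]
# All books rated
# Should return an empty result
assert run_query([(1, "Book A", "Author A", 2000, 4.0), (2, "Book B", "Author B", 2001, 3.5)]) == []
# All books unrated
# Should return every row
assert run_query([(1, "Book A", "Author A", 2000, None), (2, "Book B", "Author B", 2001, None)]) == [
    [1, "Book A", "Author A", 2000],
    [2, "Book B", "Author B", 2001],
]
# Single row with NULL rating
# Ensures smallest valid input works
assert run_query([(1, "Solo Book", "Solo Author", 2020, None)]) == [
    [1, "Solo Book", "Solo Author", 2020]
]
# Single row with non-NULL rating
# Ensures filtering excludes rated books
assert run_query([(1, "Solo Book", "Solo Author", 2020, 4.1)]) == []
# Ratings with decimal values
# Decimal ratings should not affect NULL filtering
assert run_query([
    (1, "Rated Book 1", "Author Y", 2008, 3.75),
    (2, "Rated Book 2", "Author Z", 2009, 4.25),
    (3, "Unrated Book", "Author X", 2010, None),
]) == [
    [3, "Unrated Book", "Author X", 2010]
]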
| Test | Why |
|---|---|
| Mixed NULL and non-NULL ratings | Validates normal filtering behavior |
| All books rated | Ensures empty output is handled correctly |
| All books unrated | Ensures every row is returned when appropriate |
| Single unrated book | Validates smallest positive case |
| Single rated book | Validates smallest negative case |
| Decimal ratings present | Confirms decimal values do not interfere with filtering |
Edge Cases
One important edge case occurs when every book already has a rating. In this situation, no rows satisfy the rating IS NULL condition, so the query correctly returns an empty result set. An incorrect comparison such as rating = NULL would also return nothing here, but only by coincidence, because that comparison never matches any row.
Another important edge case is when all books have NULL ratings. In this case, every row matches the filter condition, and the query should return the entire table, ordered by book_id. The implementation handles this naturally because every row satisfies IS NULL.
A third edge case involves a table containing only one row. If that row has a NULL rating, it should appear in the result. Otherwise, the result should be empty. Since the query evaluates rows independently, both scenarios are handled correctly without any special logic.
A subtle but very common source of bugs is incorrect NULL comparison syntax. Many beginners mistakenly write:
WHERE rating = NULL
However, SQL treats NULL as an unknown value, so an equality comparison with NULL evaluates to UNKNOWN rather than TRUE, and the WHERE clause then rejects every row. The implementation avoids this issue by using the proper predicate:
WHERE rating IS NULL
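The effect of the two filters is easy to compare side by side. The sketch below assumes SQLite through Python's sqlite3 module and uses a reduced two-column table with made-up rows, since only book_id and rating matter for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema: only the columns needed for the comparison.
conn.execute("CREATE TABLE books (book_id INTEGER PRIMARY KEY, rating REAL)")
# Illustrative rows: two rated books and two unrated ones.
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [(1, 4.5), (2, None), (3, 4.8), (4, None)],
)

with_equals = conn.execute(
    "SELECT book_id FROM books WHERE rating = NULL ORDER BY book_id"
).fetchall()
with_is_null = conn.execute(
    "SELECT book_id FROM books WHERE rating IS NULL ORDER BY book_id"
).fetchall()
conn.close()

print(with_equals)   # []
print(with_is_null)  # [(2,), (4,)]
```

The equality version runs without any error and simply returns nothing, which is what makes this mistake easy to miss.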