LeetCode 3358 - Books with NULL Ratings

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem provides a database table named books. Each row in the table represents a single book and contains information such as the book's ID, title, author, publication year, and rating.

The important detail is that the rating column can contain NULL values. In SQL, NULL represents the absence of a value, meaning the book has not been rated yet.

The task is to retrieve all books whose rating is NULL. The output should include the following columns:

  • book_id
  • title
  • author
  • published_year

The results must also be sorted by book_id in ascending order.

The input represents a relational database table, not an in-memory data structure. Therefore, the solution is expected to be written in SQL rather than a traditional algorithmic programming language. The main challenge here is correctly handling NULL values, because SQL treats NULL differently from ordinary values.

One important detail is that checking for NULL in SQL cannot be done using the equality operator (=). A condition like:

rating = NULL

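never matches any row: the comparison evaluates to UNKNOWN rather than TRUE, even when rating is NULL, and WHERE keeps only rows whose condition is TRUE. Instead, SQL requires the special predicate: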

rating IS NULL
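
The difference between the two conditions can be observed directly. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the two sample rows and the helper name matching_ids are made up for illustration, and the behavior shown assumes standard SQL NULL semantics.

```python
import sqlite3

# In-memory table with one rated and one unrated book (sample data, not from the problem).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (book_id INTEGER PRIMARY KEY, rating REAL)")
conn.executemany("INSERT INTO books VALUES (?, ?)", [(1, 4.5), (2, None)])

def matching_ids(predicate):
    """Return the book_ids selected by the given WHERE predicate (hypothetical helper)."""
    sql = f"SELECT book_id FROM books WHERE {predicate} ORDER BY book_id"
    return [row[0] for row in conn.execute(sql)]

equals_null = matching_ids("rating = NULL")  # comparison is never TRUE, so no row passes
is_null = matching_ids("rating IS NULL")     # selects the unrated book
print(equals_null, is_null)  # prints: [] [2]
```

The equality version runs without any error, which is why the mistake is easy to overlook: it simply returns nothing.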

The problem guarantees that book_id is unique, so there are no duplicate books to worry about. The dataset size is also small enough that performance is not a major concern, but it is still useful to understand the efficiency of the query.

Several edge cases are worth considering:

  • All books may already have ratings, in which case the result should be empty.
  • All books may have NULL ratings, in which case every row should be returned.
  • The table may contain only one row.
  • Ratings may contain decimal values, but that does not affect the filtering logic because we only care whether the value is NULL.

Approaches

Brute Force Approach

A brute-force style approach conceptually scans every row in the books table one by one and manually checks whether the rating field is missing.

For each row:

  • If rating is NULL, include the row in the output.
  • Otherwise, skip it.

After scanning all rows, sort the results by book_id.

This approach is correct because every book is examined exactly once, ensuring that no unrated books are missed.

In SQL databases, a full table scan is often exactly how this query executes internally when there is no index on the rating column.
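
To make the row-by-row scan concrete, here is a minimal Python sketch of the same idea outside a database. The Book type, the unrated_books function, and the use of None to stand in for SQL NULL are assumptions made for this illustration; it is not something LeetCode expects as a submission.

```python
from typing import NamedTuple, Optional

class Book(NamedTuple):
    book_id: int
    title: str
    author: str
    published_year: int
    rating: Optional[float]  # None plays the role of SQL NULL

def unrated_books(books):
    """Scan every row, keep the ones without a rating, then sort by book_id."""
    kept = []
    for book in books:
        if book.rating is None:
            # Output only the four required columns, not the rating itself
            kept.append((book.book_id, book.title, book.author, book.published_year))
    return sorted(kept, key=lambda row: row[0])

# Deliberately unordered input to show that the final sort matters
sample = [
    Book(4, "The Catcher in the Rye", "J.D. Salinger", 1951, None),
    Book(1, "The Great Gatsby", "F. Scott", 1925, 4.5),
    Book(2, "To Kill a Mockingbird", "Harper Lee", 1960, None),
]
result = unrated_books(sample)
print(result)
```

Note that the Python check is `is None`, which mirrors the role IS NULL plays in SQL.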

Optimal Approach

The optimal approach uses SQL filtering directly with the IS NULL condition.

The key observation is that SQL provides built-in support for detecting missing values. Since the problem only requires filtering rows based on whether rating is NULL, the query becomes very simple and efficient.

The solution consists of:

  1. Selecting the required columns.
  2. Filtering rows using WHERE rating IS NULL.
  3. Ordering the results by book_id ASC.

This is optimal because the database engine performs the filtering internally and returns only the necessary rows.

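| Approach    | Time Complexity | Space Complexity | Notes                                      |
|-------------|-----------------|------------------|--------------------------------------------|
| Brute Force | O(n)            | O(1)             | Scan every row and manually check for NULL |
| Optimal     | O(n)            | O(1)             | Use SQL filtering with IS NULL             |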

Algorithm Walkthrough

  1. Start by selecting the required columns from the books table.

We only need:

  • book_id
  • title
  • author
  • published_year

The rating column itself does not need to appear in the output.

  2. Filter rows where the rating value is NULL.

In SQL, NULL represents an unknown or missing value. Because of SQL's three-valued logic, NULL cannot be compared using normal equality operators.

Therefore, the correct condition is:

WHERE rating IS NULL

  3. Sort the filtered rows by book_id in ascending order.

This ensures the output matches the required format.

The ordering clause is:

ORDER BY book_id ASC

  4. Return the final result set.

Why it works

The algorithm works because every row in the books table is evaluated against the condition rating IS NULL. Only rows with missing ratings satisfy this condition, so the result contains exactly the unrated books. The final ordering step guarantees the rows appear in ascending book_id order as required.
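
The per-row evaluation described above can be made visible by selecting the predicates themselves instead of filtering on them. This is a small sketch using Python's sqlite3 module with three made-up rows; SQLite reports TRUE as 1, FALSE as 0, and an unknown result as NULL, which Python shows as None. Other engines may display these values differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (book_id INTEGER PRIMARY KEY, rating REAL)")
conn.executemany("INSERT INTO books VALUES (?, ?)", [(1, 4.5), (2, None), (3, 4.8)])

# Columns: book_id, rating IS NULL, rating = NULL, rating <> NULL
rows = conn.execute(
    "SELECT book_id, rating IS NULL, rating = NULL, rating <> NULL "
    "FROM books ORDER BY book_id"
).fetchall()
for row in rows:
    print(row)
# (1, 0, None, None)
# (2, 1, None, None)
# (3, 0, None, None)
```

Only IS NULL produces a definite true or false for every row. Both = NULL and <> NULL are unknown everywhere, so neither can be used to separate rated books from unrated ones.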

SQL Solution

LeetCode database problems are typically solved with a SQL query rather than a general-purpose programming language. Below is the correct SQL solution.

SELECT
    book_id,
    title,
    author,
    published_year
FROM books
WHERE rating IS NULL
ORDER BY book_id ASC;

The query begins by selecting only the columns required in the final output. This avoids returning unnecessary data.

Next, the WHERE rating IS NULL clause filters the rows. This is the critical part of the solution because SQL requires IS NULL instead of = NULL.

Finally, the ORDER BY book_id ASC clause sorts the results in ascending order of book_id, matching the problem statement exactly.

Go Solution

LeetCode database problems use SQL regardless of the programming language selected. The same SQL query is therefore valid for Go submissions as well.

SELECT
    book_id,
    title,
    author,
    published_year
FROM books
WHERE rating IS NULL
ORDER BY book_id ASC;

There are no Go-specific implementation differences because this problem is solved entirely in SQL. The database engine handles the filtering and sorting operations internally.

Worked Examples

Example 1

Input table:

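| book_id | title                  | author          | published_year | rating |
|---------|------------------------|-----------------|----------------|--------|
| 1       | The Great Gatsby       | F. Scott        | 1925           | 4.5    |
| 2       | To Kill a Mockingbird  | Harper Lee      | 1960           | NULL   |
| 3       | Pride and Prejudice    | Jane Austen     | 1813           | 4.8    |
| 4       | The Catcher in the Rye | J.D. Salinger   | 1951           | NULL   |
| 5       | Animal Farm            | George Orwell   | 1945           | 4.2    |
| 6       | Lord of the Flies      | William Golding | 1954           | NULL   |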

The query evaluates each row:

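| book_id | rating | rating IS NULL | Included? |
|---------|--------|----------------|-----------|
| 1       | 4.5    | False          | No        |
| 2       | NULL   | True           | Yes       |
| 3       | 4.8    | False          | No        |
| 4       | NULL   | True           | Yes       |
| 5       | 4.2    | False          | No        |
| 6       | NULL   | True           | Yes       |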

Rows 2, 4, and 6 satisfy the condition.

After ordering by book_id ASC, the final result is:

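| book_id | title                  | author          | published_year |
|---------|------------------------|-----------------|----------------|
| 2       | To Kill a Mockingbird  | Harper Lee      | 1960           |
| 4       | The Catcher in the Rye | J.D. Salinger   | 1951           |
| 6       | Lord of the Flies      | William Golding | 1954           |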

Complexity Analysis

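| Measure | Complexity | Explanation                                                |
|---------|------------|------------------------------------------------------------|
| Time    | O(n)       | The database scans each row once to evaluate the condition |
| Space   | O(1)       | No extra auxiliary space is used beyond the result set     |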

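The query performs a single pass through the table to determine whether each row satisfies the IS NULL condition. Since every row may need to be checked, the filtering cost is linear in the number of rows. The ORDER BY step can add up to O(k log k) work for the k matching rows when the engine cannot already read them in book_id order (for example, through a primary-key index).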

The query itself does not allocate additional data structures, so the auxiliary space complexity is constant.
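
One way to check how an engine actually executes the query is to ask for its plan. The sketch below does this with SQLite through Python's sqlite3 module; the table definition, including book_id as INTEGER PRIMARY KEY and the absence of an index on rating, is an assumption for this illustration. The wording of the plan differs between SQLite versions and other databases expose plans through their own commands, so treat the output as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
    "author TEXT, published_year INTEGER, rating REAL)"
)

query = (
    "SELECT book_id, title, author, published_year "
    "FROM books WHERE rating IS NULL ORDER BY book_id ASC"
)

# The last column of each plan row is a human-readable description of that step
plan_details = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for detail in plan_details:
    print(detail)
```

With no index on rating, the plan should describe a scan of the books table, which matches the linear-time analysis above.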

Test Cases

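# Runnable sketch: sqlite3 stands in for the judge; rows beyond Example 1 are illustrative
import sqlite3

QUERY = """
SELECT book_id, title, author, published_year
FROM books
WHERE rating IS NULL
ORDER BY book_id ASC
"""

def run_query(rows):
    # Load rows into a fresh in-memory books table and run the solution query
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, "
                 "author TEXT, published_year INTEGER, rating REAL)")
    conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?, ?)", rows)
    return [list(row) for row in conn.execute(QUERY)]

# Example case from the problem statement
# Returns books with NULL ratings
assert run_query([
    (1, "The Great Gatsby", "F. Scott", 1925, 4.5),
    (2, "To Kill a Mockingbird", "Harper Lee", 1960, None),
    (3, "Pride and Prejudice", "Jane Austen", 1813, 4.8),
    (4, "The Catcher in the Rye", "J.D. Salinger", 1951, None),
    (5, "Animal Farm", "George Orwell", 1945, 4.2),
    (6, "Lord of the Flies", "William Golding", 1954, None),
]) == [
    [2, "To Kill a Mockingbird", "Harper Lee", 1960],
    [4, "The Catcher in the Rye", "J.D. Salinger", 1951],
    [6, "Lord of the Flies", "William Golding", 1954]
]

# All books rated
# Should return an empty result
assert run_query([
    (1, "Book A", "Author A", 2000, 4.0),
    (2, "Book B", "Author B", 2001, 3.5),
]) == []

# All books unrated
# Should return every row
assert run_query([
    (1, "Book A", "Author A", 2000, None),
    (2, "Book B", "Author B", 2001, None),
]) == [
    [1, "Book A", "Author A", 2000],
    [2, "Book B", "Author B", 2001]
]

# Single row with NULL rating
# Ensures smallest valid input works
assert run_query([(1, "Solo Book", "Solo Author", 2020, None)]) == [
    [1, "Solo Book", "Solo Author", 2020]
]

# Single row with non-NULL rating
# Ensures filtering excludes rated books
assert run_query([(1, "Solo Book", "Solo Author", 2020, 4.1)]) == []

# Ratings with decimal values
# Decimal ratings should not affect NULL filtering
assert run_query([(1, "Rated Book", "Author W", 2005, 3.75),
                  (3, "Unrated Book", "Author X", 2010, None)]) == [
    [3, "Unrated Book", "Author X", 2010]
]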

| Test                            | Why                                                     |
|---------------------------------|---------------------------------------------------------|
| Mixed NULL and non-NULL ratings | Validates normal filtering behavior                     |
| All books rated                 | Ensures empty output is handled correctly               |
| All books unrated               | Ensures every row is returned when appropriate          |
| Single unrated book             | Validates smallest positive case                        |
| Single rated book               | Validates smallest negative case                        |
| Decimal ratings present         | Confirms decimal values do not interfere with filtering |

Edge Cases

One important edge case occurs when every book already has a rating. In this situation, no rows satisfy the rating IS NULL condition, so the query correctly returns an empty result set. Note that a query written with rating = NULL also returns an empty set here, so this case alone cannot expose that mistake.

Another important edge case is when all books have NULL ratings. In this case, every row matches the filter condition, and the query should return the entire table, ordered by book_id. The implementation handles this naturally because every row satisfies IS NULL.

A third edge case involves a table containing only one row. If that row has a NULL rating, it should appear in the result. Otherwise, the result should be empty. Since the query evaluates rows independently, both scenarios are handled correctly without any special logic.

A subtle but very common source of bugs is incorrect NULL comparison syntax. Many beginners mistakenly write:

WHERE rating = NULL

However, SQL treats NULL as an unknown value, so an equality comparison with NULL evaluates to UNKNOWN rather than TRUE, and the WHERE clause rejects every row. The implementation avoids this issue by using the proper predicate:

WHERE rating IS NULL