LeetCode 1098 - Unpopular Books
The problem asks us to identify books from a Books table that are considered unpopular based on their sales in the last year, relative to a fixed "today" date of 2019-06-23.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
The problem asks us to identify books from a Books table that are considered unpopular based on their sales in the last year, relative to a fixed "today" date of 2019-06-23. Specifically, a book is unpopular if it sold less than 10 copies in the period from 2018-06-23 to 2019-06-23. Additionally, we must exclude books that have been available for less than one month before 2019-06-23.
The input consists of two tables. The Books table contains book_id, name, and the date the book became available (available_from). The Orders table contains order_id, book_id (as a foreign key), quantity of each order, and the dispatch_date. The output is a table with book_id and name for all books meeting the above conditions.
Key edge cases include books that have never been sold, books sold exactly 10 copies, books available for less than one month, and books with orders outside the one-year window. The problem guarantees that book_id is unique in Books and that book_id in Orders references Books, so we don’t have to handle missing references.
Approaches
The brute-force approach would involve computing sales for each book by iterating over all orders in the last year, summing the quantity, and then filtering out books with sales below 10. We would also have to check the available_from date for each book. This works correctly but is inefficient because it processes all orders individually and performs filtering in multiple steps.
The optimal approach uses SQL aggregation and filtering. We first filter the Books table to exclude books available less than one month ago. Then, we join it with Orders using a left join to handle books with zero orders. We aggregate sales over the last year for each book and use a HAVING clause to select books with total sales below 10. This approach leverages SQL's set-based operations, reducing redundant scanning and providing a single query solution.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(B * O) | O(B) | Iterate each book and sum quantities of relevant orders; slow for large tables |
| Optimal | O(B + O) | O(B) | Aggregate sales in a single query using LEFT JOIN and GROUP BY; efficient for large datasets |
Algorithm Walkthrough
- Filter books by availability: Exclude any book whose
available_fromis later than one month before the reference date2019-06-23. This ensures we only consider books that have been available long enough to accumulate meaningful sales data. - Filter orders in the last year: Limit
Ordersto those withdispatch_datewithin one year before the reference date, i.e.,2018-06-23to2019-06-23. This ensures we are counting only recent sales. - Join books with relevant orders: Perform a left join of the filtered
Bookswith filteredOrdersto include books that may have zero sales in the last year. - Aggregate total sales per book: Use
SUM(quantity)grouped bybook_idto compute total copies sold for each book in the last year. If a book has no orders, treat its total sales as zero. - Filter unpopular books: Use
HAVINGto select books with total sales less than 10. - Select result columns: Return
book_idandnameof the filtered books.
Why it works: The approach ensures all conditions are applied exactly once. Books excluded for being too new never enter the join. Aggregating with a left join ensures books with zero orders are included. Filtering on SUM(quantity) < 10 correctly identifies unpopular books.
Python Solution
import sqlite3
def unpopular_books(conn: sqlite3.Connection):
query = """
WITH FilteredBooks AS (
SELECT book_id, name
FROM Books
WHERE available_from <= date('2019-06-23', '-1 month')
),
LastYearOrders AS (
SELECT book_id, SUM(quantity) AS total_sold
FROM Orders
WHERE dispatch_date BETWEEN date('2019-06-23', '-1 year') AND '2019-06-23'
GROUP BY book_id
)
SELECT fb.book_id, fb.name
FROM FilteredBooks fb
LEFT JOIN LastYearOrders lyo ON fb.book_id = lyo.book_id
WHERE COALESCE(lyo.total_sold, 0) < 10
"""
cursor = conn.cursor()
cursor.execute(query)
return cursor.fetchall()
Explanation: The FilteredBooks CTE removes books available for less than one month. The LastYearOrders CTE aggregates orders in the last year. A left join ensures all books are included, even with zero sales. COALESCE handles null values for books with no orders, treating them as zero sales. The WHERE clause filters books sold fewer than 10 copies.
Go Solution
package main
import (
"database/sql"
)
func UnpopularBooks(db *sql.DB) (*sql.Rows, error) {
query := `
WITH FilteredBooks AS (
SELECT book_id, name
FROM Books
WHERE available_from <= date('2019-06-23', '-1 month')
),
LastYearOrders AS (
SELECT book_id, SUM(quantity) AS total_sold
FROM Orders
WHERE dispatch_date BETWEEN date('2019-06-23', '-1 year') AND '2019-06-23'
GROUP BY book_id
)
SELECT fb.book_id, fb.name
FROM FilteredBooks fb
LEFT JOIN LastYearOrders lyo ON fb.book_id = lyo.book_id
WHERE COALESCE(lyo.total_sold, 0) < 10
`
return db.Query(query)
}
Go-specific notes: The main difference is using *sql.Rows for returning results. We rely on the database to compute aggregation and joins. Go does not require handling null explicitly at the query level because COALESCE handles it.
Worked Examples
Example 1
Input Books and Orders tables:
Books:
1, "Kalila And Demna", 2010-01-01
2, "28 Letters", 2012-05-12
3, "The Hobbit", 2019-06-10
4, "13 Reasons Why", 2019-06-01
5, "The Hunger Games", 2008-09-21
Orders:
1, 1, 2, 2018-07-26
2, 1, 1, 2018-11-05
3, 3, 8, 2019-06-11
4, 4, 6, 2019-06-05
5, 4, 5, 2019-06-20
6, 5, 9, 2009-02-02
7, 5, 8, 2010-04-13
Step 1: Filter books available before 2019-05-23 (one month before 2019-06-23). Excluded books: 3, 4.
Step 2: Filter orders in last year (2018-06-23 to 2019-06-23). Included orders: 1, 2, 3, 4, 5. Excluded orders: 6, 7.
Step 3: Left join FilteredBooks with these orders.
Step 4: Aggregate total sales:
Book 1: 2 + 1 = 3
Book 2: 0
Book 5: 0
Step 5: Filter total_sold < 10. Result books: 1, 2, 5.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(B + O) | Each book and order is scanned once; SQL aggregation handles grouping efficiently |
| Space | O(B) | Space for aggregated sums and intermediate CTEs |
The approach is efficient because it avoids nested loops and leverages SQL set operations, making it suitable for large tables.
Test Cases
# test cases
import sqlite3
conn = sqlite3.connect(":memory:")
# setup tables
conn.execute("CREATE TABLE Books (book_id INT, name TEXT, available_from DATE)")
conn.execute("CREATE TABLE Orders (order_id INT, book_id INT, quantity INT, dispatch_date DATE)")
# insert example data
books_data = [
(1, "Kalila And Demna", "2010-01-01"),
(2, "28 Letters", "2012-05-12"),
(3, "The Hobbit", "2019-06-10"),
(4, "13 Reasons Why", "2019-06-01"),
(5, "The Hunger Games", "2008-09-21")
]
orders_data = [
(1, 1, 2, "2018-07-26"),
(2, 1, 1, "2018-11-05"),
(3, 3