LeetCode 1069 - Product Sales Analysis II
This problem asks us to calculate the total quantity sold for each product based on records in the Sales table. The Sales table contains transaction level information.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
This problem asks us to calculate the total quantity sold for each product based on records in the Sales table.
The Sales table contains transaction level information. Each row represents a sale of a specific product in a given year, along with the quantity sold and the unit price. Since a product can appear in multiple rows, potentially across different years, we need to combine all those rows and compute the total quantity sold per product_id.
The Product table contains product names mapped to product_id, but the problem only asks us to report:
product_idtotal_quantity
Since the output does not require product_name, we do not actually need to use the Product table in our solution.
In other words, the task is to group all rows in the Sales table by product_id and sum the quantity column for each group.
For example, if product 100 appears in multiple sales records with quantities 10 and 12, then its total quantity sold becomes:
$$10 + 12 = 22$$
The problem guarantees that (sale_id, year) is unique, meaning each sales record is distinct. Also, product_id in Sales is a valid foreign key into Product, so every referenced product exists.
An important detail is that only products with sales should appear in the result. Even if a product exists in the Product table, such as 300 in the example, if there are no matching sales records, it should not be included in the output.
A naive implementation could mistakenly include all products from the Product table or fail to aggregate multiple sales entries correctly. Another common mistake would be grouping by both product_id and year, which would incorrectly produce multiple rows per product instead of a single total.
Since this is a database aggregation problem, the expected solution is straightforward and efficient using SQL aggregation.
Approaches
Brute Force Approach
A brute force way to solve this problem would be to manually process every sales record and repeatedly scan the table to accumulate totals for each product.
For example, for every unique product_id, we could iterate through the entire Sales table and sum all matching quantities. While this produces the correct answer, it is inefficient because we repeatedly revisit the same rows.
If there are n sales records and m unique products, this could take O(n × m) time in the worst case.
The approach works because every sale record contributes exactly once to its corresponding product total, but the repeated scanning makes it unnecessarily expensive.
Optimal Approach
The key observation is that this is a classic grouping and aggregation problem.
We can process the Sales table in a single pass by grouping rows with the same product_id and summing the quantity values. SQL provides this directly through GROUP BY combined with the SUM() aggregate function.
This works because all rows for the same product belong to the same logical group, and summing quantities within each group gives the desired total quantity sold.
Since the Product table is not needed for the requested output, we avoid an unnecessary join and keep the query simple and efficient.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n × m) | O(m) | Repeatedly scans sales records for each product |
| Optimal | O(n) | O(m) | Single aggregation using GROUP BY |
Where:
n= number of rows inSalesm= number of distinctproduct_idvalues
Algorithm Walkthrough
Optimal Algorithm
- Read all rows from the
Salestable.
Each row contains a product_id and a quantity. Since multiple rows may correspond to the same product, we must combine them.
2. Group rows by product_id.
The purpose of grouping is to collect all sales records belonging to the same product into one logical bucket.
3. Compute the sum of quantity for each group.
Once rows are grouped, we calculate the total quantity sold for each product by adding together all quantities in that group. 4. Return the result.
The output should contain one row per product_id with the computed total_quantity.
The SQL implementation becomes very concise because relational databases are optimized for aggregation operations.
Why it works
The algorithm works because every sale record contributes exactly one quantity value to exactly one product_id. By grouping all rows belonging to the same product and summing their quantities, we ensure every sale is counted once and only once. Since no product outside the Sales table contributes any quantity, the output naturally contains only products that were actually sold.
Python Solution
LeetCode Database problems use SQL rather than Python, but if we model the logic programmatically, the equivalent implementation would look like this:
from collections import defaultdict
from typing import List
class Solution:
def totalQuantity(self, sales: List[List[int]]) -> List[List[int]]:
quantity_by_product = defaultdict(int)
for _, product_id, _, quantity, _ in sales:
quantity_by_product[product_id] += quantity
return [
[product_id, total_quantity]
for product_id, total_quantity in quantity_by_product.items()
]
The implementation uses a hash map, specifically a defaultdict(int), to accumulate total quantities per product. As we iterate through the sales records, we extract the product_id and quantity, then add the quantity to that product's running total.
Because dictionary lookup and insertion are constant time on average, the aggregation runs efficiently in a single pass.
For the actual LeetCode submission, the expected answer is SQL:
SELECT
product_id,
SUM(quantity) AS total_quantity
FROM Sales
GROUP BY product_id;
The query groups rows sharing the same product_id and computes the total quantity sold using SUM(quantity). The alias total_quantity matches the required output format.
Go Solution
Since this is a database problem, Go is not used in the actual LeetCode submission. However, the equivalent aggregation logic in Go would look like this:
package main
func totalQuantity(sales [][]int) [][]int {
productTotals := make(map[int]int)
for _, sale := range sales {
productID := sale[1]
quantity := sale[3]
productTotals[productID] += quantity
}
result := [][]int{}
for productID, totalQuantity := range productTotals {
result = append(result, []int{
productID,
totalQuantity,
})
}
return result
}
The Go version uses a map[int]int to store the running total for each product_id. Maps in Go provide average O(1) insertion and lookup, making them a good fit for aggregation problems.
Unlike Python's defaultdict, Go maps automatically return the zero value for missing keys, so incrementing a nonexistent key works naturally. Since the problem allows returning results in any order, there is no need to sort the output.
For the actual LeetCode problem, the SQL solution remains:
SELECT
product_id,
SUM(quantity) AS total_quantity
FROM Sales
GROUP BY product_id;
Worked Examples
Example 1
Input:
Sales
| sale_id | product_id | year | quantity | price |
|---|---|---|---|---|
| 1 | 100 | 2008 | 10 | 5000 |
| 2 | 100 | 2009 | 12 | 5000 |
| 7 | 200 | 2011 | 15 | 9000 |
We maintain a running total for each product.
| Step | Current Sale | Running Totals |
|---|---|---|
| 1 | (100, 10) |
{100: 10} |
| 2 | (100, 12) |
{100: 22} |
| 3 | (200, 15) |
{100: 22, 200: 15} |
Final result:
| product_id | total_quantity |
|---|---|
| 100 | 22 |
| 200 | 15 |
Notice that product 300 does not appear because it has no sales records.
Using SQL, the grouping works as follows:
| product_id | Quantities Found | Sum |
|---|---|---|
| 100 | 10, 12 |
22 |
| 200 | 15 |
15 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each sales row is processed once |
| Space | O(m) | Stores totals for distinct products |
The algorithm scans the sales data one time and updates a running total for each product. Since every row contributes exactly once to a hash map entry, the time complexity is linear in the number of sales records.
The space complexity depends on how many distinct products appear in the sales table. In the worst case, every sale could involve a different product, requiring O(m) additional storage.
For the SQL solution, database engines typically implement GROUP BY using hashing or sorting internally, leading to near linear performance for aggregation workloads.
Test Cases
from collections import defaultdict
def total_quantity(sales):
quantity_by_product = defaultdict(int)
for _, product_id, _, quantity, _ in sales:
quantity_by_product[product_id] += quantity
return sorted(quantity_by_product.items())
# Example from problem statement
assert total_quantity([
[1, 100, 2008, 10, 5000],
[2, 100, 2009, 12, 5000],
[7, 200, 2011, 15, 9000]
]) == [(100, 22), (200, 15)] # multiple years per product
# Single sale record
assert total_quantity([
[1, 500, 2020, 8, 100]
]) == [(500, 8)] # minimum meaningful input
# Multiple products with one sale each
assert total_quantity([
[1, 100, 2020, 5, 100],
[2, 200, 2020, 10, 200],
[3, 300, 2020, 15, 300]
]) == [(100, 5), (200, 10), (300, 15)] # independent aggregation
# Same product across many years
assert total_quantity([
[1, 100, 2018, 3, 50],
[2, 100, 2019, 4, 50],
[3, 100, 2020, 5, 50]
]) == [(100, 12)] # aggregation across years
# Multiple products with repeated entries
assert total_quantity([
[1, 10, 2020, 2, 100],
[2, 20, 2021, 3, 200],
[3, 10, 2022, 4, 100],
[4, 20, 2023, 5, 200]
]) == [(10, 6), (20, 8)] # repeated aggregation
| Test | Why |
|---|---|
| Example input | Validates correctness against the official sample |
| Single sale record | Ensures smallest valid input works |
| Multiple unique products | Verifies separate grouping |
| Same product across years | Confirms aggregation ignores year |
| Repeated product entries | Ensures totals accumulate correctly |
Edge Cases
Products with no sales
A product may exist in the Product table but never appear in Sales. This can easily lead to bugs if someone mistakenly performs a full join or iterates over every product. The problem only wants products that were actually sold. Since the implementation aggregates directly from the Sales table, products without sales are automatically excluded.
Multiple sales across different years
The same product can appear in several years. A common mistake would be grouping by both product_id and year, which would create multiple rows for the same product. Our implementation groups only by product_id, ensuring quantities across all years are combined correctly.
Multiple sales in the same year
A product might have several separate sales records within the same year. A naive solution could accidentally overwrite earlier quantities instead of summing them. Since we continuously add to a running total using SUM(quantity) in SQL or a hash map in code, every sale contributes correctly to the final total.