LeetCode 1479 - Sales by Day of the Week

Difficulty: 🔴 Hard
Topics: Database

Solution

Problem Understanding

The problem asks us to generate a sales report that summarizes the total quantity of items sold for each category on each day of the week. The data is provided in two tables: Orders and Items. The Orders table contains transactional information, including which customer bought which item, in what quantity, and on what date. The Items table maps each item to its category and name. The goal is to produce a table where each row corresponds to a category and each column corresponds to a day of the week (Monday through Sunday), showing the total units sold in that category on that day.

The output must be ordered by category, and days with no sales must show zero. The Items table may contain items that were never ordered, so a category with no orders at all must still appear in the report, with zeros for every day.

Key points and constraints:

  • The day of the week is derived from the order_date.
  • Categories without orders must still be included.
  • Each (order_id, item_id) pair is unique in Orders.
  • The database can contain multiple orders for the same category on the same day, so summing the quantities is essential.
  • Date functions differ by database (for example DAYOFWEEK and DATE_FORMAT in MySQL, TO_CHAR in PostgreSQL), but the logic remains the same.

Important edge cases include categories with no orders, orders on the same day for different items in the same category, and sparse dates with no orders at all.
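As a small illustration of the first key point, the weekday can be derived from an ISO date string with the standard library alone. This is only a sketch: the helper name day_name is hypothetical, and the dates are taken from the test cases further down.

```python
from datetime import date

# Monday-first names, matching the column order of the report.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def day_name(order_date: str) -> str:
    """Return the weekday name for an ISO date string (YYYY-MM-DD)."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return DAY_NAMES[date.fromisoformat(order_date).weekday()]

print(day_name("2020-06-01"))  # Monday
print(day_name("2020-06-03"))  # Wednesday
```

Indexing into a fixed list avoids strftime("%A"), whose output depends on the active locale.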

Approaches

The brute-force approach is to iterate over each category and, for each day of the week, query the sum of quantities from the Orders table. This method is correct but highly inefficient: it issues one query per category-day pair, that is, seven queries per category, and each query has to look through the orders again.

The optimal approach is to left join the Items table to the Orders table on item_id (so that items without orders are kept), extract the day of the week for each order, group the results by category, and pivot the quantities so that each day becomes a column. This approach is efficient because it uses a single aggregation query and leverages SQL’s GROUP BY and conditional aggregation to produce the pivot table in one pass.

Approach    | Time Complexity | Space Complexity | Notes
Brute Force | O(C * D * O)    | O(C * D)         | Query each category and day individually; inefficient for large datasets
Optimal     | O(O + C)        | O(C * D)         | Join and aggregate in one query, then pivot; highly efficient

Where C is the number of categories, D is 7 (days of the week), and O is the number of orders.
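The difference between the two approaches can be sketched in plain Python. The rows below are hypothetical, already-joined (category, day, quantity) tuples, and the function names are illustrative rather than part of any library.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical pre-joined rows: (category, day name, quantity).
rows = [
    ("Book", "Monday", 10),
    ("Book", "Monday", 10),
    ("Book", "Tuesday", 5),
    ("Phone", "Wednesday", 5),
]
categories = ["Book", "Phone", "T-Shirt"]  # T-Shirt has no rows at all

def brute_force(rows, categories):
    # One full scan of rows per (category, day) pair: C * D scans in total.
    return {
        c: [sum(q for cat, day, q in rows if cat == c and day == d) for d in DAYS]
        for c in categories
    }

def single_pass(rows, categories):
    # Start every category at zero so categories without rows still appear.
    totals = {c: [0] * len(DAYS) for c in categories}
    day_index = {d: i for i, d in enumerate(DAYS)}
    for cat, day, q in rows:  # each row is visited exactly once
        totals[cat][day_index[day]] += q
    return totals

# Both strategies produce the same report; only the amount of work differs.
assert brute_force(rows, categories) == single_pass(rows, categories)
print(single_pass(rows, categories)["Book"])  # [20, 5, 0, 0, 0, 0, 0]
```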

Algorithm Walkthrough

  1. Left join the Items table to the Orders table on item_id to associate each order with its item category. Keeping Items on the left side preserves items, and therefore categories, that have no orders.
  2. Use a SQL function to extract the day of the week from order_date. The exact function may vary (DAYOFWEEK in MySQL, TO_CHAR in PostgreSQL).
  3. Group the joined table by item_category and day of the week, summing the quantity column to get total sales per category per day.
  4. Pivot the grouped data so that each day of the week becomes a column. This can be done with conditional aggregation: sum the quantity only if the day matches a specific day.
  5. Ensure all categories appear in the final result. If a category has no sales for a day, the sum should default to zero, which an ELSE 0 branch in each conditional sum provides.
  6. Order the final result by category.

Why it works: Every item is kept when its orders are attached, so each category gets a row even if nothing in it was ever sold. Summing quantities per category with one conditional sum per day then represents all seven days, and a category-day pair with no matching orders contributes only zeros, so its total is zero as the problem requires.
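The steps above can be tried end to end with Python's built-in sqlite3 module. This is a sketch under stated assumptions: it uses SQLite rather than MySQL, so the weekday comes from strftime('%w', ...) (which yields '0' for Sunday through '6' for Saturday) instead of DAYOFWEEK, and the rows are a small illustrative data set rather than the full problem input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (item_id TEXT PRIMARY KEY, item_category TEXT);
    CREATE TABLE Orders (order_id INTEGER, order_date TEXT,
                         item_id TEXT, quantity INTEGER);
    -- Illustrative rows only, not the full problem input.
    INSERT INTO Items VALUES ('1', 'Book'), ('2', 'Phone'), ('3', 'T-Shirt');
    INSERT INTO Orders VALUES
        (1, '2020-06-01', '1', 10),
        (2, '2020-06-08', '1', 10),
        (3, '2020-06-03', '2', 5);
""")

# SQLite weekday codes from strftime('%w', d): '0' = Sunday ... '6' = Saturday.
days = [("Monday", "1"), ("Tuesday", "2"), ("Wednesday", "3"),
        ("Thursday", "4"), ("Friday", "5"), ("Saturday", "6"), ("Sunday", "0")]

# One conditional sum per day turns the grouped rows into pivot columns.
pivot_columns = ",\n".join(
    f"SUM(CASE WHEN strftime('%w', o.order_date) = '{code}' "
    f"THEN o.quantity ELSE 0 END) AS {name}"
    for name, code in days
)

query = f"""
    SELECT i.item_category AS Category,
    {pivot_columns}
    FROM Items i
    LEFT JOIN Orders o ON i.item_id = o.item_id
    GROUP BY i.item_category
    ORDER BY i.item_category
"""

report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

The T-Shirt row comes back with seven zeros: the left join gives it a single row with NULL order columns, and every CASE falls through to ELSE 0.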

Python Solution

import pandas as pd

class Solution:
    def salesByDayOfWeek(self, orders: 'List[Dict]', items: 'List[Dict]') -> 'List[Dict]':
        # Convert lists of dicts into DataFrames
        orders_df = pd.DataFrame(orders)
        items_df = pd.DataFrame(items)

        # Right merge keeps every item, including items that were never ordered
        merged_df = orders_df.merge(items_df, on='item_id', how='right')

        # Unordered items have no quantity after the merge; count them as 0
        merged_df['quantity'] = merged_df['quantity'].fillna(0)

        # Extract day name from order_date (missing for unordered items)
        merged_df['day_of_week'] = pd.to_datetime(merged_df['order_date']).dt.day_name()

        # Fixed column order, and every category from Items in sorted order
        days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
        categories = sorted(items_df['item_category'].unique())

        # Sum quantities per category and day. pivot_table skips rows whose
        # day_of_week is missing, so reindex restores categories without
        # orders as all-zero rows and guarantees all seven day columns.
        pivot_df = (merged_df.pivot_table(index='item_category',
                                          columns='day_of_week',
                                          values='quantity',
                                          aggfunc='sum',
                                          fill_value=0)
                    .reindex(index=categories, columns=days, fill_value=0)
                    .astype(int)
                    .rename_axis('Category')
                    .reset_index())

        return pivot_df.to_dict('records')

The code first merges orders with item categories, using a right merge so that unordered items are kept, then converts order_date into the weekday name. The pivot table sums the quantities for each day per category. Because rows without an order date are skipped by the pivot, the result is reindexed against the full category list and the Monday-to-Sunday column list. That step fills the gaps with zero, orders the columns from Monday to Sunday, and includes categories without orders.

Go Solution

package main

import (
    "database/sql"
    _ "github.com/go-sql-driver/mysql"
)

func SalesByDayOfWeek(db *sql.DB) (*sql.Rows, error) {
    query := `
        SELECT
            i.item_category AS Category,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 2 THEN o.quantity ELSE 0 END) AS Monday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 3 THEN o.quantity ELSE 0 END) AS Tuesday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 4 THEN o.quantity ELSE 0 END) AS Wednesday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 5 THEN o.quantity ELSE 0 END) AS Thursday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 6 THEN o.quantity ELSE 0 END) AS Friday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 7 THEN o.quantity ELSE 0 END) AS Saturday,
            SUM(CASE WHEN DAYOFWEEK(o.order_date) = 1 THEN o.quantity ELSE 0 END) AS Sunday
        FROM Items i
        LEFT JOIN Orders o ON i.item_id = o.item_id
        GROUP BY i.item_category
        ORDER BY i.item_category
    `
    return db.Query(query)
}

In Go, the solution uses SQL directly via database/sql. The DAYOFWEEK function maps days starting with Sunday as 1, so the cases are adjusted accordingly. A left join ensures all categories are included even with no orders.
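To double-check the numbers used in the CASE expressions, MySQL's DAYOFWEEK numbering (1 = Sunday through 7 = Saturday) can be mirrored in a few lines of Python. The helper name mysql_dayofweek is hypothetical and exists only for this check.

```python
from datetime import date

def mysql_dayofweek(d: date) -> int:
    """Mirror MySQL's DAYOFWEEK numbering: 1 = Sunday, 2 = Monday, ..., 7 = Saturday."""
    # isoweekday() is 1 for Monday through 7 for Sunday, so shift by one
    # and wrap Sunday around to 1.
    return d.isoweekday() % 7 + 1

print(mysql_dayofweek(date(2020, 6, 1)))  # 2, a Monday
print(mysql_dayofweek(date(2020, 6, 7)))  # 1, a Sunday
```

This matches the query: Monday is compared against 2 and Sunday against 1.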

Worked Examples

Using the example in the problem:

  • Book has orders on Monday (10+10), Tuesday (5), and Friday (10). Other days are 0.
  • Phone has orders on Wednesday (5), Thursday (1), and Sunday (5+5=10).
  • Glasses has an order on Friday (5). Other days are 0.
  • T-Shirt has no orders, so all days are 0.

Category | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
Book     | 20     | 5       | 0         | 0        | 10     | 0        | 0
Glasses  | 0      | 0       | 0         | 0        | 5      | 0        | 0
Phone    | 0      | 0       | 5         | 1        | 0      | 0        | 10
T-Shirt  | 0      | 0       | 0         | 0        | 0      | 0        | 0

Complexity Analysis

Measure | Complexity | Explanation
Time    | O(O + C)   | O for processing all orders, C for categories in pivot; single pass aggregation
Space   | O(C * D)   | Pivot table stores one row per category and seven day columns per row

The Python solution uses a pivot table with pandas, so the final result takes memory proportional to categories times days. Before that, it also holds the merged DataFrame in memory, which has one row per order plus one per unordered item. The Go solution leverages the database for aggregation, so memory usage is minimal in the application layer.

Test Cases

# test cases
orders = [
    {'order_id':1, 'customer_id':1, 'order_date':'2020-06-01','item_id':'1','quantity':10},
    {'order_id':2, 'customer_id':1, 'order_date':'2020-06-08','item_id':'2','quantity':10},
    {'order_id':3, 'customer_id':2, 'order_date':'2020-06-02','item_id':'1','quantity':5},
    {'order_id':4, 'customer_id':3, 'order_date':'2020-06-03','item_id':'3','quantity':5},
    {'order_id':5, 'customer_id':4, 'order_date':'2020-06-04','item_id':'4','quantity':1},
    {'order_id':6, 'customer_id':4, '