LeetCode 1511 - Customer Order Frequency

This problem asks us to identify customers who consistently spend at least $100 in each of two consecutive months, June and July of 2020.

LeetCode Problem 1511

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem asks us to identify customers who consistently spend at least $100 in each of two consecutive months, June and July of 2020. We have three tables: Customers, which lists customer details; Product, which lists products and their prices; and Orders, which logs all purchases by customers with the quantity and date.

The input represents transactional data, and the output should be a list of customer_id and name for all customers meeting the criterion. Importantly, the problem only concerns orders in June and July 2020. A customer must reach a cumulative spending of at least $100 in both months individually. Customers who spend enough in one month but not the other should not be included.

Edge cases include customers who did not place any orders in one of the months, customers who spent exactly $100, and customers who had multiple orders across the months that sum to just above or below $100. The input is assumed to be moderate in size since it involves SQL database tables, and no explicit constraint on the number of orders or products is given.

Approaches

The brute-force approach would involve calculating the total spending per customer for each month separately, then checking if both totals meet the $100 threshold. This could be done using multiple subqueries or by joining the Orders and Product tables and aggregating per month. While correct, this approach can be verbose and less efficient, as it requires repeated scanning of orders.

The optimal approach leverages SQL aggregation and conditional filtering. By joining Orders with Product to compute the spending per order (quantity * price) and grouping by customer_id and month, we can directly filter customers who meet the threshold for each month. Finally, counting qualifying months ensures that only customers who meet the requirement in both months are returned. This approach reduces unnecessary scans and avoids multiple subqueries.

Approach Time Complexity Space Complexity Notes
Brute Force O(n) O(n) Aggregate orders per month separately, then join results
Optimal O(n) O(n) Single aggregation with conditional filtering and HAVING clause

Algorithm Walkthrough

  1. Join the Orders table with the Product table on product_id to calculate spending per order as quantity * price. This allows us to compute total expenditure for each order directly.
  2. Filter orders to include only those with order_date in June or July 2020. We are only concerned with these two months.
  3. Group the results by customer_id and the month of the order. For each group, calculate the sum of spending.
  4. Use a HAVING clause to keep only those month groups where the total spending is at least $100. This ensures we only consider qualifying months for each customer.
  5. For each customer, count the number of qualifying months. Only keep customers where the count equals 2, meaning the customer met the $100 threshold in both June and July.
  6. Join the resulting customer IDs with the Customers table to retrieve the customer name.
  7. Return the final result set with customer_id and name.

Why it works: By grouping orders per customer per month and filtering on total spending, the algorithm guarantees that only customers who meet the monthly spending threshold in both months are selected. Counting qualifying months ensures that no month is missed.

Python Solution

# This is a SQL problem; the Python solution is represented as an executable SQL query.

sql = """
SELECT c.customer_id, c.name
FROM Customers c
JOIN (
    SELECT customer_id
    FROM Orders o
    JOIN Product p ON o.product_id = p.product_id
    WHERE o.order_date BETWEEN '2020-06-01' AND '2020-07-31'
    GROUP BY customer_id, MONTH(o.order_date)
    HAVING SUM(o.quantity * p.price) >= 100
) AS qualified_orders
ON c.customer_id = qualified_orders.customer_id
GROUP BY c.customer_id, c.name
HAVING COUNT(DISTINCT MONTH(qualified_orders.order_date)) = 2;
"""

Explanation: The query first joins Orders and Product to compute spending per order. Filtering to June and July 2020 ensures irrelevant orders are ignored. Aggregating by customer_id and month allows summing each customer's spending per month. The HAVING SUM(...) >= 100 keeps only those months with sufficient spending. Finally, counting distinct months and keeping only customers with both months meeting the threshold gives the desired result.

Go Solution

// As this is a SQL problem, the Go solution executes SQL queries using a database connection
package main

import (
    "database/sql"
    _ "github.com/go-sql-driver/mysql"
    "fmt"
)

func CustomerOrderFrequency(db *sql.DB) {
    query := `
    SELECT c.customer_id, c.name
    FROM Customers c
    JOIN (
        SELECT customer_id, MONTH(order_date) AS order_month
        FROM Orders o
        JOIN Product p ON o.product_id = p.product_id
        WHERE o.order_date BETWEEN '2020-06-01' AND '2020-07-31'
        GROUP BY customer_id, MONTH(o.order_date)
        HAVING SUM(o.quantity * p.price) >= 100
    ) AS qualified_orders
    ON c.customer_id = qualified_orders.customer_id
    GROUP BY c.customer_id, c.name
    HAVING COUNT(DISTINCT qualified_orders.order_month) = 2;
    `
    rows, err := db.Query(query)
    if err != nil {
        panic(err)
    }
    defer rows.Close()

    for rows.Next() {
        var id int
        var name string
        rows.Scan(&id, &name)
        fmt.Println(id, name)
    }
}

Go-specific notes: The query execution uses the standard database/sql package. rows.Scan retrieves customer_id and name for each qualified customer. No additional Go data structures are needed because the query handles all aggregation and filtering.

Worked Examples

For the example provided:

  • Customer 1 (Winston) has orders in June and July. Spending is $300 in June and $100 in July, both above the $100 threshold. He qualifies.
  • Customer 2 (Jonathan) spends $600 in June and $20 in July. July is below $100, so he does not qualify.
  • Customer 3 (Moustafa) spends $110 in June and $0 in July. July is below $100, so he does not qualify.

State of relevant variables during grouping:

customer_id month total_spending
1 6 300
1 7 100
2 6 600
2 7 20
3 6 110
3 7 0

Counting qualifying months (total_spending >= 100):

customer_id qualifying_months
1 2
2 1
3 1

Only Winston (customer_id 1) has 2 qualifying months.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each order is processed once during the join and aggregation
Space O(n) Storing intermediate aggregates per customer per month

The time complexity is linear with respect to the number of orders because each order is considered once. Space complexity is proportional to the number of unique customers multiplied by months, which is small relative to the total number of orders.

Test Cases

# Test case: example provided
# Returns Winston only
assert sql_query_result() == [(1, "Winston")]

# Test case: customer with exactly $100 in both months
# Should be included
assert sql_query_result(customers=[(4, "Alice", "UK")], 
                        products=[(50, "Gadget", 50)],
                        orders=[(10, 4, 50, '2020-06-15', 2),
                                (11, 4, 50, '2020-07-10', 2)]) == [(4, "Alice")]

# Test case: customer spending only in June
# Should not be included
assert sql_query_result(customers=[(5, "Bob", "FR")],
                        products=[(60, "Widget", 100)],
                        orders=[(12, 5, 60, '2020-06-20', 1)]) == []

# Test case: customer spending only in July
# Should not be included
assert sql_query_result(customers=[(6, "Charlie", "DE")],
                        products=[(70, "Thing", 200)],
                        orders=[(13, 6, 70, '2020-07-05', 1)]) == []

# Test case: customer spending below $100 in one month
# Should not be included
assert sql_query_result(customers=[(7, "Diana", "US")],
                        products=[(80, "Item", 60)],
                        orders=[(14, 7, 80, '2020-06-15', 1),
                                (15, 7, 80, '2020-07-10', 1)]) == []
Test Why
Example Validates correct computation with multiple customers and orders
Exactly $100 Edge case for threshold