LeetCode 1158 - Market Analysis I

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

This problem asks us to analyze purchasing activity on an online shopping platform. We are given three database tables: Users, Orders, and Items. Our task is to produce a result that contains every user, their join date, and the number of orders they placed as a buyer during the year 2019.

The key detail is that we are counting purchases made by users, not sales. The relevant column in the Orders table is therefore buyer_id, not seller_id.

The Users table contains one row per user. Each row stores the user's unique identifier, the date they joined the platform, and their favorite brand.

The Orders table contains one row per order. Each order includes the order date, the buyer, the seller, and the purchased item.

The Items table exists in the schema, but it is actually irrelevant for this specific problem because the output only depends on users and order counts. No information about item brands or item metadata is needed.

The expected output should contain:

  • buyer_id, which corresponds to the user's user_id

  • join_date from the Users table

  • orders_in_2019, which is the count of orders where:

      • the user appears as buyer_id

      • the order date falls within the year 2019

An important detail is that every user must appear in the result, even if they made zero purchases in 2019. This means we cannot use a simple inner join between Users and Orders, because users without qualifying orders would disappear from the result.
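The difference between the two join types can be seen on a tiny dataset. The following is a minimal sketch, assuming SQLite (via Python's standard sqlite3 module) as a stand-in for MySQL; the two users and the single order row are hypothetical, and the year filter is left out on purpose so that only the join type varies.

```python
import sqlite3

# Hypothetical data: user 1 bought once, user 2 never bought anything.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY, join_date TEXT, favorite_brand TEXT);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_date TEXT,
                         item_id INTEGER, buyer_id INTEGER, seller_id INTEGER);
    INSERT INTO Users VALUES (1, '2018-01-01', 'Lenovo'), (2, '2018-02-09', 'Samsung');
    INSERT INTO Orders VALUES (1, '2019-08-01', 4, 1, 2);
""")

# INNER JOIN: user 2 has no matching order row, so the user disappears.
inner_rows = conn.execute("""
    SELECT u.user_id, COUNT(o.order_id)
    FROM Users u
    JOIN Orders o ON o.buyer_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

# LEFT JOIN: user 2 is kept; COUNT(o.order_id) skips the NULL and yields 0.
left_rows = conn.execute("""
    SELECT u.user_id, COUNT(o.order_id)
    FROM Users u
    LEFT JOIN Orders o ON o.buyer_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

print(inner_rows)  # [(1, 1)]
print(left_rows)   # [(1, 1), (2, 0)]
```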

The constraints are not explicitly listed, but since this is a SQL database problem, we should assume tables may contain many rows. Efficient aggregation and joining are therefore important. We want a solution that scans the tables cleanly without unnecessary repeated work.

Several edge cases are important:

  • A user may have no orders at all
  • A user may have orders, but none in 2019
  • Multiple orders may exist for the same buyer in 2019
  • Orders from years other than 2019 must not be counted
  • Some users may appear only as sellers and never as buyers

A naive implementation can easily fail if it excludes users with zero qualifying orders or accidentally counts orders from the wrong year.

Approaches

Brute Force Approach

A brute force solution would process each user individually and scan the entire Orders table to count matching orders.

For every user in the Users table, we would:

  1. Initialize a counter to zero
  2. Iterate through every order
  3. Check whether:
      • the order's buyer_id matches the current user
      • the order date belongs to 2019
  4. Increment the counter if both conditions are true
  5. Output the user information and the final count

This approach is straightforward and guarantees correctness because every order is explicitly checked against every user.

However, the performance is poor. If there are U users and O orders, the total complexity becomes O(U × O). With large datasets, repeatedly scanning the entire orders table becomes inefficient.
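To make the nested scan concrete, here is a minimal sketch of the brute force idea in plain Python. The function name and the tuple layouts are assumptions chosen for illustration; they are not part of the problem statement.

```python
from datetime import date

def orders_in_2019_brute_force(users, orders):
    """users: list of (user_id, join_date); orders: list of (order_date, buyer_id)."""
    result = []
    for user_id, join_date in users:
        count = 0
        # Full scan of the orders for every single user: O(U x O) overall.
        for order_date, buyer_id in orders:
            if buyer_id == user_id and order_date.year == 2019:
                count += 1
        result.append((user_id, join_date, count))
    return result

# Hypothetical sample data.
users = [(1, date(2018, 1, 1)), (2, date(2018, 2, 9))]
orders = [
    (date(2019, 8, 1), 1),
    (date(2018, 8, 2), 1),
    (date(2019, 8, 3), 2),
    (date(2019, 8, 5), 2),
]
print(orders_in_2019_brute_force(users, orders))
```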

Optimal Approach

The better solution is to aggregate order counts once, then join the aggregated results with the users table.

The key observation is that we do not need to repeatedly scan orders for every user. Instead, we can first group all 2019 orders by buyer_id and compute the counts in a single pass.

After computing these counts, we perform a LEFT JOIN from Users to the aggregated result. A left join ensures that all users remain in the output, even if they have no matching orders.

If a user has no orders in 2019, the aggregated result contains no row for that user, so the joined count column is NULL. We replace it with 0 using IFNULL (available in MySQL) or the standard COALESCE.

This approach is much more efficient because the orders table is scanned only once for aggregation.

Approach      Time Complexity   Space Complexity   Notes
Brute Force   O(U × O)          O(1)               Scan all orders for every user
Optimal       O(U + O)          O(U)               Aggregate once, then join results

Algorithm Walkthrough

  1. Start with the Orders table and filter only rows where the order date belongs to 2019. This ensures that orders from other years are ignored immediately.
  2. Group the filtered orders by buyer_id. Grouping allows us to collect all orders belonging to the same buyer together.
  3. Count the number of orders in each group. The result becomes a temporary mapping from buyer to order count.
  4. Use the Users table as the main table in the query. This is important because every user must appear in the final output, even users with zero purchases.
  5. Perform a LEFT JOIN between Users and the aggregated order counts using:
      • Users.user_id = aggregated.buyer_id
  6. Replace any NULL order counts with 0 using IFNULL or COALESCE. A NULL appears when a user has no matching orders in 2019.
  7. Select the required columns:
      • user_id as buyer_id
      • join_date
      • order count as orders_in_2019
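The same walkthrough can be mirrored outside the database. The sketch below is an in-memory analogy in Python, not the SQL solution itself: a Counter plays the role of the filtered, grouped count, and a lookup with a default of 0 plays the role of the LEFT JOIN plus NULL replacement. The function name and tuple layouts are assumptions; the data is taken from Example 1.

```python
from collections import Counter
from datetime import date

def orders_in_2019(users, orders):
    """users: list of (user_id, join_date); orders: list of (order_date, buyer_id)."""
    # Filter to 2019, group by buyer, and count, all in a single pass.
    counts = Counter(
        buyer_id for order_date, buyer_id in orders if order_date.year == 2019
    )
    # Drive the output from users so nobody is dropped; a missing buyer maps to 0.
    return [(user_id, join_date, counts.get(user_id, 0)) for user_id, join_date in users]

users = [
    (1, date(2018, 1, 1)),
    (2, date(2018, 2, 9)),
    (3, date(2018, 1, 19)),
    (4, date(2018, 5, 21)),
]
orders = [
    (date(2019, 8, 1), 1),
    (date(2018, 8, 2), 1),
    (date(2019, 8, 3), 2),
    (date(2018, 8, 4), 4),
    (date(2018, 8, 4), 3),
    (date(2019, 8, 5), 2),
]
for row in orders_in_2019(users, orders):
    print(row)
```

Each order is touched once and each user is touched once, which is where the O(U + O) estimate comes from.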

Why it works

The aggregation step guarantees that each buyer receives the correct number of 2019 orders because every qualifying order contributes exactly once to its buyer's group.

The left join guarantees that all users remain in the final output, including users with zero matching orders. Replacing NULL with 0 correctly represents users who made no purchases in 2019.

Together, these properties ensure the query produces exactly the required result.

Python Solution

# Write your MySQL query statement below

SELECT
    u.user_id AS buyer_id,
    u.join_date,
    IFNULL(o.orders_in_2019, 0) AS orders_in_2019
FROM Users u
LEFT JOIN (
    SELECT
        buyer_id,
        COUNT(*) AS orders_in_2019
    FROM Orders
    WHERE YEAR(order_date) = 2019
    GROUP BY buyer_id
) o
ON u.user_id = o.buyer_id;

This solution begins by constructing a subquery that computes the number of 2019 orders for each buyer.

Inside the subquery:

  • YEAR(order_date) = 2019 filters relevant orders
  • GROUP BY buyer_id groups orders per buyer
  • COUNT(*) calculates how many qualifying orders each buyer made

The outer query then uses the Users table as the primary table and performs a LEFT JOIN with the aggregated results.

Using a left join is essential because users without orders in 2019 still need to appear in the final output.

Finally, IFNULL(o.orders_in_2019, 0) converts missing counts into zeroes.
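One possible variation, offered as a sketch rather than as the reference solution: wrapping order_date in YEAR() typically prevents MySQL from using an ordinary index on that column, so a half-open date range is a common alternative, and COALESCE is the standard SQL counterpart of IFNULL. The snippet runs the variant against SQLite through Python's sqlite3 module, which is an assumption made only so the example can execute; the rows are hypothetical, and ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

QUERY = """
SELECT
    u.user_id AS buyer_id,
    u.join_date,
    COALESCE(o.orders_in_2019, 0) AS orders_in_2019
FROM Users u
LEFT JOIN (
    SELECT buyer_id, COUNT(*) AS orders_in_2019
    FROM Orders
    -- Half-open range: includes 2019-01-01, excludes 2020-01-01.
    WHERE order_date >= '2019-01-01' AND order_date < '2020-01-01'
    GROUP BY buyer_id
) o ON u.user_id = o.buyer_id
ORDER BY u.user_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER, join_date TEXT);
    CREATE TABLE Orders (order_date TEXT, buyer_id INTEGER);
    INSERT INTO Users VALUES (1, '2018-01-01'), (2, '2018-02-09');
    -- Hypothetical orders for user 1 on both sides of each year boundary.
    INSERT INTO Orders VALUES
        ('2018-12-31', 1), ('2019-01-01', 1), ('2019-12-31', 1), ('2020-01-01', 1);
""")

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, '2018-01-01', 2), (2, '2018-02-09', 0)]
```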

Go Solution

-- Write your MySQL query statement below

SELECT
    u.user_id AS buyer_id,
    u.join_date,
    IFNULL(o.orders_in_2019, 0) AS orders_in_2019
FROM Users u
LEFT JOIN (
    SELECT
        buyer_id,
        COUNT(*) AS orders_in_2019
    FROM Orders
    WHERE YEAR(order_date) = 2019
    GROUP BY buyer_id
) o
ON u.user_id = o.buyer_id;

Since this is a database problem, the solution shown under both the Python and Go headings is the same MySQL query. There are no language-specific implementation differences because the execution happens entirely inside the database engine.

Worked Examples

Example 1

Input Tables

Users

user_id join_date favorite_brand
1 2018-01-01 Lenovo
2 2018-02-09 Samsung
3 2018-01-19 LG
4 2018-05-21 HP

Orders

order_id order_date buyer_id
1 2019-08-01 1
2 2018-08-02 1
3 2019-08-03 2
4 2018-08-04 4
5 2018-08-04 3
6 2019-08-05 2

Step 1: Filter Orders from 2019

order_id buyer_id
1 1
3 2
6 2

Orders from 2018 are removed.

Step 2: Group by Buyer

buyer_id count
1 1
2 2

Buyer 1 made one order in 2019.

Buyer 2 made two orders in 2019.

Buyers 3 and 4 do not appear because they made no qualifying purchases.

Step 3: Left Join with Users

user_id join_date orders_in_2019
1 2018-01-01 1
2 2018-02-09 2
3 2018-01-19 NULL
4 2018-05-21 NULL

Step 4: Replace NULL with 0

buyer_id join_date orders_in_2019
1 2018-01-01 1
2 2018-02-09 2
3 2018-01-19 0
4 2018-05-21 0

This matches the expected output.

Complexity Analysis

Measure   Complexity   Explanation
Time      O(U + O)     Scan users once and orders once
Space     O(U)         Aggregation stores counts per buyer

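Assuming the engine uses hash-based aggregation and a hash join, the aggregation step processes every order exactly once and the join step processes each user exactly once, so the total runtime is linear in the number of users and orders. The actual execution plan depends on the database engine and the available indexes.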

The additional space comes from the grouped aggregation result, which may contain up to one entry per user.

Test Cases

# Example 1 from the problem statement
# Users 1 and 2 have 2019 orders, users 3 and 4 do not
assert True

# Single user with no orders
# Ensures LEFT JOIN preserves users with zero purchases
assert True

# User with only non-2019 orders
# Ensures date filtering works correctly
assert True

# Multiple orders by the same buyer in 2019
# Ensures aggregation counts correctly
assert True

# Multiple users with mixed order years
# Ensures only 2019 orders are counted
assert True

# User appears only as seller
# Should still produce zero buyer orders
assert True

# Empty Orders table
# All users should have zero orders_in_2019
assert True

# Orders exactly on 2019-01-01 and 2019-12-31
# Boundary dates should still count
assert True

Test                   Why
Example input          Verifies correctness against official sample
User with no orders    Ensures LEFT JOIN behavior
Only non-2019 orders   Validates year filtering
Multiple 2019 orders   Validates aggregation
Mixed years            Prevents accidental overcounting
Seller-only user       Confirms only buyer activity matters
Empty Orders table     Ensures all users still appear
Boundary 2019 dates    Confirms inclusive year handling
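
The assert True placeholders above do not execute the query. A minimal sketch of how several of these cases could actually be checked follows, assuming SQLite (via Python's sqlite3 module) as a stand-in for MySQL. Because SQLite has no YEAR() function, strftime('%Y', ...) is substituted for it; the helper names run_query and counts, the trimmed-down schema, and all rows are hypothetical.

```python
import sqlite3

# SQLite stand-in for the MySQL query: strftime('%Y', ...) replaces YEAR(...).
QUERY = """
SELECT u.user_id AS buyer_id, u.join_date,
       IFNULL(o.orders_in_2019, 0) AS orders_in_2019
FROM Users u
LEFT JOIN (
    SELECT buyer_id, COUNT(*) AS orders_in_2019
    FROM Orders
    WHERE strftime('%Y', order_date) = '2019'
    GROUP BY buyer_id
) o ON u.user_id = o.buyer_id
ORDER BY u.user_id
"""

def run_query(users, orders):
    """users: (user_id, join_date); orders: (order_date, buyer_id, seller_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (user_id INTEGER, join_date TEXT)")
    conn.execute(
        "CREATE TABLE Orders (order_date TEXT, buyer_id INTEGER, seller_id INTEGER)"
    )
    conn.executemany("INSERT INTO Users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

def counts(users, orders):
    """Map each buyer_id in the result to its orders_in_2019 value."""
    return {buyer_id: n for buyer_id, _, n in run_query(users, orders)}

USERS = [(1, "2018-01-01"), (2, "2018-02-09")]

# Empty Orders table: every user still appears with a zero count.
assert counts(USERS, []) == {1: 0, 2: 0}

# User with only non-2019 orders.
assert counts(USERS, [("2018-08-02", 1, 2)]) == {1: 0, 2: 0}

# Multiple 2019 orders by one buyer; user 2 appears only as a seller.
assert counts(USERS, [("2019-08-01", 1, 2), ("2019-08-03", 1, 2)]) == {1: 2, 2: 0}

# Boundary dates: the first and last day of 2019 count, their neighbours do not.
boundary = [
    ("2018-12-31", 1, 2),
    ("2019-01-01", 1, 2),
    ("2019-12-31", 1, 2),
    ("2020-01-01", 1, 2),
]
assert counts(USERS, boundary) == {1: 2, 2: 0}
print("all checks passed")
```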

Edge Cases

One important edge case occurs when a user has no orders at all. A naive inner join would completely remove such users from the result set. The implementation avoids this issue by using a LEFT JOIN from Users to the aggregated orders table. This guarantees that every user appears exactly once.

Another important edge case is when a user has orders, but none of them were made in 2019. Without proper filtering, these orders could accidentally be counted. The implementation handles this correctly by filtering orders before aggregation using YEAR(order_date) = 2019.

A third edge case occurs when multiple orders belong to the same buyer during 2019. Incorrect grouping logic could produce duplicate rows instead of a single aggregated count. The implementation avoids this by grouping on buyer_id and applying COUNT(*), ensuring exactly one result row per buyer.

A final edge case involves users who only appear as sellers. Since the problem specifically asks for orders made as a buyer, seller activity must not affect the count. The implementation correctly counts only rows grouped by buyer_id.