LeetCode 1251 - Average Selling Price

This problem asks us to compute the average selling price for every product based on two tables: Prices and UnitsSold. T

LeetCode Problem 1251

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem asks us to compute the average selling price for every product based on two tables: Prices and UnitsSold.

The Prices table tells us how much a product costs during a specific date range. Each row contains a product_id, a start_date, an end_date, and the price for that period. A product may have multiple pricing periods, but the problem guarantees that these periods do not overlap for the same product. This guarantee is important because it means any purchase date can match at most one price interval for a product.

The UnitsSold table records sales transactions. Each row tells us that a specific number of units of a product were sold on a particular purchase_date. Multiple rows for the same product and date are allowed.

The task is to calculate the average selling price for each product. The formula is:

$$\text{average price} = \frac{\text{total revenue}}{\text{total units sold}}$$

To calculate total revenue, we must determine which price applies to each sale by matching the purchase_date against the correct pricing interval in Prices. Once we know the correct price, we multiply:

$$\text{units} \times \text{price}$$

and sum these values across all sales of the product.

The problem also introduces an important edge case: products with no sales. If a product exists in Prices but has no corresponding sold units, its average selling price should be 0.

Finally, the answer must be rounded to two decimal places, and results can be returned in any order.

The constraints are relatively small because this is an Easy database problem, but efficiency still matters. Since each sale belongs to exactly one price period, we can efficiently combine the tables with a join operation instead of repeatedly searching through all price ranges.

Important edge cases include:

  • A product exists in Prices but has no matching row in UnitsSold. The result should be 0.00.
  • A product has multiple pricing periods, and sales occur across different ranges. We must correctly match each purchase date to its valid price interval.
  • Multiple sales happen on the same day or duplicate rows exist in UnitsSold. These should all contribute independently to the total.
  • Boundary dates matter because purchase_date can equal start_date or end_date. The comparison must be inclusive.

Approaches

Brute Force Approach

A straightforward solution would iterate through every product sale and, for each sale, scan every pricing interval of that product until the correct date range is found.

For each row in UnitsSold, we would:

  1. Find all price records for that product.
  2. Check each interval to determine whether purchase_date falls between start_date and end_date.
  3. Multiply units × price.
  4. Accumulate total revenue and total units for each product.

After processing all sales, we divide total revenue by total units and round to two decimal places.

This method is correct because every sale will eventually find the matching pricing interval, and the problem guarantees intervals do not overlap. However, it becomes inefficient because every sale may require scanning many pricing periods.

Optimal Approach

The key insight is that this is fundamentally a join problem.

Each sale already contains a product_id and purchase_date, while each pricing record defines a valid date interval for a product. Since pricing periods do not overlap, we can directly join the two tables on:

  • Matching product_id
  • purchase_date BETWEEN start_date AND end_date

Once joined, each sale receives its correct price automatically.

After joining:

  • Compute total revenue using units * price
  • Compute total sold units using SUM(units)
  • Divide revenue by units
  • Use ROUND(..., 2) for formatting

To ensure products with no sales still appear, we use a LEFT JOIN from Prices to UnitsSold.

Approach Time Complexity Space Complexity Notes
Brute Force O(P × U) O(P) For each sale, scan price intervals
Optimal O(P + U) (logical SQL execution) O(P) Join matching dates and aggregate efficiently

Here:

  • P = number of rows in Prices
  • U = number of rows in UnitsSold

In practice, database indexing and query optimization make the join solution efficient.

Algorithm Walkthrough

  1. Start from the Prices table because every product must appear in the output, even if it has no sales. This ensures products without purchases are not accidentally excluded.
  2. Perform a LEFT JOIN with UnitsSold using two conditions:
  • product_id must match.
  • purchase_date must fall inside the price interval:

purchase_date BETWEEN start_date AND end_date.

This guarantees every sale is paired with the correct price. 3. Group rows by product_id. Since a product may have multiple pricing intervals and multiple sales, grouping lets us aggregate all matching records together. 4. Compute total revenue using:

SUM(price * units)

This gives the total money earned for the product. 5. Compute total units sold using:

SUM(units)
  1. Handle products with no sales. Since LEFT JOIN may produce NULL values, division would fail or return NULL. We use:
IFNULL(...)

or equivalent logic to return 0. 7. Divide total revenue by total units and round the result to two decimal places.

Why it works

The algorithm works because every sale belongs to exactly one pricing interval for its product. The problem guarantees there are no overlapping date ranges, so the join produces exactly one valid price per sale. Summing units × price gives total revenue, summing units gives total quantity sold, and dividing them produces the weighted average selling price.

Python Solution

# Write your MySQL query statement below

SELECT
    p.product_id,
    ROUND(
        IFNULL(SUM(p.price * u.units) / SUM(u.units), 0),
        2
    ) AS average_price
FROM Prices p
LEFT JOIN UnitsSold u
    ON p.product_id = u.product_id
    AND u.purchase_date BETWEEN p.start_date AND p.end_date
GROUP BY p.product_id;

This implementation begins with the Prices table so that every product appears in the final result. The LEFT JOIN connects sales to their valid pricing intervals using both product_id and the date range condition.

The aggregation step calculates two quantities for each product:

  • SUM(p.price * u.units) computes total revenue.
  • SUM(u.units) computes total sold units.

Since products without sales produce NULL values during aggregation, IFNULL(..., 0) converts the result to zero. Finally, ROUND(..., 2) ensures the answer matches the required precision.

Go Solution

// Write your MySQL query statement below

SELECT
    p.product_id,
    ROUND(
        IFNULL(SUM(p.price * u.units) / SUM(u.units), 0),
        2
    ) AS average_price
FROM Prices p
LEFT JOIN UnitsSold u
    ON p.product_id = u.product_id
    AND u.purchase_date BETWEEN p.start_date AND p.end_date
GROUP BY p.product_id;

Unlike algorithmic LeetCode problems, database problems use SQL directly regardless of the selected language. Therefore, the Go version is identical to the Python version because LeetCode expects a SQL query submission rather than executable Go code.

Worked Examples

Example 1

Input

Prices

product_id start_date end_date price
1 2019-02-17 2019-02-28 5
1 2019-03-01 2019-03-22 20
2 2019-02-01 2019-02-20 15
2 2019-02-21 2019-03-31 30

UnitsSold

product_id purchase_date units
1 2019-02-25 100
1 2019-03-01 15
2 2019-02-10 200
2 2019-03-22 30

Step 1: Perform Join

product_id purchase_date units matched price
1 2019-02-25 100 5
1 2019-03-01 15 20
2 2019-02-10 200 15
2 2019-03-22 30 30

Step 2: Calculate Revenue

product_id calculation revenue
1 100 × 5 500
1 15 × 20 300
2 200 × 15 3000
2 30 × 30 900

Step 3: Aggregate Per Product

product_id total revenue total units average
1 800 115 6.96
2 3900 230 16.96

Final output:

product_id average_price
1 6.96
2 16.96

Complexity Analysis

Measure Complexity Explanation
Time O(P + U) Database join and aggregation process all rows
Space O(P) Aggregation storage per product

The database engine performs a join between pricing periods and sales records, followed by grouping by product_id. Since each sale matches at most one pricing interval, the amount of intermediate work stays manageable. The grouping phase stores aggregate values for each product.

Test Cases

# Example 1: multiple price ranges
# Product 1:
# ((100 * 5) + (15 * 20)) / 115 = 6.96
# Product 2:
# ((200 * 15) + (30 * 30)) / 230 = 16.96

# Product with no sales
# Prices:
# product_id=1, price=10
# UnitsSold empty
# Expected average = 0.00

# Single product, single sale
# price=25, units=4
# average = 25.00

# Multiple duplicate sales
# Same purchase date repeated
# Ensure duplicates are counted independently

# Boundary date matching
# purchase_date equals start_date or end_date
# Must still match because BETWEEN is inclusive

# Multiple pricing intervals
# Ensure sales match correct price period
Test Why
Example 1 Validates standard weighted average calculation
Product with no sales Ensures result becomes 0.00
Single sale Tests simplest valid case
Duplicate rows Confirms repeated transactions are counted
Boundary dates Verifies inclusive interval matching
Multiple price intervals Ensures correct date-range join

Edge Cases

Product With No Sales

A common mistake is accidentally excluding products that never appear in UnitsSold. If we use an INNER JOIN, those products disappear from the result entirely. Using a LEFT JOIN ensures every product remains in the output, and IFNULL(..., 0) correctly converts missing averages into 0.00.

Sales on Boundary Dates

A purchase may happen exactly on start_date or end_date. Using strict inequalities (> or <) would incorrectly miss these sales. The implementation uses BETWEEN, which is inclusive and correctly handles boundary dates.

Multiple Price Intervals for the Same Product

A product may have different prices during different periods. A naive implementation might apply the wrong price if it ignores dates. The join condition explicitly checks whether the purchase date falls inside the valid range, ensuring every sale gets the correct price.

Duplicate Rows in UnitsSold

The problem explicitly states that duplicate rows may exist. A buggy solution might accidentally deduplicate them. Since aggregation uses SUM(units) and SUM(price * units) directly over joined rows, duplicates naturally contribute multiple times, which is the correct behavior.