LeetCode 2985 - Calculate Compressed Mean
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
This problem asks us to compute the average number of items per order from a compressed representation of order data.
Instead of storing every individual order as a separate row, the table groups together orders that contain the same number of items. For each row:
- item_count represents how many items are in a particular type of order.
- order_occurrences represents how many times that order occurred.
- order_id is simply a unique identifier for the row.
For example, if a row contains:
| item_count | order_occurrences |
|---|---|
| 3 | 800 |
it means there are 800 orders that each contain exactly 3 items.
To calculate the overall average number of items per order, we cannot simply average the item_count values because each value occurs a different number of times. Instead, we must compute a weighted average:
$$\text{Average} = \frac{\sum(\text{item_count} \times \text{order_occurrences})} {\sum(\text{order_occurrences})}$$
The numerator represents the total number of items across all orders, while the denominator represents the total number of orders.
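The formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the SQL answer; the function name `compressed_mean` and the `(item_count, order_occurrences)` tuple layout are assumptions made for the example.

```python
def compressed_mean(rows):
    """Weighted mean of item_count, weighted by order_occurrences.

    `rows` is an iterable of (item_count, order_occurrences) pairs;
    this layout is an assumption made for illustration.
    """
    rows = list(rows)
    # Numerator: total number of items across all orders.
    total_items = sum(items * occurrences for items, occurrences in rows)
    # Denominator: total number of orders.
    total_orders = sum(occurrences for _, occurrences in rows)
    return total_items / total_orders


# The single-row table above: 800 orders of 3 items each.
print(compressed_mean([(3, 800)]))  # 3.0
```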
The final answer must be rounded to exactly two decimal places and returned as a single-row result table with the column name:
average_items_per_order
Since this is a database problem, the goal is to write an SQL query rather than implement an algorithm in a programming language.
The input guarantees that each row is unique by order_id. The table already contains all information necessary to compute the weighted average directly.
Important edge cases include situations where:
- Only one row exists in the table.
- All orders have the same item_count.
- Very large occurrence counts appear, requiring aggregation rather than expansion into individual orders.
- Different rows have drastically different frequencies, making a simple arithmetic average incorrect.
Approaches
Brute Force Approach
A conceptual brute-force solution would expand the compressed data into individual orders.
For example:
| item_count | order_occurrences |
|---|---|
| 2 | 3 |
would become:
2, 2, 2
After expanding every row, we could sum all values and divide by the total number of expanded orders.
This approach is correct because it reconstructs the original dataset exactly. However, it is extremely inefficient. If a row has millions of occurrences, we would need to generate millions of records unnecessarily.
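A rough Python sketch of the expansion idea, shown only to make the cost visible. The function name `brute_force_mean` and the `(item_count, order_occurrences)` tuple layout are assumptions for illustration.

```python
from itertools import chain, repeat


def brute_force_mean(rows):
    """Expand every (item_count, order_occurrences) pair into individual
    orders, then take the plain arithmetic mean of the expanded list."""
    expanded = list(chain.from_iterable(
        repeat(items, occurrences) for items, occurrences in rows
    ))
    # The list holds one entry per order, so memory grows with the
    # total number of orders rather than the number of table rows.
    return sum(expanded) / len(expanded)


# The row (2, 3) expands to [2, 2, 2].
print(brute_force_mean([(2, 3)]))  # 2.0
```

The expanded list has as many elements as there are orders, which is exactly the memory cost the aggregation approach avoids.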
Optimal Approach
The key observation is that we never need to reconstruct individual orders.
Each row already tells us both:
- How many items each order contains.
- How many such orders exist.
Therefore, we can directly compute:
- Total items = SUM(item_count * order_occurrences)
- Total orders = SUM(order_occurrences)
The desired average is simply:
total_items / total_orders
Since the problem requires two decimal places, we apply the SQL ROUND(..., 2) function.
Comparison of Approaches
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(Total Orders) | O(Total Orders) | Expands every occurrence into separate records |
| Optimal | O(n) | O(1) | Uses weighted-average aggregation directly |
Here, n is the number of rows in the Orders table.
Algorithm Walkthrough
Step 1
Compute the total number of items across all orders.
For each row, multiply:
item_count * order_occurrences
Then sum these products across the entire table.
This gives the total number of items represented by the compressed data.
Step 2
Compute the total number of orders.
Add together all values of:
order_occurrences
This gives the total count of orders.
Step 3
Divide the total items by the total orders.
This produces the weighted average number of items per order.
Step 4
Round the result to two decimal places using SQL's ROUND function.
Step 5
Return the result using the required column name:
average_items_per_order
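The five steps can be mirrored in a single pass over the rows. The sketch below is an assumption-laden illustration rather than the submitted answer: the function name, the tuple layout, and the choice of `ROUND_HALF_UP` (to imitate how a decimal `ROUND` typically treats positive values) are all choices made for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def average_items_per_order(rows):
    """Follow the walkthrough steps on (item_count, order_occurrences) pairs."""
    total_items = 0    # Step 1: running sum of item_count * order_occurrences
    total_orders = 0   # Step 2: running sum of order_occurrences
    for item_count, order_occurrences in rows:
        total_items += item_count * order_occurrences
        total_orders += order_occurrences

    # Step 3: divide total items by total orders.
    average = Decimal(total_items) / Decimal(total_orders)

    # Step 4: round to two decimal places (half-up is an assumption).
    rounded = average.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

    # Step 5: return the value under the required column name.
    return {"average_items_per_order": rounded}


result = average_items_per_order([(1, 500), (2, 1000), (3, 800), (4, 1000)])
print(result["average_items_per_order"])  # 2.70
```

Only two running totals are kept, which is where the O(1) space figure comes from.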
Why it works
The compressed table is simply a frequency representation of the original orders. Multiplying item_count by order_occurrences reconstructs the contribution of that group to the total item count. Summing these contributions gives the exact total number of items. Dividing by the total number of orders therefore computes the same average that would be obtained from the fully expanded dataset, without ever expanding it.
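The equivalence can be checked exactly with rational arithmetic. This is a small sanity check on a hypothetical, scaled-down dataset (the helper names are assumptions), not a proof.

```python
from fractions import Fraction


def mean_compressed(rows):
    """Weighted mean computed directly from (item_count, occurrences) pairs."""
    return Fraction(sum(i * n for i, n in rows), sum(n for _, n in rows))


def mean_expanded(rows):
    """Plain mean over the fully expanded list of orders."""
    orders = [i for i, n in rows for _ in range(n)]
    return Fraction(sum(orders), len(orders))


rows = [(1, 5), (2, 10), (3, 8), (4, 10)]  # hypothetical small dataset
assert mean_compressed(rows) == mean_expanded(rows)
print(mean_compressed(rows))  # 89/33
```

Using `Fraction` avoids floating-point noise, so the two results compare equal exactly rather than approximately.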
SQL Solution
SELECT
    ROUND(
        SUM(item_count * order_occurrences) /
        SUM(order_occurrences),
        2
    ) AS average_items_per_order
FROM Orders;
The query first computes the weighted sum of items using:
SUM(item_count * order_occurrences)
It then computes the total number of orders using:
SUM(order_occurrences)
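Dividing these two values yields the weighted average. In MySQL, the / operator returns a decimal result even when both operands are integers. In dialects where dividing two integers truncates (for example PostgreSQL, SQL Server, and SQLite), multiply the numerator by 1.0 before dividing. Finally, ROUND(..., 2) rounds the result to two decimal places, matching the problem requirements.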
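One way to try the query locally is Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for the judge's database, and the three columns are assumed to be integers. Because SQLite truncates integer division, the sketch multiplies the numerator by 1.0; the unmodified division is run as well to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders ("
    "order_id INTEGER PRIMARY KEY, "
    "item_count INTEGER, "
    "order_occurrences INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(10, 1, 500), (11, 2, 1000), (12, 3, 800), (13, 4, 1000)],
)

# Integer / integer truncates in SQLite: 8900 / 3300 becomes 2.
(truncated,) = conn.execute(
    "SELECT SUM(item_count * order_occurrences) / SUM(order_occurrences) "
    "FROM Orders"
).fetchone()

# Multiplying by 1.0 forces floating-point division before rounding.
(average,) = conn.execute(
    """
    SELECT ROUND(
        SUM(item_count * order_occurrences) * 1.0 / SUM(order_occurrences),
        2
    ) AS average_items_per_order
    FROM Orders
    """
).fetchone()
conn.close()

print(truncated)  # 2
print(average)    # 2.7
```

SQLite returns a floating-point value, so the trailing zero of 2.70 is not displayed; a decimal-typed result in another engine would show it.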
Worked Example
Consider the input:
| order_id | item_count | order_occurrences |
|---|---|---|
| 10 | 1 | 500 |
| 11 | 2 | 1000 |
| 12 | 3 | 800 |
| 13 | 4 | 1000 |
Calculate Total Items
| item_count | order_occurrences | Contribution |
|---|---|---|
| 1 | 500 | 500 |
| 2 | 1000 | 2000 |
| 3 | 800 | 2400 |
| 4 | 1000 | 4000 |
Total items:
| Calculation | Value |
|---|---|
| 500 + 2000 + 2400 + 4000 | 8900 |
Calculate Total Orders
| order_occurrences |
|---|
| 500 |
| 1000 |
| 800 |
| 1000 |
Total orders:
| Calculation | Value |
|---|---|
| 500 + 1000 + 800 + 1000 | 3300 |
Calculate Average
| Formula | Result |
|---|---|
| 8900 / 3300 | 2.696969... |
Rounded to two decimal places:
| average_items_per_order |
|---|
| 2.70 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each row is processed once by the aggregate functions |
| Space | O(1) | Only aggregate values are maintained |
The database engine performs a single scan of the table to compute both sums. No auxiliary data structures or expanded records are required, making the solution both time and space efficient.
Test Cases
The following sample datasets help validate correctness. Each dataset is meant to be loaded into an empty Orders table, so clear the table between cases.
-- Example 1
INSERT INTO Orders VALUES
(10, 1, 500),
(11, 2, 1000),
(12, 3, 800),
(13, 4, 1000);
-- Expected: 2.70
-- Single row
INSERT INTO Orders VALUES
(1, 5, 100);
-- Expected: 5.00
-- All orders identical
INSERT INTO Orders VALUES
(1, 3, 50),
(2, 3, 75),
(3, 3, 100);
-- Expected: 3.00
-- Heavily weighted larger value
INSERT INTO Orders VALUES
(1, 1, 1),
(2, 10, 999);
-- Expected: 9.99
-- Equal frequencies
INSERT INTO Orders VALUES
(1, 1, 10),
(2, 2, 10),
(3, 3, 10);
-- Expected: 2.00
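The datasets above can be checked automatically. The harness below is a sketch that assumes SQLite as a stand-in engine and integer column types; each case gets a fresh in-memory table, and the numerator is multiplied by 1.0 because SQLite truncates integer division. The names `DATASETS`, `QUERY`, and `run_case` are hypothetical.

```python
import sqlite3

# (rows, expected average) per case; rows are
# (order_id, item_count, order_occurrences).
DATASETS = {
    "Example 1": ([(10, 1, 500), (11, 2, 1000), (12, 3, 800), (13, 4, 1000)], 2.70),
    "Single row": ([(1, 5, 100)], 5.00),
    "All orders identical": ([(1, 3, 50), (2, 3, 75), (3, 3, 100)], 3.00),
    "Heavily weighted larger value": ([(1, 1, 1), (2, 10, 999)], 9.99),
    "Equal frequencies": ([(1, 1, 10), (2, 2, 10), (3, 3, 10)], 2.00),
}

QUERY = """
SELECT ROUND(
    SUM(item_count * order_occurrences) * 1.0 / SUM(order_occurrences),
    2
) AS average_items_per_order
FROM Orders
"""


def run_case(rows):
    """Load one dataset into a fresh in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders ("
        "order_id INTEGER PRIMARY KEY, "
        "item_count INTEGER, "
        "order_occurrences INTEGER)"
    )
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", rows)
    (average,) = conn.execute(QUERY).fetchone()
    conn.close()
    return average


for name, (rows, expected) in DATASETS.items():
    actual = run_case(rows)
    # Compare with a tolerance because SQLite returns a float.
    assert abs(actual - expected) < 1e-9, (name, actual, expected)
    print(f"{name}: ok")
```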
Test Summary
| Test | Why |
|---|---|
| Problem example | Verifies weighted-average computation |
| Single row | Ensures trivial case works correctly |
| All orders identical | Average should equal that item count |
| Strongly skewed frequencies | Verifies weighting is applied correctly |
| Equal frequencies | Reduces to a standard arithmetic average |
Edge Cases
Only One Distinct Order Type
If the table contains a single row, the average should simply equal that row's item_count. The formula naturally handles this because both the numerator and denominator are scaled by the same occurrence count.
Extremely Large Occurrence Counts
A naive expansion-based solution could require enormous memory if a row represents millions of orders. The aggregation approach avoids this entirely by working directly with counts and sums.
Uneven Frequencies
One of the most common mistakes is computing:
AVG(item_count)
This ignores how often each order type occurs. For example, if an order type with 10 items occurs 1000 times and an order type with 1 item occurs once, AVG(item_count) returns (10 + 1) / 2 = 5.5, while the true average is 10001 / 1001 ≈ 9.99. The weighted-average formula correctly accounts for frequency.
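The gap between the two calculations is easy to reproduce outside the database. The snippet below is a small illustration in Python; the `(item_count, order_occurrences)` tuple layout is an assumption for the example.

```python
from statistics import mean

# (item_count, order_occurrences): 1000 orders of 10 items, 1 order of 1 item.
rows = [(10, 1000), (1, 1)]

# Unweighted: treats each table row as a single order.
naive = mean(item_count for item_count, _ in rows)

# Weighted: total items divided by total orders.
weighted = sum(i * n for i, n in rows) / sum(n for _, n in rows)

print(naive)               # 5.5
print(round(weighted, 2))  # 9.99
```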
All Orders Have the Same Item Count
If every row has the same item_count, the answer should equal that value regardless of the occurrence counts. Since every term in the numerator is proportional to the same item count, the division simplifies exactly to that value.