LeetCode 2985 - Calculate Compressed Mean

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem asks us to compute the average number of items per order from a compressed representation of order data.

Instead of storing every individual order as a separate row, the table groups together orders that contain the same number of items. For each row:

  • item_count is the number of items in each order of that group.
  • order_occurrences is the number of orders that contain exactly that many items.
  • order_id is simply a unique identifier for the row.

For example, if a row contains:

| item_count | order_occurrences |
|------------|-------------------|
| 3          | 800               |

it means there are 800 orders that each contain exactly 3 items.

To calculate the overall average number of items per order, we cannot simply average the item_count values because each value occurs a different number of times. Instead, we must compute a weighted average:

$$\text{Average} = \frac{\sum(\text{item_count} \times \text{order_occurrences})} {\sum(\text{order_occurrences})}$$

The numerator represents the total number of items across all orders, while the denominator represents the total number of orders.
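As a minimal sketch of the formula above, assume the table is available as a list of (item_count, order_occurrences) pairs. The function names and the sample rows are illustrative and not part of the problem statement. The sketch contrasts the weighted average with a plain average of item_count:

```python
from fractions import Fraction

def weighted_mean(rows):
    """Average of item_count, weighted by order_occurrences."""
    total_items = sum(item_count * occurrences for item_count, occurrences in rows)
    total_orders = sum(occurrences for _, occurrences in rows)
    return Fraction(total_items, total_orders)

def naive_mean(rows):
    """Plain average of item_count, ignoring how often each row occurs."""
    return Fraction(sum(item_count for item_count, _ in rows), len(rows))

# Hypothetical data: 1 order with 1 item, 999 orders with 10 items.
rows = [(1, 1), (10, 999)]
print(float(weighted_mean(rows)))  # 9.991
print(float(naive_mean(rows)))     # 5.5
```

Exact fractions are used here only to keep the comparison free of floating-point noise; the SQL solution does not need them.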

The final answer must be rounded to exactly two decimal places and returned as a single-row result table with the column name:

average_items_per_order

Since this is a database problem, the goal is to write an SQL query rather than implement an algorithm in a programming language.

The input guarantees that each row is unique by order_id. The table already contains all information necessary to compute the weighted average directly.

Important edge cases include situations where:

  • Only one row exists in the table.
  • All orders have the same item_count.
  • Very large occurrence counts appear, requiring aggregation rather than expansion into individual orders.
  • Different rows have drastically different frequencies, making a simple arithmetic average incorrect.

Approaches

Brute Force Approach

A conceptual brute-force solution would expand the compressed data into individual orders.

For example:

| item_count | order_occurrences |
|------------|-------------------|
| 2          | 3                 |

would become:

2, 2, 2

After expanding every row, we could sum all values and divide by the total number of expanded orders.

This approach is correct because it reconstructs the original dataset exactly. However, it is extremely inefficient. If a row has millions of occurrences, we would need to generate millions of records unnecessarily.
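To make the cost concrete, here is a rough sketch of the expansion idea in Python, assuming rows are (item_count, order_occurrences) pairs. The helper name expanded_mean and the sample rows are hypothetical:

```python
from itertools import chain, repeat

def expanded_mean(rows):
    """Brute force: materialize one value per order, then average them."""
    expanded = list(chain.from_iterable(
        repeat(item_count, occurrences) for item_count, occurrences in rows
    ))
    return sum(expanded) / len(expanded)

# Hypothetical data: three 2-item orders and one 5-item order.
rows = [(2, 3), (5, 1)]
print(expanded_mean(rows))  # 2.75
```

The expanded list holds one element per order, so its size grows with the sum of order_occurrences rather than with the number of rows.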

Optimal Approach

The key observation is that we never need to reconstruct individual orders.

Each row already tells us both:

  • How many items each order contains.
  • How many such orders exist.

Therefore, we can directly compute:

  • Total items = SUM(item_count * order_occurrences)
  • Total orders = SUM(order_occurrences)

The desired average is simply:

total_items / total_orders

Since the problem requires two decimal places, we apply the SQL ROUND(..., 2) function.

Comparison of Approaches

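| Approach    | Time Complexity | Space Complexity | Notes                                          |
|-------------|-----------------|------------------|------------------------------------------------|
| Brute Force | O(Total Orders) | O(Total Orders)  | Expands every occurrence into separate records |
| Optimal     | O(n)            | O(1)             | Uses weighted-average aggregation directly     |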

Here, n is the number of rows in the Orders table.

Algorithm Walkthrough

Step 1

Compute the total number of items across all orders.

For each row, multiply:

item_count * order_occurrences

Then sum these products across the entire table.

This gives the total number of items represented by the compressed data.

Step 2

Compute the total number of orders.

Add together all values of:

order_occurrences

This gives the total count of orders.

Step 3

Divide the total items by the total orders.

This produces the weighted average number of items per order.

Step 4

Round the result to two decimal places using SQL's ROUND function.

Step 5

Return the result using the required column name:

average_items_per_order
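The five steps can be traced outside the database as well. The following is a sketch in Python rather than the SQL answer itself. It uses the same rows as Example 1 in the test cases and assumes half-up rounding, which is how MySQL rounds exact decimal values:

```python
from decimal import Decimal, ROUND_HALF_UP

# Sample rows: (order_id, item_count, order_occurrences).
orders = [(10, 1, 500), (11, 2, 1000), (12, 3, 800), (13, 4, 1000)]

# Step 1: total items = sum of item_count * order_occurrences.
total_items = sum(item_count * occ for _, item_count, occ in orders)

# Step 2: total orders = sum of order_occurrences.
total_orders = sum(occ for _, _, occ in orders)

# Step 3: divide, using Decimal to avoid binary floating-point surprises.
average = Decimal(total_items) / Decimal(total_orders)

# Step 4: round to two decimal places (half-up is an assumption about ties).
rounded = average.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Step 5: expose the value under the required column name.
result = {"average_items_per_order": rounded}
print(total_items, total_orders, result["average_items_per_order"])  # 8900 3300 2.70
```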

Why it works

The compressed table is simply a frequency representation of the original orders. Multiplying item_count by order_occurrences reconstructs the contribution of that group to the total item count. Summing these contributions gives the exact total number of items. Dividing by the total number of orders therefore computes the same average that would be obtained from the fully expanded dataset, without ever expanding it.
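The same argument can be written as a short derivation. The notation is introduced here for illustration only: $c_i$ and $f_i$ are the item_count and order_occurrences of row $i$, $n$ is the number of rows, $N$ is the total number of orders, and $x_j$ is the item count of the $j$-th order in the expanded dataset.

$$\frac{1}{N}\sum_{j=1}^{N} x_j = \frac{1}{N}\sum_{i=1}^{n} \underbrace{(c_i + \cdots + c_i)}_{f_i \text{ terms}} = \frac{\sum_{i=1}^{n} c_i f_i}{\sum_{i=1}^{n} f_i}, \qquad N = \sum_{i=1}^{n} f_i$$

The middle step only regroups the expanded orders by the row they came from, which is why no information is lost by aggregating.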

SQL Solution

SELECT
    ROUND(
        SUM(item_count * order_occurrences) /
        SUM(order_occurrences),
        2
    ) AS average_items_per_order
FROM Orders;

The query first computes the weighted sum of items using:

SUM(item_count * order_occurrences)

It then computes the total number of orders using:

SUM(order_occurrences)

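Dividing the first value by the second yields the weighted average, and ROUND(..., 2) rounds it to two decimal places, as the problem requires. The query as written relies on / returning a fractional result for integer operands, as it does in MySQL. In dialects where integer division truncates (PostgreSQL and SQL Server, for example), multiply the numerator by 1.0 or cast it to a decimal type before dividing.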
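How the division step behaves with integer operands varies between SQL dialects, and this is easy to check locally. The sketch below uses Python's built-in sqlite3 module purely as a convenient stand-in engine; it is not the LeetCode judge, and the INTEGER column types are an assumption. SQLite truncates integer division, so it shows what happens when the cast is missing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, "
    "item_count INTEGER, order_occurrences INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(10, 1, 500), (11, 2, 1000), (12, 3, 800), (13, 4, 1000)],
)

# Integer operands: SQLite truncates 8900 / 3300 to 2 before ROUND sees it.
truncated = conn.execute(
    "SELECT ROUND(SUM(item_count * order_occurrences) / SUM(order_occurrences), 2) "
    "FROM Orders"
).fetchone()[0]

# Multiplying by 1.0 forces real division, so the fractional part survives.
exact = conn.execute(
    "SELECT ROUND(SUM(item_count * order_occurrences) * 1.0 / SUM(order_occurrences), 2) "
    "FROM Orders"
).fetchone()[0]

print(truncated, exact)  # 2.0 2.7
conn.close()
```

SQLite returns the rounded value as a float, so the trailing zero in 2.70 is a display concern there; MySQL keeps two decimal places in the result itself.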

Worked Example

Consider the input:

| order_id | item_count | order_occurrences |
|----------|------------|-------------------|
| 10       | 1          | 500               |
| 11       | 2          | 1000              |
| 12       | 3          | 800               |
| 13       | 4          | 1000              |

Calculate Total Items

| item_count | order_occurrences | Contribution |
|------------|-------------------|--------------|
| 1          | 500               | 500          |
| 2          | 1000              | 2000         |
| 3          | 800               | 2400         |
| 4          | 1000              | 4000         |

Total items:

| Calculation              | Value |
|--------------------------|-------|
| 500 + 2000 + 2400 + 4000 | 8900  |

Calculate Total Orders

order_occurrences
500
1000
800
1000

Total orders:

| Calculation             | Value |
|-------------------------|-------|
| 500 + 1000 + 800 + 1000 | 3300  |

Calculate Average

| Formula     | Result      |
|-------------|-------------|
| 8900 / 3300 | 2.696969... |

Rounded to two decimal places:

| average_items_per_order |
|-------------------------|
| 2.70                    |

Complexity Analysis

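| Measure | Complexity | Explanation                                           |
|---------|------------|-------------------------------------------------------|
| Time    | O(n)       | Each row is processed once by the aggregate functions |
| Space   | O(1)       | Only aggregate values are maintained                  |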

A database engine can compute both sums in a single scan of the table. No auxiliary data structures or expanded records are required, so the solution is efficient in both time and space.

Test Cases

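The following sample datasets help validate correctness. Each dataset is independent: load it into an empty Orders table (columns order_id, item_count, order_occurrences, in that order) before running the query, and clear the table between cases.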

-- Example 1
INSERT INTO Orders VALUES
(10, 1, 500),
(11, 2, 1000),
(12, 3, 800),
(13, 4, 1000);
-- Expected: 2.70

-- Single row
INSERT INTO Orders VALUES
(1, 5, 100);
-- Expected: 5.00

-- All orders identical
INSERT INTO Orders VALUES
(1, 3, 50),
(2, 3, 75),
(3, 3, 100);
-- Expected: 3.00

-- Heavily weighted larger value
INSERT INTO Orders VALUES
(1, 1, 1),
(2, 10, 999);
-- Expected: 9.99

-- Equal frequencies
INSERT INTO Orders VALUES
(1, 1, 10),
(2, 2, 10),
(3, 3, 10);
-- Expected: 2.00
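One way to run these datasets without a database server is a small harness built on Python's sqlite3 module. This is a sketch under stated assumptions: the table schema, the CASES dictionary, and the run_case helper are illustrative, and the query includes "* 1.0" because SQLite, unlike MySQL, truncates integer division.

```python
import sqlite3

# Assumed adaptation for SQLite: "* 1.0" avoids truncating integer division.
QUERY = """
SELECT ROUND(
    SUM(item_count * order_occurrences) * 1.0 / SUM(order_occurrences), 2
) AS average_items_per_order
FROM Orders
"""

# Dataset name -> (rows, expected average), copied from the cases above.
CASES = {
    "Example 1": ([(10, 1, 500), (11, 2, 1000), (12, 3, 800), (13, 4, 1000)], 2.70),
    "Single row": ([(1, 5, 100)], 5.00),
    "All orders identical": ([(1, 3, 50), (2, 3, 75), (3, 3, 100)], 3.00),
    "Heavily weighted larger value": ([(1, 1, 1), (2, 10, 999)], 9.99),
    "Equal frequencies": ([(1, 1, 10), (2, 2, 10), (3, 3, 10)], 2.00),
}

def run_case(rows):
    """Load rows into a fresh in-memory Orders table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, "
        "item_count INTEGER, order_occurrences INTEGER)"
    )
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", rows)
    (value,) = conn.execute(QUERY).fetchone()
    conn.close()
    return value

results = {name: run_case(rows) for name, (rows, _) in CASES.items()}
for name, (_, expected) in CASES.items():
    status = "ok" if abs(results[name] - expected) < 1e-9 else "MISMATCH"
    print(f"{name}: {results[name]:.2f} ({status})")
```

Creating a new in-memory database per case keeps the datasets from accumulating in one table, which would otherwise change every expected value after the first.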

Test Summary

| Test                        | Why                                       |
|-----------------------------|-------------------------------------------|
| Problem example             | Verifies weighted-average computation     |
| Single row                  | Ensures trivial case works correctly      |
| All orders identical        | Average should equal that item count      |
| Strongly skewed frequencies | Verifies weighting is applied correctly   |
| Equal frequencies           | Reduces to a standard arithmetic average  |

Edge Cases

Only One Distinct Order Type

If the table contains a single row, the average should simply equal that row's item_count. The formula naturally handles this because both the numerator and denominator are scaled by the same occurrence count.

Extremely Large Occurrence Counts

A naive expansion-based solution could require enormous memory if a row represents millions of orders. The aggregation approach avoids this entirely by working directly with counts and sums.

Uneven Frequencies

One of the most common mistakes is computing:

AVG(item_count)

This ignores how often each order type occurs. For example, if an order with 10 items appears 1000 times and an order with 1 item appears once, AVG(item_count) evaluates to 5.5, while the true average is 10001 / 1001 ≈ 9.99. The weighted-average formula correctly accounts for frequency.

All Orders Have the Same Item Count

If every row has the same item_count, the answer should equal that value regardless of the occurrence counts. Since every term in the numerator is proportional to the same item count, the division simplifies exactly to that value.
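Written out, with $c$ standing for the shared item_count and $f_i$ for the order_occurrences of row $i$ (notation introduced here for illustration, and assuming at least one order exists):

$$\frac{\sum_i c \, f_i}{\sum_i f_i} = \frac{c \sum_i f_i}{\sum_i f_i} = c$$

The single-row case is the same identity with only one term in each sum.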