LeetCode 1082 - Sales Analysis I

This problem asks us to identify the seller or sellers who generated the highest total sales revenue. We are given two database tables: Product and Sales. The Product table contains information about products, including their IDs, names, and unit prices.

LeetCode Problem 1082

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem asks us to identify the seller or sellers who generated the highest total sales revenue.

We are given two database tables: Product and Sales. The Product table contains information about products, including their IDs, names, and unit prices. The Sales table records transactions, where each row represents a sale and includes details such as the seller ID, buyer ID, quantity, and the total price for that sale.

The important detail is that we are not asked to calculate revenue using quantity * unit_price from the Product table. Instead, the problem explicitly provides a price column in the Sales table, which already represents the total amount for that transaction. Therefore, the task is to compute the sum of price values for each seller.

Once we know the total sales revenue for every seller, we must determine which seller has the maximum total revenue. If multiple sellers tie for the highest value, we return all of them.

The output should contain a single column:

Column Meaning
seller_id Seller(s) with the highest total sales revenue

The result can be returned in any order, which means sorting is unnecessary.

Since this is a database problem with an "Easy" difficulty rating, the main challenge is expressing the aggregation logic correctly in SQL. We need to group sales by seller, compute total revenue, determine the maximum revenue, and then filter sellers who match that maximum.

An important observation is that the Product table is actually unnecessary for solving the problem. Although product_id exists as a foreign key, we never need product names or unit prices because the sales amount is already available in the Sales.price column.

Important Edge Cases

A naive implementation may overlook several important scenarios.

If multiple sellers have the same highest total revenue, all of them must be returned. This means solutions that only select one maximum row will fail.

A seller may appear across multiple sales rows. We must aggregate all rows for that seller rather than considering individual transactions independently.

The Sales table may contain repeated rows, which means duplicate transactions should still contribute to the total revenue. We should not deduplicate records.

The problem guarantees that seller_id values exist in the Sales table and that the schema is valid, so we do not need to handle malformed input.

Approaches

Brute Force Approach

The brute force idea is to calculate every seller's total revenue manually and then compare all sellers repeatedly to determine the maximum.

We could first compute total sales per seller using aggregation. Then, for each seller, we compare their total revenue against every other seller to check whether anyone has a higher value. If no seller exceeds the current seller's revenue, that seller belongs in the answer.

This approach works because every seller is exhaustively compared against all others, guaranteeing that only sellers with the global maximum total are returned.

However, this method is inefficient because it introduces unnecessary repeated comparisons. If there are n sellers, comparing every seller with every other seller leads to quadratic work.

Optimal Approach

The key insight is that we do not need pairwise comparisons. Instead, we can compute all seller totals once, determine the maximum total once, and then return all sellers whose totals equal that maximum.

This naturally maps to SQL aggregation:

  1. Use GROUP BY seller_id to compute total revenue for each seller.
  2. Find the maximum total revenue.
  3. Select all sellers whose revenue equals that maximum.

A common SQL pattern is to place the grouped totals into a subquery or Common Table Expression (CTE). Then we filter sellers by comparing their total against the global maximum.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Computes totals, then compares every seller against every other seller
Optimal O(n) O(n) Aggregates once and filters by the maximum total

Here, n represents the number of rows in the Sales table.

Algorithm Walkthrough

  1. Group the Sales table by seller_id and compute the sum of price for each seller. This gives us the total sales revenue generated by every seller.
  2. Determine the maximum total revenue among all sellers. This value represents the highest sales performance.
  3. Filter the grouped results to include only sellers whose total revenue equals the maximum value.
  4. Return the seller_id values for those sellers. Since the problem allows any order, no sorting is required.

Why it works

The algorithm works because every seller's total revenue is computed exactly once through aggregation. After finding the global maximum, selecting sellers whose totals match that maximum guarantees correctness. Any seller with a smaller total is excluded, while every tied maximum seller is included.

Python Solution

LeetCode database problems use SQL rather than a Python class method. Below is a complete, LeetCode-submittable SQL solution.

SELECT seller_id
FROM (
    SELECT
        seller_id,
        SUM(price) AS total_sales
    FROM Sales
    GROUP BY seller_id
) AS seller_totals
WHERE total_sales = (
    SELECT MAX(total_sales)
    FROM (
        SELECT
            seller_id,
            SUM(price) AS total_sales
        FROM Sales
        GROUP BY seller_id
    ) AS max_totals
);

The solution begins by grouping the Sales table by seller_id. For each seller, SUM(price) calculates total sales revenue.

The grouped result is stored in a subquery named seller_totals. This gives us a table containing one row per seller and their corresponding total revenue.

Next, we compute the highest total revenue using another aggregation query. The inner subquery again calculates seller totals, and MAX(total_sales) extracts the largest value.

Finally, the outer query filters sellers whose total_sales equals this maximum. Since ties are allowed, multiple sellers may be returned.

Go Solution

Since this is a database problem, Go is not applicable in the standard LeetCode environment. The accepted submission format is SQL.

For completeness, here is the same logic written as a SQL string in Go syntax:

package main

func salesAnalysis() string {
	return `
SELECT seller_id
FROM (
    SELECT
        seller_id,
        SUM(price) AS total_sales
    FROM Sales
    GROUP BY seller_id
) AS seller_totals
WHERE total_sales = (
    SELECT MAX(total_sales)
    FROM (
        SELECT
            seller_id,
            SUM(price) AS total_sales
        FROM Sales
        GROUP BY seller_id
    ) AS max_totals
);
`
}

The Go version simply stores the SQL query as a raw string literal. Unlike algorithmic problems, there are no Go-specific data structures or implementation details to discuss because execution happens inside the SQL engine.

Worked Examples

Example 1

Input:

Sales Table

seller_id price
1 2000
1 800
2 800
3 2800

Step 1: Compute Total Revenue Per Seller

We group rows by seller_id and sum price.

seller_id Running Total
1 2000
1 2800
2 800
3 2800

Final aggregated totals:

seller_id total_sales
1 2800
2 800
3 2800

Step 2: Find Maximum Revenue

The maximum total_sales value is:

2800

Step 3: Select Matching Sellers

We return all sellers whose revenue equals 2800.

seller_id total_sales
1 2800
3 2800

Output:

+-------------+
| seller_id   |
+-------------+
| 1           |
| 3           |
+-------------+

Complexity Analysis

Measure Complexity Explanation
Time O(n) We scan the Sales table to aggregate seller totals
Space O(n) Aggregated seller totals may require storage proportional to the number of sellers

The dominant operation is grouping the Sales table by seller. Computing sums and finding the maximum both happen in linear time relative to the number of sales rows. The SQL engine may internally allocate memory for grouped results, leading to linear auxiliary space in the number of unique sellers.

Test Cases

Since this is a SQL problem, test cases are best represented conceptually.

# Example 1, tie between two sellers
sales = [
    (1, 2000),
    (1, 800),
    (2, 800),
    (3, 2800),
]
assert sorted([1, 3]) == [1, 3]  # sellers 1 and 3 tie

# Single seller
sales = [
    (1, 1000),
]
assert [1] == [1]  # only seller exists

# One clear winner
sales = [
    (1, 1000),
    (2, 2000),
    (1, 500),
]
assert [2] == [2]  # seller 2 has highest total

# Multiple sellers tied
sales = [
    (1, 1000),
    (2, 1000),
    (3, 1000),
]
assert sorted([1, 2, 3]) == [1, 2, 3]  # all tied

# Repeated rows count normally
sales = [
    (1, 500),
    (1, 500),
    (2, 900),
]
assert [1] == [1]  # duplicates contribute to total

Test Summary

Test Why
Example input with tie Verifies multiple maximum sellers are returned
Single seller Validates minimum valid input
One clear winner Ensures highest total is selected correctly
All sellers tied Confirms every maximum seller is included
Repeated rows Verifies duplicates are counted rather than removed

Edge Cases

Multiple Sellers Tie for Maximum Revenue

One easy mistake is returning only a single seller with the maximum revenue. In the example problem, both seller 1 and seller 3 have total revenue 2800. The implementation handles this correctly by filtering all sellers whose total equals the maximum, rather than limiting the result to one row.

Sellers with Multiple Transactions

A seller may appear multiple times in the Sales table. A buggy implementation might only examine individual sales instead of aggregating totals. Using SUM(price) with GROUP BY seller_id guarantees all transactions are accumulated into a single total for each seller.

Repeated Rows in the Sales Table

The problem explicitly states that repeated rows may exist. Some implementations may accidentally remove duplicates or use DISTINCT, producing incorrect totals. This solution does not deduplicate records, ensuring every sale contributes to the seller's revenue exactly as intended.