LeetCode 1821 - Find Customers With Positive Revenue this Year

LeetCode Problem 1821

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem gives us a database table named Customers with three columns:

| Column | Meaning |
| --- | --- |
| customer_id | Unique identifier for a customer |
| year | The year associated with the revenue |
| revenue | Revenue value for that customer in that year |

The combination (customer_id, year) is guaranteed to be unique, which means a customer can appear at most once for a given year.

The task is to return all customers whose revenue in the year 2021 is strictly greater than 0.

The important detail is that revenue may be negative. A customer with revenue 0 or negative revenue should not appear in the output. Also, customers who do not have any entry for the year 2021 must not be included.

In simpler terms, we only want customers satisfying both conditions:

  1. year == 2021
  2. revenue > 0

The output should contain only the customer_id column, and the rows may be returned in any order.

Because this is a database problem, the expected solution is an SQL query rather than an algorithm implemented with traditional data structures. The dataset size is not specified, so the goal is a query that states the filter directly with SQL predicates.

A few important edge cases stand out immediately:

  • Customers may exist only in years other than 2021. These customers must be ignored.
  • Revenue can be negative, so checking only the year is not enough.
  • Revenue could potentially be zero, and zero is not considered positive.
  • Multiple customers can qualify independently.
  • Since (customer_id, year) is unique, we never need to worry about duplicate rows for the same customer in 2021.

Approaches

Brute Force Approach

A brute force style solution would conceptually scan every row in the table and manually evaluate whether each row belongs in the answer.

For each row:

  • Check if the year is 2021
  • Check if the revenue is greater than 0
  • If both conditions are true, include the customer ID in the result

This works because every row is independently evaluated against the problem conditions. Since the table already stores all necessary information, no additional computation or aggregation is needed.
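
As a minimal sketch of this row-by-row check, assume the table is available in memory as a list of (customer_id, year, revenue) tuples. The function name `brute_force_filter` and the tuple representation are illustrative assumptions, not part of the problem.

```python
def brute_force_filter(rows):
    """Return customer IDs whose 2021 revenue is strictly positive.

    `rows` is assumed to be an iterable of (customer_id, year, revenue) tuples.
    """
    result = []
    for customer_id, year, revenue in rows:
        # Both conditions must hold for the row to qualify.
        if year == 2021 and revenue > 0:
            result.append(customer_id)
    return result


rows = [(1, 2021, 30), (2, 2021, -50), (3, 2018, 10), (4, 2021, 0)]
print(brute_force_filter(rows))  # [1]
```

Each row is inspected exactly once and no state is carried between rows, which is why the check needs no grouping or aggregation.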

Although this approach is straightforward, it describes the work procedurally rather than declaratively, as SQL would. In practice, a database engine performs much the same table scan internally unless a suitable index exists.

Optimal Approach

The key observation is that the problem is purely a filtering operation.

We do not need:

  • grouping,
  • aggregation,
  • sorting,
  • joins,
  • or subqueries.

We simply need to select rows that satisfy two predicates.

SQL is designed exactly for this type of operation through the WHERE clause. By filtering directly in the query, the database engine can efficiently return only the matching rows.

The optimal solution therefore uses:

WHERE year = 2021 AND revenue > 0

This directly expresses the business requirement and avoids unnecessary complexity.
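
To try the filter locally, one option is Python's built-in sqlite3 module with an in-memory database. This is only a sketch: SQLite stands in for whatever engine the judge uses, and the helper name `positive_revenue_customers` and the sample rows are assumptions made for illustration.

```python
import sqlite3

QUERY = """
SELECT customer_id
FROM Customers
WHERE year = 2021
  AND revenue > 0
"""


def positive_revenue_customers(rows):
    """Load (customer_id, year, revenue) rows into a temporary table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Customers (customer_id INTEGER, year INTEGER, revenue INTEGER)"
        )
        conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", rows)
        # Sorting happens in Python only to make the output deterministic;
        # the problem accepts rows in any order.
        return sorted(cid for (cid,) in conn.execute(QUERY))
    finally:
        conn.close()


sample = [(10, 2021, 5), (11, 2021, 0), (12, 2020, 99), (13, 2021, -7)]
print(positive_revenue_customers(sample))  # [10]
```

Customer 11 is dropped because zero is not positive, customer 12 because the row is not from 2021, and customer 13 because the revenue is negative.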

| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute Force | O(n) | O(1) | Scan every row and manually check conditions |
| Optimal | O(n) | O(1) | Use SQL filtering with a WHERE clause |

Algorithm Walkthrough

  1. Start reading rows from the Customers table.
  2. For each row, check whether the year column equals 2021. This ensures we only consider revenue records from the required year.
  3. For rows from 2021, check whether the revenue value is greater than 0. This filters out customers with negative or zero revenue.
  4. If both conditions are satisfied, return the customer_id for that row.
  5. Continue until all rows have been processed.
  6. Output the resulting set of customer IDs.

Why it works

The algorithm works because every row in the table independently represents a customer's revenue for a particular year. The problem asks for exactly those rows where:

  • the year is 2021, and
  • the revenue is positive.

By filtering rows using these two conditions simultaneously, the query returns precisely the required customers and excludes all invalid cases.

Python Solution

LeetCode database problems are solved using SQL, so the solution in this section is the SQL query itself rather than Python code.

SELECT customer_id
FROM Customers
WHERE year = 2021
AND revenue > 0;

This query selects only the customer_id column because the output format requires no additional data.

The FROM Customers clause specifies the table being queried.

The WHERE clause applies the two required filters:

  • year = 2021 restricts results to the target year.
  • revenue > 0 ensures only positive revenue values are included.

Since (customer_id, year) is unique, no duplicate customer IDs can appear for the year 2021.

Go Solution

Again, because this is a database problem, the actual solution is SQL rather than Go code. The equivalent SQL query is shown below.

SELECT customer_id
FROM Customers
WHERE year = 2021
AND revenue > 0;

There are no Go-specific implementation concerns here because the execution environment is SQL-based. No arrays, maps, pointers, or overflow considerations are involved.

Worked Examples

Example 1

Input table:

| customer_id | year | revenue |
| --- | --- | --- |
| 1 | 2018 | 50 |
| 1 | 2021 | 30 |
| 1 | 2020 | 70 |
| 2 | 2021 | -50 |
| 3 | 2018 | 10 |
| 3 | 2016 | 50 |
| 4 | 2021 | 20 |

We evaluate each row one by one.

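| Row | year == 2021 | revenue > 0 | Included? |
| --- | --- | --- | --- |
| (1, 2018, 50) | No | Yes | No |
| (1, 2021, 30) | Yes | Yes | Yes |
| (1, 2020, 70) | No | Yes | No |
| (2, 2021, -50) | Yes | No | No |
| (3, 2018, 10) | No | Yes | No |
| (3, 2016, 50) | No | Yes | No |
| (4, 2021, 20) | Yes | Yes | Yes |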

Final output:

| customer_id |
| --- |
| 1 |
| 4 |
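
The row-by-row evaluation in this example can be reproduced with a small Python sketch. The helper name `trace_rows` is hypothetical, and the rows are copied from the example input.

```python
TARGET_YEAR = 2021


def trace_rows(rows):
    """Return (row, year_ok, revenue_ok, included) for each input row."""
    trace = []
    for row in rows:
        _, year, revenue = row
        year_ok = year == TARGET_YEAR
        revenue_ok = revenue > 0
        # A row is included only when both predicates hold.
        trace.append((row, year_ok, revenue_ok, year_ok and revenue_ok))
    return trace


example = [
    (1, 2018, 50), (1, 2021, 30), (1, 2020, 70), (2, 2021, -50),
    (3, 2018, 10), (3, 2016, 50), (4, 2021, 20),
]

for row, year_ok, revenue_ok, included in trace_rows(example):
    print(row, year_ok, revenue_ok, included)

included_ids = [row[0] for row, _, _, included in trace_rows(example) if included]
print(included_ids)  # [1, 4]
```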

Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(n) | Every row in the table is checked once |
| Space | O(1) | No additional memory proportional to input size is used |

The query performs a simple filter over the table rows. Without a usable index, the database engine scans the entire table once, giving linear time complexity. The query itself uses constant auxiliary space because it needs no extra structures such as temporary tables or aggregation buffers.
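
To see how an index can change the access path, the sketch below asks SQLite for its query plan before and after adding one. Everything here is an assumption for illustration: SQLite is used as a stand-in engine, the index name `idx_customers_year_revenue` is hypothetical, and the exact plan wording differs between SQLite versions, so the output is printed rather than stated.

```python
import sqlite3

QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"


def query_plan(conn):
    """Return the human-readable detail strings of SQLite's plan for QUERY."""
    # Each EXPLAIN QUERY PLAN row has four columns; the last one is the detail text.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (customer_id INTEGER, year INTEGER, revenue INTEGER)"
)

plan_without_index = query_plan(conn)

# Hypothetical index: the equality column (year) comes first, the range column second.
conn.execute("CREATE INDEX idx_customers_year_revenue ON Customers (year, revenue)")
plan_with_index = query_plan(conn)

print(plan_without_index)
print(plan_with_index)
conn.close()
```

On a typical SQLite build the first plan reports a full scan of Customers and the second mentions the index. Other engines make their own choices, so this only illustrates the idea.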

Test Cases

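import sqlite3

# Assumption: SQLite (via Python's sqlite3) stands in for the LeetCode judge.
QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"


def run(rows):
    """Run QUERY against an in-memory Customers table built from rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customers (customer_id INT, year INT, revenue INT)")
    conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", rows)
    result = sorted(cid for (cid,) in conn.execute(QUERY))
    conn.close()
    return result


# Example case from the problem statement
# Expected output: [1, 4]
example = [(1, 2018, 50), (1, 2021, 30), (1, 2020, 70), (2, 2021, -50),
           (3, 2018, 10), (3, 2016, 50), (4, 2021, 20)]
assert run(example) == [1, 4]

# Customers with negative revenue should not be included
assert run([(1, 2021, -5)]) == []  # revenue < 0 excluded

# Customers with zero revenue should not be included
assert run([(1, 2021, 0)]) == []  # revenue == 0 excluded

# Customers without a 2021 entry should not be included
assert run([(1, 2019, 10), (1, 2020, 20)]) == []  # missing year excluded

# Single valid customer
assert run([(1, 2021, 1)]) == [1]  # one positive 2021 revenue

# Multiple valid customers
assert run([(1, 2021, 10), (2, 2021, 20), (3, 2021, 30)]) == [1, 2, 3]

# Empty table
assert run([]) == []  # no rows produces empty result

# All customers invalid
assert run([(1, 2021, -1), (2, 2021, 0), (3, 2020, 5)]) == []

# Customer with positive revenue in another year only
assert run([(1, 2020, 100)]) == []  # must still be excluded

# Customer with negative revenue in 2021 and positive elsewhere
assert run([(1, 2021, -10), (1, 2020, 50)]) == []  # only 2021 matters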

| Test | Why |
| --- | --- |
| Example input | Validates the primary problem scenario |
| Negative revenue | Ensures negative values are excluded |
| Zero revenue | Verifies zero is not considered positive |
| Missing 2021 row | Confirms only 2021 entries count |
| Single valid customer | Tests minimal positive case |
| Multiple valid customers | Ensures multiple outputs work correctly |
| Empty table | Confirms graceful handling of no data |
| All invalid customers | Verifies empty result generation |
| Positive revenue in another year | Ensures year filtering is correct |
| Mixed yearly revenues | Confirms only 2021 revenue matters |

Edge Cases

One important edge case is customers who do not appear in the year 2021 at all. A naive implementation might accidentally include customers with positive revenue from other years. The query avoids this issue by explicitly requiring year = 2021 in the WHERE clause.

Another important case is customers whose revenue in 2021 is negative or zero. Since the problem specifically asks for positive revenue, values like 0 and -50 must not appear in the result. Using the condition revenue > 0 correctly excludes both categories.

A third edge case involves duplicate handling. In many database problems, duplicate rows can accidentally produce repeated outputs. However, the problem guarantees that (customer_id, year) is unique. Because of this guarantee, every customer can appear at most once for 2021, so no additional DISTINCT keyword is necessary.
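
The effect of the uniqueness guarantee can be illustrated with SQLite. Modeling the guarantee as a composite primary key is an assumption of this sketch, as are the sample rows. With the key in place, a second 2021 row for the same customer is rejected, and adding DISTINCT does not change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Customers (
        customer_id INTEGER,
        year INTEGER,
        revenue INTEGER,
        PRIMARY KEY (customer_id, year)  -- assumed encoding of the guarantee
    )
    """
)
conn.execute("INSERT INTO Customers VALUES (1, 2021, 30)")
conn.execute("INSERT INTO Customers VALUES (1, 2020, 70)")  # same customer, other year

try:
    # A second 2021 row for customer 1 violates the composite key.
    conn.execute("INSERT INTO Customers VALUES (1, 2021, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

plain = conn.execute(
    "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"
).fetchall()
distinct = conn.execute(
    "SELECT DISTINCT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"
).fetchall()
conn.close()

print(duplicate_rejected)  # True
print(plain == distinct)   # True
```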

A final edge case is an empty table. If no rows exist, the query naturally returns an empty result set without errors. This behavior is automatically handled by SQL filtering semantics.