LeetCode 1821 - Find Customers With Positive Revenue this Year
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The problem gives us a database table named Customers with three columns:
| Column | Meaning |
|---|---|
| customer_id | Unique identifier for a customer |
| year | The year associated with the revenue |
| revenue | Revenue value for that customer in that year |
The combination (customer_id, year) is guaranteed to be unique, which means a customer can appear at most once for a given year.
The task is to return all customers whose revenue in the year 2021 is strictly greater than 0.
The important detail is that revenue may be negative. A customer with revenue 0 or negative revenue should not appear in the output. Also, customers who do not have any entry for the year 2021 must not be included.
In simpler terms, we only want customers satisfying both conditions:
- year == 2021
- revenue > 0
The output should contain only the customer_id column, and the rows may be returned in any order.
Because this is a database problem, the expected solution is an SQL query rather than an algorithm implemented with traditional data structures. The dataset size is not explicitly specified, but the task reduces to filtering rows with SQL predicates, which a database engine handles in a single pass over the table.
A few important edge cases stand out immediately:
- Customers may exist only in years other than 2021; these customers must be ignored.
- Revenue can be negative, so checking only the year is not enough.
- Revenue could potentially be zero, and zero is not considered positive.
- Multiple customers can qualify independently.
- Since (customer_id, year) is unique, we never need to worry about duplicate rows for the same customer in 2021.
Approaches
Brute Force Approach
A brute force style solution would conceptually scan every row in the table and manually evaluate whether each row belongs in the answer.
For each row:
- Check if the year is 2021
- Check if the revenue is greater than 0
- If both conditions are true, include the customer ID in the result
This works because every row is independently evaluated against the problem conditions. Since the table already stores all necessary information, no additional computation or aggregation is needed.
Although this approach is straightforward, it describes a row-by-row procedure rather than a declarative SQL query. In practice, a database engine performs the same kind of full table scan internally unless a suitable index exists.
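The row-by-row check described above can be sketched in plain Python. This is only an illustration of the idea, not something LeetCode accepts for this problem; the function name `positive_revenue_2021` and the tuple layout `(customer_id, year, revenue)` are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Each row is (customer_id, year, revenue), mirroring the Customers table.
Row = Tuple[int, int, int]


def positive_revenue_2021(rows: Iterable[Row]) -> List[int]:
    """Scan every row and keep customer IDs with positive 2021 revenue."""
    result = []
    for customer_id, year, revenue in rows:
        # Both conditions must hold for the row to qualify.
        if year == 2021 and revenue > 0:
            result.append(customer_id)
    return result


sample_rows = [(1, 2018, 50), (1, 2021, 30), (2, 2021, -50), (4, 2021, 20)]
print(positive_revenue_2021(sample_rows))
```

Each row is inspected exactly once and nothing besides the output list is stored, which matches the linear scan described in this section.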
Optimal Approach
The key observation is that the problem is purely a filtering operation.
We do not need:
- grouping,
- aggregation,
- sorting,
- joins,
- or subqueries.
We simply need to select rows that satisfy two predicates.
SQL is designed exactly for this type of operation through the WHERE clause. By filtering directly in the query, the database engine can efficiently return only the matching rows.
The optimal solution therefore uses:
WHERE year = 2021 AND revenue > 0
This directly expresses the business requirement and avoids unnecessary complexity.
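To try the filter locally, one option is Python's built-in `sqlite3` module. The sketch below assumes an in-memory SQLite database and a schema written to resemble the problem's table; the column types and the composite primary key are assumptions, since LeetCode does not publish its exact DDL here.

```python
import sqlite3

# In-memory database; the schema is an assumed stand-in for LeetCode's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    " customer_id INTEGER, year INTEGER, revenue INTEGER,"
    " PRIMARY KEY (customer_id, year))"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, 2018, 50), (1, 2021, 30), (2, 2021, -50), (3, 2018, 10), (4, 2021, 20)],
)

QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"

# Sort in Python because the query itself promises no particular order.
matching_ids = sorted(row[0] for row in conn.execute(QUERY))
print(matching_ids)
```

Only customers 1 and 4 have a 2021 row with revenue above zero in this sample data, so those are the IDs that come back.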
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(1) | Scan every row and manually check conditions |
| Optimal | O(n) | O(1) | Use SQL filtering with a WHERE clause |
Algorithm Walkthrough
- Start reading rows from the Customers table.
- For each row, check whether the year column equals 2021. This ensures we only consider revenue records from the required year.
- For rows from 2021, check whether the revenue value is greater than 0. This filters out customers with negative or zero revenue.
- If both conditions are satisfied, return the customer_id for that row.
- Continue until all rows have been processed.
- Output the resulting set of customer IDs.
Why it works
The algorithm works because every row in the table independently represents a customer's revenue for a particular year. The problem asks for exactly those rows where:
- the year is 2021, and
- the revenue is positive.
By filtering rows using these two conditions simultaneously, the query returns precisely the required customers and excludes all invalid cases.
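One informal way to gain confidence in this argument is to compare the SQL query against a per-row check on randomly generated tables. The sketch below is a sanity check under assumed conditions (SQLite in memory, small integer ranges, a fixed seed), not a proof; the helper names `run_query` and `reference` are made up for the example.

```python
import random
import sqlite3

QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"


def run_query(rows):
    """Load rows into a fresh in-memory table and return the sorted query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customers ("
        " customer_id INTEGER, year INTEGER, revenue INTEGER,"
        " PRIMARY KEY (customer_id, year))"
    )
    conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", rows)
    result = sorted(r[0] for r in conn.execute(QUERY))
    conn.close()
    return result


def reference(rows):
    """Per-row predicate check written directly from the problem statement."""
    return sorted(cid for cid, year, revenue in rows if year == 2021 and revenue > 0)


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(200):
    # Build unique (customer_id, year) pairs, as the problem guarantees.
    pairs = {
        (rng.randint(1, 8), rng.randint(2018, 2022))
        for _ in range(rng.randint(0, 15))
    }
    rows = [(cid, year, rng.randint(-100, 100)) for cid, year in sorted(pairs)]
    assert run_query(rows) == reference(rows)

all_trials_agree = True
print("query and per-row check agree on 200 random tables")
```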
Python Solution
LeetCode database problems are solved in SQL, so there is no separate Python implementation to write. For completeness, this section shows the SQL query that would be submitted.
SELECT customer_id
FROM Customers
WHERE year = 2021
AND revenue > 0;
This query selects only the customer_id column because the output format requires no additional data.
The FROM Customers clause specifies the table being queried.
The WHERE clause applies the two required filters:
- year = 2021 restricts results to the target year.
- revenue > 0 ensures only positive revenue values are included.
Since (customer_id, year) is unique, no duplicate customer IDs can appear for the year 2021.
Go Solution
Again, because this is a database problem, the actual solution is SQL rather than Go code. The equivalent SQL query is shown below.
SELECT customer_id
FROM Customers
WHERE year = 2021
AND revenue > 0;
There are no Go-specific implementation concerns here because the execution environment is SQL-based. No arrays, maps, pointers, or overflow considerations are involved.
Worked Examples
Example 1
Input table:
| customer_id | year | revenue |
|---|---|---|
| 1 | 2018 | 50 |
| 1 | 2021 | 30 |
| 1 | 2020 | 70 |
| 2 | 2021 | -50 |
| 3 | 2018 | 10 |
| 3 | 2016 | 50 |
| 4 | 2021 | 20 |
We evaluate each row one by one.
| Row | year == 2021 | revenue > 0 | Included? |
|---|---|---|---|
| (1, 2018, 50) | No | Yes | No |
| (1, 2021, 30) | Yes | Yes | Yes |
| (1, 2020, 70) | No | Yes | No |
| (2, 2021, -50) | Yes | No | No |
| (3, 2018, 10) | No | Yes | No |
| (3, 2016, 50) | No | Yes | No |
| (4, 2021, 20) | Yes | Yes | Yes |
Final output:
| customer_id |
|---|
| 1 |
| 4 |
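The trace table above can be reproduced with a short script. This is an illustrative sketch that hard-codes the example rows and prints the two checks for each one; the variable names are chosen for the example.

```python
# Example rows as (customer_id, year, revenue), copied from the input table.
rows = [
    (1, 2018, 50), (1, 2021, 30), (1, 2020, 70), (2, 2021, -50),
    (3, 2018, 10), (3, 2016, 50), (4, 2021, 20),
]

output = []
for customer_id, year, revenue in rows:
    is_2021 = year == 2021
    is_positive = revenue > 0
    included = is_2021 and is_positive
    print(
        f"({customer_id}, {year}, {revenue}): "
        f"year == 2021? {is_2021}, revenue > 0? {is_positive}, included? {included}"
    )
    if included:
        output.append(customer_id)

print(output)
```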
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Every row in the table is checked once |
| Space | O(1) | No additional memory proportional to input size is used |
The query performs a simple filter operation over the table rows. Assuming no indexing information is provided, the database engine may scan the entire table once, giving linear time complexity. The query itself uses constant auxiliary space because no extra structures such as temporary tables or aggregations are required.
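To see how an index can change the access path, SQLite's `EXPLAIN QUERY PLAN` can be inspected from Python. The sketch below is specific to SQLite, and the index name `idx_year_revenue` is hypothetical and not part of the problem. The exact wording of the plan differs between SQLite versions, so the script prints the plan steps rather than assuming a particular text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    " customer_id INTEGER, year INTEGER, revenue INTEGER,"
    " PRIMARY KEY (customer_id, year))"
)

QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"


def plan(connection):
    """Return the human-readable plan steps SQLite reports for QUERY."""
    # The last column of each EXPLAIN QUERY PLAN row describes one step.
    return [row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + QUERY)]


plan_without_index = plan(conn)

# Hypothetical index, not part of the problem statement.
conn.execute("CREATE INDEX idx_year_revenue ON Customers (year, revenue)")
plan_with_index = plan(conn)

print(plan_without_index)
print(plan_with_index)
```

Without the extra index the plan is expected to report a scan of Customers, while with it the plan is expected to mention `idx_year_revenue`. Other engines expose similar information through their own `EXPLAIN` variants.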
Test Cases
# Example case from the problem statement
# Expected output: [1, 4]
# Customers with negative revenue should not be included
assert True # revenue < 0 excluded
# Customers with zero revenue should not be included
assert True # revenue == 0 excluded
# Customers without a 2021 entry should not be included
assert True # missing year excluded
# Single valid customer
assert True # one positive 2021 revenue
# Multiple valid customers
assert True # multiple matches returned
# Empty table
assert True # no rows produces empty result
# All customers invalid
assert True # no positive revenue in 2021
# Customer with positive revenue in another year only
assert True # must still be excluded
# Customer with negative revenue in 2021 and positive elsewhere
assert True # only 2021 matters
| Test | Why |
|---|---|
| Example input | Validates the primary problem scenario |
| Negative revenue | Ensures negative values are excluded |
| Zero revenue | Verifies zero is not considered positive |
| Missing 2021 row | Confirms only 2021 entries count |
| Single valid customer | Tests minimal positive case |
| Multiple valid customers | Ensures multiple outputs work correctly |
| Empty table | Confirms graceful handling of no data |
| All invalid customers | Verifies empty result generation |
| Positive revenue in another year | Ensures year filtering is correct |
| Mixed yearly revenues | Confirms only 2021 revenue matters |
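The checklist above uses placeholder assertions. One way to turn it into executable checks is to run the query against small in-memory SQLite tables, as sketched below. The `solve` helper and the schema are assumptions made for this example and are not part of the LeetCode judge.

```python
import sqlite3

QUERY = "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"


def solve(rows):
    """Run QUERY against a fresh in-memory Customers table and return sorted IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Customers ("
        " customer_id INTEGER, year INTEGER, revenue INTEGER,"
        " PRIMARY KEY (customer_id, year))"
    )
    conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", rows)
    result = sorted(row[0] for row in conn.execute(QUERY))
    conn.close()
    return result


example = [
    (1, 2018, 50), (1, 2021, 30), (1, 2020, 70), (2, 2021, -50),
    (3, 2018, 10), (3, 2016, 50), (4, 2021, 20),
]

assert solve(example) == [1, 4]                        # example input
assert solve([(1, 2021, -5)]) == []                    # negative revenue
assert solve([(1, 2021, 0)]) == []                     # zero revenue
assert solve([(1, 2020, 10)]) == []                    # missing 2021 row
assert solve([(7, 2021, 1)]) == [7]                    # single valid customer
assert solve([(1, 2021, 5), (2, 2021, 9)]) == [1, 2]   # multiple valid customers
assert solve([]) == []                                 # empty table
assert solve([(1, 2021, -1), (2, 2021, 0)]) == []      # all customers invalid
assert solve([(1, 2019, 100)]) == []                   # positive revenue in another year
assert solve([(1, 2021, -10), (1, 2020, 40)]) == []    # mixed yearly revenues
print("all checks passed")
```

Results are sorted before comparison because the problem allows rows to be returned in any order.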
Edge Cases
One important edge case is customers who do not appear in the year 2021 at all. A naive implementation might accidentally include customers with positive revenue from other years. The query avoids this issue by explicitly requiring year = 2021 in the WHERE clause.
Another important case is customers whose revenue in 2021 is negative or zero. Since the problem specifically asks for positive revenue, values like 0 and -50 must not appear in the result. Using the condition revenue > 0 correctly excludes both categories.
A third edge case involves duplicate handling. In many database problems, duplicate rows can accidentally produce repeated outputs. However, the problem guarantees that (customer_id, year) is unique. Because of this guarantee, every customer can appear at most once for 2021, so no additional DISTINCT keyword is necessary.
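The uniqueness guarantee can be modeled with a composite primary key. The sketch below assumes SQLite and an assumed schema. It shows that a second 2021 row for the same customer is rejected, and that adding DISTINCT does not change the result on data that respects the guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite primary key models the guarantee that (customer_id, year) is unique.
conn.execute(
    "CREATE TABLE Customers ("
    " customer_id INTEGER, year INTEGER, revenue INTEGER,"
    " PRIMARY KEY (customer_id, year))"
)
conn.execute("INSERT INTO Customers VALUES (1, 2021, 30)")

try:
    # A second 2021 row for customer 1 violates the key and is rejected.
    conn.execute("INSERT INTO Customers VALUES (1, 2021, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

plain = conn.execute(
    "SELECT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"
).fetchall()
distinct = conn.execute(
    "SELECT DISTINCT customer_id FROM Customers WHERE year = 2021 AND revenue > 0"
).fetchall()

print(duplicate_rejected, plain, distinct)
```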
A final edge case is an empty table. If no rows exist, the query naturally returns an empty result set without errors. This behavior is automatically handled by SQL filtering semantics.