LeetCode 2292 - Products With Three or More Orders in Two Consecutive Years
This problem provides a table named Orders, where each row represents a single purchase event. Every order contains an orderid, a productid, the purchased quantity, and the purchasedate. The goal is to identify all products that satisfy two conditions simultaneously: 1.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
This problem provides a table named Orders, where each row represents a single purchase event. Every order contains an order_id, a product_id, the purchased quantity, and the purchase_date.
The goal is to identify all products that satisfy two conditions simultaneously:
- The product was ordered at least three times in one year.
- The same product was also ordered at least three times in the immediately following or preceding year.
The phrase "two consecutive years" is the key requirement. It means we are not looking for products that simply have three orders in multiple unrelated years. The years must differ by exactly one.
It is important to notice that the problem counts orders, not quantities. Even if a single order has a large quantity, it still counts as only one order occurrence. Therefore, the quantity column is irrelevant for solving the problem.
For example, if a product appears:
- 3 times in 2020
- 3 times in 2021
then it qualifies because 2020 and 2021 are consecutive years.
However, if a product appears:
- 3 times in 2020
- 3 times in 2022
then it does not qualify because the years are not consecutive.
The expected output is a table containing only the qualifying product_id values. The order of the output rows does not matter.
The input guarantees that order_id values are unique, so each row represents a distinct order event. Since the problem only asks whether yearly order counts reach at least three, we can aggregate orders by (product_id, year) and ignore the individual order details afterward.
Several edge cases are important to consider:
- A product may have many orders in one year but fewer than three in adjacent years.
- A product may qualify across multiple pairs of consecutive years.
- A product may have exactly three orders in consecutive years, which should still qualify.
- A product may have many orders spread across non-consecutive years, which should not qualify.
- Multiple products may satisfy the condition simultaneously.
Approaches
Brute Force Approach
A brute force solution would examine every product independently and compare every pair of years for that product.
First, we would gather all orders for each product and count how many orders occurred in each year. Then, for every product, we would iterate through all year pairs and check whether:
- both years have at least three orders
- the years differ by exactly one
This approach is correct because it explicitly verifies the condition required by the problem. However, it performs unnecessary comparisons. If a product has many distinct years, checking every possible pair becomes inefficient.
For example, if a product appears across k years, comparing every pair requires O(k²) time for that product.
Optimal Approach
The key observation is that we only care about adjacent years. There is no need to compare every pair of years.
Instead, we can:
- Count the number of orders for each
(product_id, year)pair. - Keep only those pairs where the count is at least three.
- For each qualifying
(product_id, year), check whether(product_id, year + 1)also exists.
This transforms the problem into a simple lookup problem using a hash set or grouped aggregation.
In SQL, this is naturally implemented using:
GROUP BYto count yearly orders- a self join to compare consecutive years
This approach avoids quadratic comparisons and efficiently checks only the necessary adjacent years.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n + k²) | O(k) | Compares all year pairs for each product |
| Optimal | O(n) | O(k) | Uses grouped counts and direct consecutive-year lookup |
Here:
nis the number of orderskis the number of unique(product_id, year)pairs
Algorithm Walkthrough
- Extract the year from each
purchase_date.
Since the condition is based on yearly order counts, we first convert each order date into its corresponding year.
2. Group records by (product_id, year).
For every product and year combination, count how many orders occurred during that year. 3. Filter groups with at least three orders.
We only care about years where a product was ordered three or more times. Any smaller count can never contribute to a valid answer.
4. Create a derived table containing only qualifying (product_id, year) pairs.
After filtering, each row represents a year in which a product satisfied the minimum order requirement. 5. Self join the derived table.
Join the table with itself using:
- same
product_id - consecutive years condition:
year1 + 1 = year2
This directly checks whether the product satisfied the requirement in adjacent years. 6. Return distinct product IDs.
A product may satisfy the condition across multiple consecutive year pairs, but the output should contain each product only once.
Why it works
The algorithm works because it first identifies every year in which a product independently satisfies the "three or more orders" condition. After that, it only needs to determine whether any two qualifying years are consecutive. The self join guarantees that only products with adjacent qualifying years are returned.
Python Solution
# complete solution here
import pandas as pd
def find_products(orders: pd.DataFrame) -> pd.DataFrame:
# Extract year from purchase_date
orders["year"] = pd.to_datetime(orders["purchase_date"]).dt.year
# Count orders per product per year
yearly_counts = (
orders.groupby(["product_id", "year"])
.size()
.reset_index(name="order_count")
)
# Keep only years with at least 3 orders
qualified_years = yearly_counts[yearly_counts["order_count"] >= 3]
# Self join on consecutive years
merged = qualified_years.merge(
qualified_years,
left_on=["product_id", "year"],
right_on=["product_id", "year"],
suffixes=("_left", "_right")
)
# Filter consecutive years
result = merged[
merged["year_right"] == merged["year_left"] + 1
]
# Return distinct product IDs
return result[["product_id"]].drop_duplicates()
The implementation begins by extracting the year component from purchase_date. This allows the solution to aggregate data at the yearly level.
Next, the code groups records by both product_id and year. The .size() operation counts how many orders belong to each group.
After computing yearly counts, the solution filters out any groups with fewer than three orders because those years cannot contribute to a valid answer.
The self join is the core step. It matches qualifying years belonging to the same product. The final filter checks whether the second year is exactly one greater than the first year.
Finally, duplicate product IDs are removed before returning the result.
Go Solution
// complete solution here
SELECT DISTINCT t1.product_id
FROM (
SELECT
product_id,
YEAR(purchase_date) AS year,
COUNT(*) AS order_count
FROM Orders
GROUP BY product_id, YEAR(purchase_date)
HAVING COUNT(*) >= 3
) t1
JOIN (
SELECT
product_id,
YEAR(purchase_date) AS year,
COUNT(*) AS order_count
FROM Orders
GROUP BY product_id, YEAR(purchase_date)
HAVING COUNT(*) >= 3
) t2
ON t1.product_id = t2.product_id
AND t1.year + 1 = t2.year;
This problem is a database problem, so the "Go solution" section is actually implemented using SQL. The query uses two identical derived tables and joins them on:
- the same product
- consecutive years
The DISTINCT keyword ensures that each qualifying product appears only once in the final result.
Worked Examples
Example 1
Input:
| order_id | product_id | purchase_date |
|---|---|---|
| 1 | 1 | 2020-03-16 |
| 2 | 1 | 2020-12-02 |
| 3 | 1 | 2020-05-10 |
| 4 | 1 | 2021-12-23 |
| 5 | 1 | 2021-05-21 |
| 6 | 1 | 2021-10-11 |
| 7 | 2 | 2022-10-11 |
Step 1, Extract Years
| product_id | year |
|---|---|
| 1 | 2020 |
| 1 | 2020 |
| 1 | 2020 |
| 1 | 2021 |
| 1 | 2021 |
| 1 | 2021 |
| 2 | 2022 |
Step 2, Count Orders Per Product Per Year
| product_id | year | order_count |
|---|---|---|
| 1 | 2020 | 3 |
| 1 | 2021 | 3 |
| 2 | 2022 | 1 |
Step 3, Keep Only Counts >= 3
| product_id | year |
|---|---|
| 1 | 2020 |
| 1 | 2021 |
Step 4, Check Consecutive Years
We compare:
(1, 2020)with(1, 2021)
Since 2021 = 2020 + 1, product 1 qualifies.
Final Result
| product_id |
|---|
| 1 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each order is processed a constant number of times |
| Space | O(k) | Stores grouped yearly counts |
Here:
nis the number of rows inOrderskis the number of unique(product_id, year)pairs
The grouping operation processes every row once. The self join only operates on aggregated yearly records, which is significantly smaller than the original dataset in practice.
Test Cases
# test cases
import pandas as pd
def run_tests():
# Example case
orders = pd.DataFrame({
"order_id": [1,2,3,4,5,6,7],
"product_id": [1,1,1,1,1,1,2],
"quantity": [7,4,7,6,5,6,6],
"purchase_date": [
"2020-03-16",
"2020-12-02",
"2020-05-10",
"2021-12-23",
"2021-05-21",
"2021-10-11",
"2022-10-11"
]
})
result = find_products(orders)
assert set(result["product_id"]) == {1} # provided example
# Exactly three orders in consecutive years
orders = pd.DataFrame({
"order_id": [1,2,3,4,5,6],
"product_id": [1,1,1,1,1,1],
"quantity": [1,1,1,1,1,1],
"purchase_date": [
"2020-01-01",
"2020-02-01",
"2020-03-01",
"2021-01-01",
"2021-02-01",
"2021-03-01"
]
})
result = find_products(orders)
assert set(result["product_id"]) == {1} # minimum valid case
# Non-consecutive years
orders = pd.DataFrame({
"order_id": [1,2,3,4,5,6],
"product_id": [1,1,1,1,1,1],
"quantity": [1,1,1,1,1,1],
"purchase_date": [
"2020-01-01",
"2020-02-01",
"2020-03-01",
"2022-01-01",
"2022-02-01",
"2022-03-01"
]
})
result = find_products(orders)
assert result.empty # years are not consecutive
# Multiple valid products
orders = pd.DataFrame({
"order_id": list(range(1, 13)),
"product_id": [1]*6 + [2]*6,
"quantity": [1]*12,
"purchase_date": [
"2020-01-01","2020-02-01","2020-03-01",
"2021-01-01","2021-02-01","2021-03-01",
"2021-01-01","2021-02-01","2021-03-01",
"2022-01-01","2022-02-01","2022-03-01"
]
})
result = find_products(orders)
assert set(result["product_id"]) == {1, 2} # multiple answers
# Large counts but one year below threshold
orders = pd.DataFrame({
"order_id": [1,2,3,4,5],
"product_id": [1,1,1,1,1],
"quantity": [1,1,1,1,1],
"purchase_date": [
"2020-01-01",
"2020-02-01",
"2020-03-01",
"2021-01-01",
"2021-02-01"
]
})
result = find_products(orders)
assert result.empty # second year has only two orders
run_tests()
| Test | Why |
|---|---|
| Provided example | Validates the core problem requirement |
| Exactly three orders | Tests the minimum qualifying threshold |
| Non-consecutive years | Ensures adjacency is required |
| Multiple valid products | Confirms multiple answers are handled |
| One year below threshold | Verifies filtering logic |
Edge Cases
One important edge case occurs when a product has many orders across several years, but none of the qualifying years are consecutive. For example, a product might have three orders in 2019 and three orders in 2021. A naive implementation that only checks for multiple qualifying years could incorrectly include it. The self join condition year + 1 prevents this mistake.
Another edge case involves products with exactly three orders in each consecutive year. Some implementations mistakenly use > instead of >= during filtering. This would incorrectly reject valid products. The solution explicitly uses HAVING COUNT(*) >= 3, ensuring that the threshold is inclusive.
A third important case occurs when a product qualifies across multiple consecutive year pairs. For example, a product may have:
- 3 orders in 2020
- 3 orders in 2021
- 3 orders in 2022
This produces two valid pairs:
(2020, 2021)(2021, 2022)
Without DISTINCT, the product could appear multiple times in the result. The solution handles this correctly by removing duplicates before returning the final output.