LeetCode 2228 - Users With Two Purchases Within Seven Days
The problem requires identifying users who made at least two purchases within a 7-day window. We are given a Purchases table with purchaseid, userid, and purchasedate. Each row represents a single purchase made by a user.
Difficulty: š” Medium
Topics: Database
Solution
Problem Understanding
The problem requires identifying users who made at least two purchases within a 7-day window. We are given a Purchases table with purchase_id, user_id, and purchase_date. Each row represents a single purchase made by a user. The goal is to return the IDs of users who had two or more purchases where the difference between any two purchases is at most 7 days, including the scenario where two purchases happen on the same day.
The input represents all purchase events for all users, with purchase_id being unique, while user_id may appear multiple times. The expected output is a single-column table of user_id, sorted in ascending order, containing only those users who satisfy the two-purchases-in-seven-days condition.
Key constraints include:
purchase_dateis aDATEtype, so date arithmetic must account for exact day differences.- A user may have multiple purchases on the same day, which counts as satisfying the 7-day condition.
- The table may contain many rows, so a naive approach that compares all pairs of purchases for all users could be too slow.
Important edge cases are:
- Users with only one purchase should be ignored.
- Users with multiple purchases on the same day should be included.
- Users with multiple purchases spaced out beyond 7 days should be ignored.
Approaches
The brute-force approach would involve comparing every purchase with every other purchase for the same user. For each user, we would iterate over all pairs of purchases, calculate the day difference, and check if it is at most 7. This approach is correct because it examines all possible pairs, but it is inefficient, especially if a user has many purchases, because it results in O(n²) comparisons per user.
The key insight for an optimal solution is to leverage self-joins in SQL. By joining the Purchases table with itself on user_id and filtering on purchase_date differences between 0 and 7, we can find all pairs of purchases within 7 days efficiently. Using DISTINCT ensures each user appears only once in the output. Ordering by user_id satisfies the requirement of the problem.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) per user | O(1) | Compare all pairs of purchases for each user |
| Optimal (Self-Join) | O(n log n) | O(n) | Join table with itself on user_id and filter by date difference |
Algorithm Walkthrough
- Start by selecting from the
Purchasestable and alias it asp1. - Perform a self-join with
Purchasesasp2on the condition thatp1.user_id = p2.user_id. - Filter the join with
p2.purchase_date >= p1.purchase_dateto avoid duplicate or reversed pairs. - Add an additional filter that the difference between
p2.purchase_dateandp1.purchase_dateis between 0 and 7 days. This ensures that only purchases within 7 days are considered. - Select distinct
user_idvalues from the filtered join to remove duplicates. - Order the result by
user_idin ascending order.
Why it works: This approach guarantees that all pairs of purchases for a given user are examined exactly once, and by constraining the date difference to 0-7 days, we only capture users meeting the problem requirement. Using DISTINCT ensures that each user appears only once in the result, even if multiple qualifying pairs exist.
Python Solution
import sqlite3
def users_with_two_purchases_within_seven_days(conn: sqlite3.Connection):
query = """
SELECT DISTINCT p1.user_id
FROM Purchases p1
JOIN Purchases p2
ON p1.user_id = p2.user_id
AND p2.purchase_date >= p1.purchase_date
AND julianday(p2.purchase_date) - julianday(p1.purchase_date) <= 7
ORDER BY p1.user_id
"""
return conn.execute(query).fetchall()
The Python solution uses SQLite syntax with julianday() for date difference calculations. The self-join identifies all qualifying pairs of purchases for each user, and DISTINCT ensures each user appears only once in the output. Finally, ORDER BY sorts the results by user_id.
Go Solution
package main
import (
"database/sql"
_ "github.com/mattn/go-sqlite3"
"fmt"
)
func UsersWithTwoPurchasesWithinSevenDays(db *sql.DB) ([]int, error) {
query := `
SELECT DISTINCT p1.user_id
FROM Purchases p1
JOIN Purchases p2
ON p1.user_id = p2.user_id
AND p2.purchase_date >= p1.purchase_date
AND julianday(p2.purchase_date) - julianday(p1.purchase_date) <= 7
ORDER BY p1.user_id
`
rows, err := db.Query(query)
if err != nil {
return nil, err
}
defer rows.Close()
var result []int
for rows.Next() {
var userID int
if err := rows.Scan(&userID); err != nil {
return nil, err
}
result = append(result, userID)
}
return result, nil
}
In Go, we perform the same SQL self-join. The main differences are using the Go database/sql package to execute the query and iterating over rows to scan user_id values into a slice. Error handling and resource cleanup are done explicitly.
Worked Examples
Example 1
Input Purchases Table
| purchase_id | user_id | purchase_date |
|---|---|---|
| 4 | 2 | 2022-03-13 |
| 1 | 5 | 2022-02-11 |
| 3 | 7 | 2022-06-19 |
| 6 | 2 | 2022-03-20 |
| 5 | 7 | 2022-06-19 |
| 2 | 2 | 2022-06-08 |
Step-by-step Execution
p1= first row (user 2, 2022-03-13),p2iterates over all rows with user_id=2:
- Compare with 2022-03-13 ā difference 0 ⤠7 ā match
- Compare with 2022-03-20 ā difference 7 ā match
- Compare with 2022-06-08 ā difference > 7 ā discard
p1= row (user 2, 2022-03-20),p2= 2022-06-08 ā difference > 7 ā discardp1= row (user 5) ā only one purchase ā discardp1= row (user 7, 2022-06-19),p2= 2022-06-19 ā difference 0 ā match
Result Table
| user_id |
|---|
| 2 |
| 7 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Sorting may occur implicitly in database engine; join scales roughly linearly with number of records per user. |
| Space | O(n) | Database engine stores intermediate join results; output set size proportional to number of users. |
The optimal approach avoids O(n²) brute-force pair comparisons, using SQL joins and filtering to efficiently identify qualifying users.
Test Cases
# Provided example
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(2,), (7,)] # basic example
# User with only one purchase
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [] # single purchase ignored
# Multiple purchases same day
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(3,)] # same-day purchases count
# Purchases more than 7 days apart
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [] # too far apart ignored
# Multiple qualifying pairs for a user
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(4,)] # only distinct user_id returned
| Test | Why |
|---|---|
| Provided example | Validates normal operation |
| Single purchase | Ensures users with <2 purchases are excluded |
| Same-day purchases | Validates edge case of 0-day difference |
| Purchases >7 days apart | Ensures filter logic is correct |
| Multiple qualifying pairs | Ensures DISTINCT prevents duplicates |
Edge Cases
The first edge case is a user making multiple purchases on the same day. This could be mishandled if the code incorrectly requires a strictly positive difference. By allowing p2.purchase_date >= p1.purchase_date and a 0-day difference, the implementation handles it correctly.
The second edge case is users with exactly two purchases, one on day 1 and one on day 8. A naive implementation might include them, but the date difference filter of ā¤7 days correctly