LeetCode 2228 - Users With Two Purchases Within Seven Days

The problem requires identifying users who made at least two purchases within a 7-day window. We are given a Purchases table with purchaseid, userid, and purchasedate. Each row represents a single purchase made by a user.

LeetCode Problem 2228

Difficulty: 🟔 Medium
Topics: Database

Solution

Problem Understanding

The problem requires identifying users who made at least two purchases within a 7-day window. We are given a Purchases table with purchase_id, user_id, and purchase_date. Each row represents a single purchase made by a user. The goal is to return the IDs of users who had two or more purchases where the difference between any two purchases is at most 7 days, including the scenario where two purchases happen on the same day.

The input represents all purchase events for all users, with purchase_id being unique, while user_id may appear multiple times. The expected output is a single-column table of user_id, sorted in ascending order, containing only those users who satisfy the two-purchases-in-seven-days condition.

Key constraints include:

  1. purchase_date is a DATE type, so date arithmetic must account for exact day differences.
  2. A user may have multiple purchases on the same day, which counts as satisfying the 7-day condition.
  3. The table may contain many rows, so a naive approach that compares all pairs of purchases for all users could be too slow.

Important edge cases are:

  • Users with only one purchase should be ignored.
  • Users with multiple purchases on the same day should be included.
  • Users with multiple purchases spaced out beyond 7 days should be ignored.

Approaches

The brute-force approach would involve comparing every purchase with every other purchase for the same user. For each user, we would iterate over all pairs of purchases, calculate the day difference, and check if it is at most 7. This approach is correct because it examines all possible pairs, but it is inefficient, especially if a user has many purchases, because it results in O(n²) comparisons per user.

The key insight for an optimal solution is to leverage self-joins in SQL. By joining the Purchases table with itself on user_id and filtering on purchase_date differences between 0 and 7, we can find all pairs of purchases within 7 days efficiently. Using DISTINCT ensures each user appears only once in the output. Ordering by user_id satisfies the requirement of the problem.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) per user O(1) Compare all pairs of purchases for each user
Optimal (Self-Join) O(n log n) O(n) Join table with itself on user_id and filter by date difference

Algorithm Walkthrough

  1. Start by selecting from the Purchases table and alias it as p1.
  2. Perform a self-join with Purchases as p2 on the condition that p1.user_id = p2.user_id.
  3. Filter the join with p2.purchase_date >= p1.purchase_date to avoid duplicate or reversed pairs.
  4. Add an additional filter that the difference between p2.purchase_date and p1.purchase_date is between 0 and 7 days. This ensures that only purchases within 7 days are considered.
  5. Select distinct user_id values from the filtered join to remove duplicates.
  6. Order the result by user_id in ascending order.

Why it works: This approach guarantees that all pairs of purchases for a given user are examined exactly once, and by constraining the date difference to 0-7 days, we only capture users meeting the problem requirement. Using DISTINCT ensures that each user appears only once in the result, even if multiple qualifying pairs exist.

Python Solution

import sqlite3

def users_with_two_purchases_within_seven_days(conn: sqlite3.Connection):
    query = """
    SELECT DISTINCT p1.user_id
    FROM Purchases p1
    JOIN Purchases p2
      ON p1.user_id = p2.user_id
     AND p2.purchase_date >= p1.purchase_date
     AND julianday(p2.purchase_date) - julianday(p1.purchase_date) <= 7
    ORDER BY p1.user_id
    """
    return conn.execute(query).fetchall()

The Python solution uses SQLite syntax with julianday() for date difference calculations. The self-join identifies all qualifying pairs of purchases for each user, and DISTINCT ensures each user appears only once in the output. Finally, ORDER BY sorts the results by user_id.

Go Solution

package main

import (
    "database/sql"
    _ "github.com/mattn/go-sqlite3"
    "fmt"
)

func UsersWithTwoPurchasesWithinSevenDays(db *sql.DB) ([]int, error) {
    query := `
    SELECT DISTINCT p1.user_id
    FROM Purchases p1
    JOIN Purchases p2
      ON p1.user_id = p2.user_id
     AND p2.purchase_date >= p1.purchase_date
     AND julianday(p2.purchase_date) - julianday(p1.purchase_date) <= 7
    ORDER BY p1.user_id
    `
    rows, err := db.Query(query)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var result []int
    for rows.Next() {
        var userID int
        if err := rows.Scan(&userID); err != nil {
            return nil, err
        }
        result = append(result, userID)
    }
    return result, nil
}

In Go, we perform the same SQL self-join. The main differences are using the Go database/sql package to execute the query and iterating over rows to scan user_id values into a slice. Error handling and resource cleanup are done explicitly.

Worked Examples

Example 1

Input Purchases Table

purchase_id user_id purchase_date
4 2 2022-03-13
1 5 2022-02-11
3 7 2022-06-19
6 2 2022-03-20
5 7 2022-06-19
2 2 2022-06-08

Step-by-step Execution

  1. p1 = first row (user 2, 2022-03-13), p2 iterates over all rows with user_id=2:
  • Compare with 2022-03-13 → difference 0 ≤ 7 → match
  • Compare with 2022-03-20 → difference 7 → match
  • Compare with 2022-06-08 → difference > 7 → discard
  1. p1 = row (user 2, 2022-03-20), p2 = 2022-06-08 → difference > 7 → discard
  2. p1 = row (user 5) → only one purchase → discard
  3. p1 = row (user 7, 2022-06-19), p2 = 2022-06-19 → difference 0 → match

Result Table

user_id
2
7

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Sorting may occur implicitly in database engine; join scales roughly linearly with number of records per user.
Space O(n) Database engine stores intermediate join results; output set size proportional to number of users.

The optimal approach avoids O(n²) brute-force pair comparisons, using SQL joins and filtering to efficiently identify qualifying users.

Test Cases

# Provided example
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(2,), (7,)]  # basic example

# User with only one purchase
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == []  # single purchase ignored

# Multiple purchases same day
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(3,)]  # same-day purchases count

# Purchases more than 7 days apart
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == []  # too far apart ignored

# Multiple qualifying pairs for a user
assert users_with_two_purchases_within_seven_days(conn := sqlite3.connect(":memory:")) == [(4,)]  # only distinct user_id returned
Test Why
Provided example Validates normal operation
Single purchase Ensures users with <2 purchases are excluded
Same-day purchases Validates edge case of 0-day difference
Purchases >7 days apart Ensures filter logic is correct
Multiple qualifying pairs Ensures DISTINCT prevents duplicates

Edge Cases

The first edge case is a user making multiple purchases on the same day. This could be mishandled if the code incorrectly requires a strictly positive difference. By allowing p2.purchase_date >= p1.purchase_date and a 0-day difference, the implementation handles it correctly.

The second edge case is users with exactly two purchases, one on day 1 and one on day 8. A naive implementation might include them, but the date difference filter of ≤7 days correctly