LeetCode 1336 - Number of Transactions per Visit
The problem asks us to analyze two database tables, Visits and Transactions, and compute how many visitors performed a given number of transactions per visit. Each row in the Visits table represents a unique visit by a user on a specific date.
Difficulty: 🔴 Hard
Topics: Database
Solution
Problem Understanding
The problem asks us to analyze two database tables, Visits and Transactions, and compute how many visitors performed a given number of transactions per visit. Each row in the Visits table represents a unique visit by a user on a specific date. The Transactions table may contain multiple rows per user per date, indicating multiple transactions on the same visit.
The goal is to produce a summary table where each row corresponds to a specific number of transactions (transactions_count) and contains the number of visits that had exactly that many transactions (visits_count). We also need to include visits with zero transactions, and the transactions_count values must cover the entire range from 0 to the maximum observed number of transactions in a single visit. The result should be sorted by transactions_count.
Key details and constraints include:
- Each
(user_id, visit_date)inVisitsis unique. - Every transaction in
Transactionscorresponds to a valid visit inVisits. - There may be visits with zero transactions, which we must account for.
- Transaction counts per visit may skip numbers (for example, there may be 0, 1, 3 transactions but no visits with 2), yet the output must include all counts from 0 up to the maximum.
Edge cases include users with no transactions, visits with multiple transactions, and potential duplicate transaction rows for the same visit.
Approaches
A brute-force approach would involve iterating over every visit, counting the transactions for that visit by scanning the Transactions table, then building a frequency map for transaction counts. While correct, this is inefficient because it requires a nested scan over all transactions for each visit, leading to potentially O(V * T) time complexity, where V is the number of visits and T is the number of transactions.
The optimal approach leverages SQL aggregation and joins. By performing a LEFT JOIN of Visits with Transactions on (user_id, date) and then using COUNT grouped by (user_id, visit_date), we can calculate the number of transactions per visit efficiently. Finally, we aggregate these counts to determine how many visits had each transaction count, including zero.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(V * T) | O(V) | Iterates over every visit and scans transactions, correct but inefficient. |
| Optimal | O(V + T) | O(V + T) | Uses SQL aggregation with LEFT JOIN and GROUP BY to efficiently compute counts per visit and then frequency of counts. |
Algorithm Walkthrough
- Perform a
LEFT JOINbetweenVisitsandTransactionson(user_id, visit_date)to associate transactions with each visit. Visits with no transactions will haveNULLin the transaction column. - Use
COUNT(transaction_date)grouped by(user_id, visit_date)to computetransactions_countper visit. UsingCOUNT(column)ignores NULLs, so visits with no transactions will correctly count as 0. - Aggregate the results from step 2 to compute how many visits have each
transactions_count. UseGROUP BY transactions_countandCOUNT(*)to get thevisits_count. - Generate a series of numbers from 0 to the maximum
transactions_countobserved to ensure all transaction counts are included.LEFT JOINthis series with the aggregated counts to fill in missing counts with 0. - Order the final results by
transactions_count.
Why it works: Each visit is uniquely identified, and by counting transactions per visit and then summarizing the counts, we capture the distribution of transactions per visit. Using a series ensures no transaction count is omitted, fulfilling the requirement to include zero and any skipped counts.
Python Solution
class Solution:
def numberOfTransactionsPerVisit(self, visits: List[Tuple[int, str]], transactions: List[Tuple[int, str, int]]) -> List[Tuple[int, int]]:
from collections import defaultdict
# Count transactions per visit
visit_tx_count = defaultdict(int)
for user_id, visit_date, *_ in transactions:
visit_tx_count[(user_id, visit_date)] += 1
# Count visits per transaction count
tx_count_freq = defaultdict(int)
for user_id, visit_date in visits:
count = visit_tx_count.get((user_id, visit_date), 0)
tx_count_freq[count] += 1
max_count = max(tx_count_freq.keys(), default=0)
result = [(i, tx_count_freq.get(i, 0)) for i in range(max_count + 1)]
return result
In this Python implementation, we first build a dictionary visit_tx_count mapping each (user_id, visit_date) to its number of transactions. We then count how many visits fall into each transaction count using tx_count_freq. Finally, we generate the full range from 0 to the maximum count to ensure every transaction count is represented.
Go Solution
package main
type Visit struct {
UserID int
VisitDate string
}
type Transaction struct {
UserID int
TransactionDate string
Amount int
}
func NumberOfTransactionsPerVisit(visits []Visit, transactions []Transaction) [][]int {
visitTxCount := make(map[string]int)
for _, tx := range transactions {
key := fmt.Sprintf("%d-%s", tx.UserID, tx.TransactionDate)
visitTxCount[key]++
}
txCountFreq := make(map[int]int)
maxCount := 0
for _, v := range visits {
key := fmt.Sprintf("%d-%s", v.UserID, v.VisitDate)
count := visitTxCount[key]
txCountFreq[count]++
if count > maxCount {
maxCount = count
}
}
result := make([][]int, maxCount+1)
for i := 0; i <= maxCount; i++ {
result[i] = []int{i, txCountFreq[i]}
}
return result
}
In Go, we use string keys in maps to represent (user_id, visit_date) pairs. This avoids creating a custom struct for map keys. The logic otherwise mirrors the Python version: first count transactions per visit, then count the number of visits for each transaction count, and finally generate a range from 0 to the maximum.
Worked Examples
Given the example from the problem statement:
Visits Table
| user_id | visit_date |
|---|---|
| 1 | 2020-01-01 |
| 2 | 2020-01-02 |
| 12 | 2020-01-01 |
| 19 | 2020-01-03 |
| 1 | 2020-01-02 |
| 2 | 2020-01-03 |
| 1 | 2020-01-04 |
| 7 | 2020-01-11 |
| 9 | 2020-01-25 |
| 8 | 2020-01-28 |
Transactions Table
| user_id | transaction_date | amount |
|---|---|---|
| 1 | 2020-01-02 | 120 |
| 2 | 2020-01-03 | 22 |
| 7 | 2020-01-11 | 232 |
| 1 | 2020-01-04 | 7 |
| 9 | 2020-01-25 | 33 |
| 9 | 2020-01-25 | 66 |
| 8 | 2020-01-28 | 1 |
| 9 | 2020-01-25 | 99 |
Step 1: Count transactions per visit:
| (user_id, visit_date) | transactions_count |
|---|---|
| (1, 2020-01-01) | 0 |
| (2, 2020-01-02) | 0 |
| (12, 2020-01-01) | 0 |
| (19, 2020-01-03) | 0 |
| (1, 2020-01-02) | 1 |
| (2, 2020-01-03) | 1 |
| (1, 2020-01-04) | 1 |
| (7, 2020-01-11) | 1 |
| (9, 2020-01-25) | 3 |
| (8, 2020-01-28) | 1 |
Step 2: Count visits per transaction count:
| transactions_count | visits_count |
|---|---|
| 0 | 4 |
| 1 | 5 |
| 2 | 0 |
| 3 | 1 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(V + T) | Count transactions per visit in O(T), then iterate over visits O(V). |
| Space | O(V + T) | Dictionary/map stores counts per visit (up to V) and frequencies (up to V). |
The dominant operations are linear scans over the input tables, and no nested iterations occur.
Test Cases
# test