LeetCode 262 - Trips and Users
This problem asks us to compute the daily cancellation rate for taxi trips over a fixed date range, specifically from "2013-10-01" to "2013-10-03".
Difficulty: 🔴 Hard
Topics: Database
Solution
Problem Understanding
This problem asks us to compute the daily cancellation rate for taxi trips over a fixed date range, specifically from "2013-10-01" to "2013-10-03".
We are given two database tables:
Trips, which stores information about every taxi trip requestUsers, which stores user metadata such as whether a user is banned
Each trip contains a client, a driver, a trip status, and the request date. The important detail is that both the client and the driver are represented by IDs that reference rows in the Users table.
The cancellation rate for a day is defined as:
$$\text{Cancellation Rate} = \frac{\text{Number of canceled trips with unbanned users}} {\text{Total number of trips with unbanned users}}$$
A trip counts only if:
- the client is not banned
- the driver is not banned
A trip is considered canceled if its status is either:
cancelled_by_drivercancelled_by_client
Trips with status completed are not cancellations.
The output should contain:
- one row per day
- the day itself
- the cancellation rate rounded to two decimal places
The problem guarantees that we only need to consider days that have at least one valid trip entry in the given range.
A subtle but important detail is that banned users invalidate the entire trip for the calculation. Even if only the client is banned, or only the driver is banned, the trip must be excluded from both the numerator and denominator.
The input scale in LeetCode database problems is usually moderate, but the goal is not raw performance through indexing tricks. Instead, the challenge is correctly expressing filtering, aggregation, and conditional counting in SQL.
Several edge cases can easily cause mistakes:
- Trips involving banned users must be ignored entirely
- Only canceled trips contribute to the numerator
- Completed trips still contribute to the denominator
- Cancellation rates must be rounded to exactly two decimal places
- Both client and driver must be checked independently
- Days with no valid trips should not appear
Approaches
Brute Force Approach
A brute force solution would process each trip individually and repeatedly query the Users table to determine whether the client and driver are banned.
Conceptually, the algorithm would work like this:
- Iterate through every trip in the date range.
- For each trip:
- look up the client in
Users - look up the driver in
Users
- Skip the trip if either user is banned.
- Otherwise:
- increment the total trip count for that day
- increment the canceled count if the status indicates cancellation
- After processing all trips, compute:
$$\text{canceled} / \text{total}$$
for each day.
This produces the correct answer because every trip is individually validated and categorized correctly.
However, this approach is inefficient because it repeatedly scans or searches the Users table for every trip. Without proper joins or indexing, this can become expensive as the dataset grows.
Optimal Approach
The key insight is that SQL is designed specifically for relational filtering and aggregation.
Instead of manually validating users one trip at a time, we can:
- Join the
Tripstable with theUserstable twice:
- once for the client
- once for the driver
- Filter out banned users directly in the query
- Group the remaining trips by day
- Use conditional aggregation to count canceled trips
The important observation is that cancellation counting is naturally expressible as a conditional sum:
SUM(status != 'completed')
or equivalently:
SUM(CASE WHEN status != 'completed' THEN 1 ELSE 0 END)
Since SQL treats grouped aggregation very efficiently, this approach avoids repeated scans and computes everything in a single pass over the filtered dataset.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(T × U) | O(D) | Repeatedly searches users for each trip |
| Optimal | O(T + U) | O(D) | Uses joins and grouped aggregation efficiently |
Here:
Tis the number of tripsUis the number of usersDis the number of distinct days
Algorithm Walkthrough
Optimal SQL Algorithm
- Start with the
Tripstable because every calculation is based on trip records. - Join the
Userstable for the client relationship. This allows us to access whether the client is banned. We join using:
Trips.client_id = Users.users_id
- Join the
Userstable a second time for the driver relationship. Since the same table is used twice, aliases are necessary:
- one alias for clients
- one alias for drivers
- Filter the dataset so only trips with unbanned users remain:
client.banned = 'No'
AND driver.banned = 'No'
- Restrict the date range to:
BETWEEN '2013-10-01' AND '2013-10-03'
- Group all remaining rows by
request_at. This allows us to compute daily statistics independently. - For each group:
- count total valid trips using
COUNT(*) - count canceled trips using a conditional aggregation
- Divide canceled trips by total trips to compute the cancellation rate.
- Round the result to two decimal places using
ROUND(). - Return the day and computed cancellation rate.
Why it works
The algorithm works because every trip is filtered before aggregation. This guarantees that only trips involving unbanned users contribute to the statistics.
Grouping by day partitions the dataset correctly, and conditional aggregation ensures that canceled trips are counted separately from completed trips. Since every valid trip contributes exactly once to the denominator, and every canceled valid trip contributes exactly once to the numerator, the computed ratio is mathematically correct.
Python Solution
Although this is a database problem normally solved with SQL, the following demonstrates the equivalent logic in Python.
from collections import defaultdict
from typing import List, Dict
class Solution:
def cancellationRate(
self,
trips: List[Dict],
users: List[Dict]
) -> List[Dict]:
user_banned = {}
for user in users:
user_banned[user["users_id"]] = user["banned"]
total_trips = defaultdict(int)
cancelled_trips = defaultdict(int)
for trip in trips:
day = trip["request_at"]
if not ("2013-10-01" <= day <= "2013-10-03"):
continue
client_ok = user_banned.get(trip["client_id"]) == "No"
driver_ok = user_banned.get(trip["driver_id"]) == "No"
if not (client_ok and driver_ok):
continue
total_trips[day] += 1
if trip["status"] != "completed":
cancelled_trips[day] += 1
result = []
for day in sorted(total_trips.keys()):
rate = cancelled_trips[day] / total_trips[day]
result.append({
"Day": day,
"Cancellation Rate": round(rate, 2)
})
return result
The implementation begins by building a lookup table from users_id to banned status. This allows constant time access when validating trips.
Two hash maps are then maintained:
total_tripstracks all valid trips per daycancelled_tripstracks canceled valid trips per day
Each trip is processed once. The algorithm first checks whether the date falls inside the required range. It then verifies that both the client and driver are unbanned.
If the trip is valid, the total counter for that day increases. If the trip status is not completed, the canceled counter also increases.
Finally, the cancellation rate is computed for every day and rounded to two decimal places before returning the result.
Go Solution
package main
import (
"sort"
)
type Trip struct {
ID int
ClientID int
DriverID int
CityID int
Status string
RequestAt string
}
type User struct {
UsersID int
Banned string
Role string
}
type Result struct {
Day string
CancellationRate float64
}
func cancellationRate(trips []Trip, users []User) []Result {
userBanned := make(map[int]string)
for _, user := range users {
userBanned[user.UsersID] = user.Banned
}
totalTrips := make(map[string]int)
cancelledTrips := make(map[string]int)
for _, trip := range trips {
day := trip.RequestAt
if day < "2013-10-01" || day > "2013-10-03" {
continue
}
clientOK := userBanned[trip.ClientID] == "No"
driverOK := userBanned[trip.DriverID] == "No"
if !(clientOK && driverOK) {
continue
}
totalTrips[day]++
if trip.Status != "completed" {
cancelledTrips[day]++
}
}
var days []string
for day := range totalTrips {
days = append(days, day)
}
sort.Strings(days)
results := []Result{}
for _, day := range days {
rate := float64(cancelledTrips[day]) / float64(totalTrips[day])
results = append(results, Result{
Day: day,
CancellationRate: float64(int(rate*100+0.5)) / 100,
})
}
return results
}
The Go implementation follows the same logic as the Python version but uses maps explicitly for hash table behavior.
One notable difference is rounding. Go does not have a direct equivalent of Python’s round(..., 2) for decimal formatting in raw float calculations, so manual rounding is used:
float64(int(rate*100+0.5)) / 100
Go also requires explicit type conversion when dividing integers to avoid integer division.
SQL Solution
SELECT
t.request_at AS Day,
ROUND(
AVG(
CASE
WHEN t.status = 'completed' THEN 0
ELSE 1
END
),
2
) AS `Cancellation Rate`
FROM Trips t
JOIN Users c
ON t.client_id = c.users_id
JOIN Users d
ON t.driver_id = d.users_id
WHERE c.banned = 'No'
AND d.banned = 'No'
AND t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY t.request_at;
This solution uses an elegant SQL trick:
completedbecomes0- canceled statuses become
1
Taking the average of these values directly computes:
$$\frac{\text{canceled trips}}{\text{total trips}}$$
This avoids writing separate numerator and denominator calculations.
Worked Examples
Example 1
Initial Validity Check
We first determine which trips are eligible.
| Trip ID | Client Banned | Driver Banned | Valid? |
|---|---|---|---|
| 1 | No | No | Yes |
| 2 | Yes | No | No |
| 3 | No | No | Yes |
| 4 | No | No | Yes |
| 5 | No | No | Yes |
| 6 | Yes | No | No |
| 7 | No | No | Yes |
| 8 | Yes | No | No |
| 9 | No | No | Yes |
| 10 | No | No | Yes |
Day: 2013-10-01
Valid trips:
| Trip ID | Status | Cancelled? |
|---|---|---|
| 1 | completed | No |
| 3 | completed | No |
| 4 | cancelled_by_client | Yes |
State after processing:
| Variable | Value |
|---|---|
| total_trips["2013-10-01"] | 3 |
| cancelled_trips["2013-10-01"] | 1 |
Cancellation rate:
$$\frac{1}{3} = 0.33$$
Day: 2013-10-02
Valid trips:
| Trip ID | Status | Cancelled? |
|---|---|---|
| 5 | completed | No |
| 7 | completed | No |
State after processing:
| Variable | Value |
|---|---|
| total_trips["2013-10-02"] | 2 |
| cancelled_trips["2013-10-02"] | 0 |
Cancellation rate:
$$\frac{0}{2} = 0.00$$
Day: 2013-10-03
Valid trips:
| Trip ID | Status | Cancelled? |
|---|---|---|
| 9 | completed | No |
| 10 | cancelled_by_driver | Yes |
State after processing:
| Variable | Value |
|---|---|
| total_trips["2013-10-03"] | 2 |
| cancelled_trips["2013-10-03"] | 1 |
Cancellation rate:
$$\frac{1}{2} = 0.50$$
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(T + U) | Each user and trip is processed once |
| Space | O(D + U) | Stores user lookup table and per-day counters |
The algorithm performs a single pass over the users table to build the banned lookup map and a single pass over the trips table to compute statistics.
Hash map lookups for banned status are constant time on average, making the overall runtime linear in the input size.
The additional space comes from:
- storing user statuses
- storing aggregated counts per day
Since the number of days is very small in this problem, the dominant factor is the user lookup table.
Test Cases
from collections import defaultdict
def cancellation_rate(trips, users):
user_banned = {
user["users_id"]: user["banned"]
for user in users
}
total = defaultdict(int)
cancelled = defaultdict(int)
for trip in trips:
day = trip["request_at"]
if not ("2013-10-01" <= day <= "2013-10-03"):
continue
if (
user_banned[trip["client_id"]] != "No"
or user_banned[trip["driver_id"]] != "No"
):
continue
total[day] += 1
if trip["status"] != "completed":
cancelled[day] += 1
result = {}
for day in total:
result[day] = round(cancelled[day] / total[day], 2)
return result
users = [
{"users_id": 1, "banned": "No"},
{"users_id": 2, "banned": "Yes"},
{"users_id": 10, "banned": "No"},
]
# Basic completed trip
assert cancellation_rate(
[
{
"client_id": 1,
"driver_id": 10,
"status": "completed",
"request_at": "2013-10-01"
}
],
users
) == {
"2013-10-01": 0.0
} # no cancellations
# Cancelled valid trip
assert cancellation_rate(
[
{
"client_id": 1,
"driver_id": 10,
"status": "cancelled_by_driver",
"request_at": "2013-10-01"
}
],
users
) == {
"2013-10-01": 1.0
} # all trips cancelled
# Banned client ignored
assert cancellation_rate(
[
{
"client_id": 2,
"driver_id": 10,
"status": "cancelled_by_driver",
"request_at": "2013-10-01"
}
],
users
) == {} # invalid trip excluded
# Mixed trips
assert cancellation_rate(
[
{
"client_id": 1,
"driver_id": 10,
"status": "completed",
"request_at": "2013-10-01"
},
{
"client_id": 1,
"driver_id": 10,
"status": "cancelled_by_client",
"request_at": "2013-10-01"
}
],
users
) == {
"2013-10-01": 0.5
} # one out of two cancelled
# Outside date range ignored
assert cancellation_rate(
[
{
"client_id": 1,
"driver_id": 10,
"status": "completed",
"request_at": "2013-09-30"
}
],
users
) == {} # date outside allowed range
# Multiple days
assert cancellation_rate(
[
{
"client_id": 1,
"driver_id": 10,
"status": "completed",
"request_at": "2013-10-01"
},
{
"client_id": 1,
"driver_id": 10,
"status": "cancelled_by_driver",
"request_at": "2013-10-02"
}
],
users
) == {
"2013-10-01": 0.0,
"2013-10-02": 1.0
} # independent daily aggregation
Test Summary
| Test | Why |
|---|---|
| Single completed trip | Verifies zero cancellation rate |
| Single cancelled trip | Verifies full cancellation rate |
| Banned client | Ensures invalid trips are excluded |
| Mixed completed and cancelled trips | Verifies fractional rates |
| Outside date range | Ensures filtering is correct |
| Multiple days | Verifies independent daily grouping |
Edge Cases
Trips Involving Banned Users
One of the most common mistakes is counting trips where either the client or driver is banned. The problem explicitly states that both users must be unbanned.
A buggy implementation might only check the client or only check the driver. The correct solution validates both independently before counting the trip.
Days With No Valid Trips
A day may contain trips, but all of them could involve banned users. In that case, the day should not appear in the output at all.
The implementation naturally handles this because a day is only added to the aggregation maps after a valid trip is processed.
All Trips Cancelled
If every valid trip on a day is canceled, the cancellation rate becomes 1.00.
Some implementations accidentally perform integer division and produce 1 or 0. The provided implementations explicitly use floating point division to avoid this issue.
No Cancelled Trips
If all valid trips are completed, the cancellation rate should be 0.00.
The conditional aggregation correctly produces zero canceled trips while still counting completed trips in the denominator.
Mixed Cancellation Types
Both:
cancelled_by_drivercancelled_by_client
must count as cancellations.
The implementation handles this by treating every non-completed status as canceled, which automatically includes both cases.