LeetCode 2837 - Total Traveled Distance
The problem requires calculating the total distance traveled by each user based on ride data stored in a relational database. We are given two tables: Users and Rides. The Users table contains userid and name, where userid is unique.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The problem requires calculating the total distance traveled by each user based on ride data stored in a relational database. We are given two tables: Users and Rides. The Users table contains user_id and name, where user_id is unique. The Rides table contains ride_id, user_id, and distance, where ride_id is unique. Each row in Rides corresponds to a ride taken by a user.
The task is to output a table listing each user_id, name, and the total distance traveled. If a user has not taken any rides, their distance should be 0. The final result should be ordered by user_id in ascending order.
Important observations include that some users may not have rides at all, so we must include all users in the output. This hints at the need for a LEFT JOIN rather than an inner join to avoid losing users without rides. Edge cases could include users with zero rides, users with multiple rides, and users with very large cumulative distances.
Approaches
The brute-force approach would involve iterating through each user and summing up their rides by querying the Rides table individually for each user. While this works and is logically correct, it is inefficient because it requires multiple queries-one per user-which results in O(n * m) time complexity, where n is the number of users and m is the number of rides.
The optimal approach leverages SQL aggregation and joins. By performing a LEFT JOIN from Users to Rides and using SUM(distance), we can compute each user's total distance in a single query. Using COALESCE ensures users without rides get a distance of 0. This is efficient because it processes all data in a single pass with the database engine handling the aggregation.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * m) | O(n) | Multiple queries per user to sum rides |
| Optimal | O(n + m) | O(n) | Single query using LEFT JOIN and aggregation |
Algorithm Walkthrough
- Start with the
Userstable, as we need every user included in the output. - Perform a
LEFT JOINwith theRidestable onuser_id. This ensures users with no rides are still retained. - Use the
SUM(distance)aggregation to calculate the total distance per user. - Apply
COALESCEon the sum of distance to replaceNULLwith0for users with no rides. - Group the results by
user_idandnameto ensure aggregation is per user. - Order the final output by
user_idin ascending order.
Why it works: The LEFT JOIN guarantees all users appear in the output. Aggregating and grouping by user_id correctly sums all rides for each user, while COALESCE ensures users with no rides are not excluded.
Python Solution
# SQL-based solution in Python for LeetCode
class Solution:
def totalDistance(self) -> str:
return """
SELECT
u.user_id,
u.name,
COALESCE(SUM(r.distance), 0) AS traveled_distance
FROM Users u
LEFT JOIN Rides r ON u.user_id = r.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id ASC
"""
This Python solution returns the SQL query as a string, suitable for LeetCode’s database problems. It follows the algorithm exactly: LEFT JOIN preserves all users, SUM aggregates distances, COALESCE handles nulls, and the GROUP BY ensures totals are per user. The ORDER BY clause sorts the result.
Go Solution
// Go SQL solution typically returns a string for execution
func TotalDistance() string {
return `
SELECT
u.user_id,
u.name,
COALESCE(SUM(r.distance), 0) AS traveled_distance
FROM Users u
LEFT JOIN Rides r ON u.user_id = r.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id ASC
`
}
In Go, the implementation is nearly identical to Python. Go often returns a string that will be executed in the SQL engine. The syntax and logic remain consistent, and COALESCE ensures null distances are converted to 0.
Worked Examples
For the input:
Users table:
+---------+---------+
| user_id | name |
+---------+---------+
| 17 | Addison |
| 14 | Ethan |
| 4 | Michael |
| 2 | Avery |
| 10 | Eleanor |
+---------+---------+
Rides table:
+---------+---------+----------+
| ride_id | user_id | distance |
+---------+---------+----------+
| 72 | 17 | 160 |
| 42 | 14 | 161 |
| 45 | 4 | 59 |
| 32 | 2 | 197 |
| 15 | 4 | 357 |
| 56 | 2 | 196 |
| 10 | 14 | 25 |
+---------+---------+----------+
Step by step:
LEFT JOINproduces a table with all users and matching rides, with null for missing rides.SUM(distance)aggregates distances per user:
- User 2: 197 + 196 = 393
- User 4: 59 + 357 = 416
- User 10: NULL → 0
- User 14: 161 + 25 = 186
- User 17: 160
COALESCEconverts nulls to 0.ORDER BY user_idresults in:
+---------+---------+-------------------+
| user_id | name | traveled_distance |
+---------+---------+-------------------+
| 2 | Avery | 393 |
| 4 | Michael | 416 |
| 10 | Eleanor | 0 |
| 14 | Ethan | 186 |
| 17 | Addison | 160 |
+---------+---------+-------------------+
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | The LEFT JOIN and aggregation process each user and ride once. |
| Space | O(n) | Storing the intermediate aggregated results requires space proportional to the number of users. |
The algorithm is linear with respect to the total number of users and rides because each row is processed once, and the database engine handles aggregation efficiently.
Test Cases
# test cases
# Test case 1: Example from the problem statement
assert Solution().totalDistance() == """
SELECT
u.user_id,
u.name,
COALESCE(SUM(r.distance), 0) AS traveled_distance
FROM Users u
LEFT JOIN Rides r ON u.user_id = r.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id ASC
""" # Validates correct SQL generation
# Test case 2: No rides at all
# Users exist but Rides table is empty, distances should be 0
# Test case 3: All users have exactly one ride
# Test case 4: User with multiple rides
# Test case 5: Large numbers of rides for performance testing
| Test | Why |
|---|---|
| Users with rides | Validates aggregation sums correctly |
| Users without rides | Validates COALESCE handles nulls |
| Multiple rides per user | Checks that sum aggregation works |
| Single ride per user | Ensures correct handling of minimal data |
| Empty rides table | Ensures no user is omitted |
Edge Cases
One important edge case is when a user has no rides. If the SQL used an inner join instead of a left join, that user would be omitted. Using LEFT JOIN with COALESCE ensures the output distance is 0 and the user appears.
Another edge case is users with multiple rides. It is essential that the sum of distances is correctly aggregated per user. Using GROUP BY ensures that all rides are grouped under the correct user_id and name.
A third edge case is an empty Rides table. This scenario tests whether the query correctly handles situations with no ride records. The use of COALESCE(SUM(...), 0) guarantees that users still appear with a distance of 0 even when the rides table is empty.