LeetCode 2837 - Total Traveled Distance

The problem requires calculating the total distance traveled by each user based on ride data stored in a relational database. We are given two tables: Users and Rides. The Users table contains userid and name, where userid is unique.

LeetCode Problem 2837

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem requires calculating the total distance traveled by each user based on ride data stored in a relational database. We are given two tables: Users and Rides. The Users table contains user_id and name, where user_id is unique. The Rides table contains ride_id, user_id, and distance, where ride_id is unique. Each row in Rides corresponds to a ride taken by a user.

The task is to output a table listing each user_id, name, and the total distance traveled. If a user has not taken any rides, their distance should be 0. The final result should be ordered by user_id in ascending order.

Important observations include that some users may not have rides at all, so we must include all users in the output. This hints at the need for a LEFT JOIN rather than an inner join to avoid losing users without rides. Edge cases could include users with zero rides, users with multiple rides, and users with very large cumulative distances.

Approaches

The brute-force approach would involve iterating through each user and summing up their rides by querying the Rides table individually for each user. While this works and is logically correct, it is inefficient because it requires multiple queries-one per user-which results in O(n * m) time complexity, where n is the number of users and m is the number of rides.

The optimal approach leverages SQL aggregation and joins. By performing a LEFT JOIN from Users to Rides and using SUM(distance), we can compute each user's total distance in a single query. Using COALESCE ensures users without rides get a distance of 0. This is efficient because it processes all data in a single pass with the database engine handling the aggregation.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m) O(n) Multiple queries per user to sum rides
Optimal O(n + m) O(n) Single query using LEFT JOIN and aggregation

Algorithm Walkthrough

  1. Start with the Users table, as we need every user included in the output.
  2. Perform a LEFT JOIN with the Rides table on user_id. This ensures users with no rides are still retained.
  3. Use the SUM(distance) aggregation to calculate the total distance per user.
  4. Apply COALESCE on the sum of distance to replace NULL with 0 for users with no rides.
  5. Group the results by user_id and name to ensure aggregation is per user.
  6. Order the final output by user_id in ascending order.

Why it works: The LEFT JOIN guarantees all users appear in the output. Aggregating and grouping by user_id correctly sums all rides for each user, while COALESCE ensures users with no rides are not excluded.

Python Solution

# SQL-based solution in Python for LeetCode
class Solution:
    def totalDistance(self) -> str:
        return """
        SELECT 
            u.user_id,
            u.name,
            COALESCE(SUM(r.distance), 0) AS traveled_distance
        FROM Users u
        LEFT JOIN Rides r ON u.user_id = r.user_id
        GROUP BY u.user_id, u.name
        ORDER BY u.user_id ASC
        """

This Python solution returns the SQL query as a string, suitable for LeetCode’s database problems. It follows the algorithm exactly: LEFT JOIN preserves all users, SUM aggregates distances, COALESCE handles nulls, and the GROUP BY ensures totals are per user. The ORDER BY clause sorts the result.

Go Solution

// Go SQL solution typically returns a string for execution
func TotalDistance() string {
    return `
    SELECT 
        u.user_id,
        u.name,
        COALESCE(SUM(r.distance), 0) AS traveled_distance
    FROM Users u
    LEFT JOIN Rides r ON u.user_id = r.user_id
    GROUP BY u.user_id, u.name
    ORDER BY u.user_id ASC
    `
}

In Go, the implementation is nearly identical to Python. Go often returns a string that will be executed in the SQL engine. The syntax and logic remain consistent, and COALESCE ensures null distances are converted to 0.

Worked Examples

For the input:

Users table:

+---------+---------+
| user_id | name    |
+---------+---------+
| 17      | Addison |
| 14      | Ethan   |
| 4       | Michael |
| 2       | Avery   |
| 10      | Eleanor |
+---------+---------+

Rides table:

+---------+---------+----------+
| ride_id | user_id | distance |
+---------+---------+----------+
| 72      | 17      | 160      |
| 42      | 14      | 161      |
| 45      | 4       | 59       |
| 32      | 2       | 197      |
| 15      | 4       | 357      |
| 56      | 2       | 196      |
| 10      | 14      | 25       |
+---------+---------+----------+

Step by step:

  1. LEFT JOIN produces a table with all users and matching rides, with null for missing rides.
  2. SUM(distance) aggregates distances per user:
  • User 2: 197 + 196 = 393
  • User 4: 59 + 357 = 416
  • User 10: NULL → 0
  • User 14: 161 + 25 = 186
  • User 17: 160
  1. COALESCE converts nulls to 0.
  2. ORDER BY user_id results in:
+---------+---------+-------------------+
| user_id | name    | traveled_distance |
+---------+---------+-------------------+
| 2       | Avery   | 393               |
| 4       | Michael | 416               |
| 10      | Eleanor | 0                 |
| 14      | Ethan   | 186               |
| 17      | Addison | 160               |
+---------+---------+-------------------+

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) The LEFT JOIN and aggregation process each user and ride once.
Space O(n) Storing the intermediate aggregated results requires space proportional to the number of users.

The algorithm is linear with respect to the total number of users and rides because each row is processed once, and the database engine handles aggregation efficiently.

Test Cases

# test cases
# Test case 1: Example from the problem statement
assert Solution().totalDistance() == """
SELECT 
    u.user_id,
    u.name,
    COALESCE(SUM(r.distance), 0) AS traveled_distance
FROM Users u
LEFT JOIN Rides r ON u.user_id = r.user_id
GROUP BY u.user_id, u.name
ORDER BY u.user_id ASC
"""  # Validates correct SQL generation

# Test case 2: No rides at all
# Users exist but Rides table is empty, distances should be 0
# Test case 3: All users have exactly one ride
# Test case 4: User with multiple rides
# Test case 5: Large numbers of rides for performance testing
Test Why
Users with rides Validates aggregation sums correctly
Users without rides Validates COALESCE handles nulls
Multiple rides per user Checks that sum aggregation works
Single ride per user Ensures correct handling of minimal data
Empty rides table Ensures no user is omitted

Edge Cases

One important edge case is when a user has no rides. If the SQL used an inner join instead of a left join, that user would be omitted. Using LEFT JOIN with COALESCE ensures the output distance is 0 and the user appears.

Another edge case is users with multiple rides. It is essential that the sum of distances is correctly aggregated per user. Using GROUP BY ensures that all rides are grouped under the correct user_id and name.

A third edge case is an empty Rides table. This scenario tests whether the query correctly handles situations with no ride records. The use of COALESCE(SUM(...), 0) guarantees that users still appear with a distance of 0 even when the rides table is empty.