LeetCode 2687 - Bikes Last Time Used

The problem requires us to identify the last time each bike was used based on a table of rides. Each row of the table corresponds to a unique ride and contains the rideid, bikenumber, starttime, and endtime.

LeetCode Problem 2687

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem requires us to identify the last time each bike was used based on a table of rides. Each row of the table corresponds to a unique ride and contains the ride_id, bike_number, start_time, and end_time. Our task is to produce a result showing the latest end_time for each bike. Additionally, the results must be sorted in descending order of end_time so that bikes used most recently appear first.

The key points to understand are that multiple rides can exist for the same bike, but we are only interested in the most recent end_time. The ride_id column is unique but does not impact our output directly. The start_time is not relevant for finding the last usage, though it can help clarify the chronological ordering if needed. The constraints indicate that all start_time and end_time values are valid datetime entries and that the dataset may be reasonably large, so an efficient solution is preferable. Edge cases include bikes with only one ride and bikes with rides sharing the same end_time.

Approaches

The brute-force approach is to iterate over each unique bike_number, then scan all rows in the Bikes table to identify the maximum end_time for that bike. This approach is correct because it guarantees that we consider all rides for each bike. However, it is inefficient since it could require scanning the full table multiple times, leading to a time complexity of O(n * m), where n is the number of unique bikes and m is the total number of rides.

A more optimal approach is to group rides by bike_number and use an aggregate function to compute the maximum end_time per bike. SQL provides the MAX() function along with GROUP BY to achieve this efficiently. Once we have the latest end_time for each bike, we can order the results in descending order of end_time. This approach works in a single pass over the grouped data and scales linearly with the number of rows.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m) O(1) Iterates over each bike and scans the full table
Optimal O(m log m) O(k) Uses GROUP BY with MAX() and sorts results, where k is number of bikes

Algorithm Walkthrough

  1. Begin by selecting the bike_number column from the Bikes table.
  2. For each bike_number, use the SQL aggregate function MAX(end_time) to find the most recent ride for that bike. The GROUP BY bike_number clause ensures we calculate the maximum per bike rather than globally.
  3. Sort the resulting rows in descending order of MAX(end_time) so that bikes used most recently appear first.
  4. Return the columns bike_number and end_time with the result of the aggregation and sorting.

Why it works: The MAX() function guarantees the latest datetime for each bike, and GROUP BY ensures that we do not mix rides across different bikes. Sorting by this maximum ensures that the ordering criterion specified in the problem is met.

Python Solution

Since this is a database problem, a Python solution would typically involve an ORM or database query. For demonstration, we can show a pandas-based solution to mimic SQL logic:

import pandas as pd

def bikes_last_time_used(bikes: pd.DataFrame) -> pd.DataFrame:
    result = bikes.groupby('bike_number', as_index=False)['end_time'].max()
    result = result.sort_values(by='end_time', ascending=False)
    return result

This implementation first groups the dataframe by bike_number and calculates the maximum end_time. It then sorts the results in descending order to match the problem requirements. The as_index=False ensures that bike_number remains a column in the output rather than becoming the index.

Go Solution

In Go, you would typically run this as a SQL query rather than processing in native Go code due to the relational nature of the data. Here is an example using a SQL query string:

func bikesLastTimeUsedQuery() string {
    return `
    SELECT bike_number, MAX(end_time) AS end_time
    FROM Bikes
    GROUP BY bike_number
    ORDER BY end_time DESC
    `
}

Go-specific differences are minimal since SQL is used to handle aggregation and sorting. Handling native slices or maps is not necessary because relational aggregation is more efficiently done in the database.

Worked Examples

Example 1: Bikes table

ride_id bike_number start_time end_time
1 W00576 2012-03-25 11:30 2012-03-25 12:40
2 W00300 2012-03-25 10:30 2012-03-25 10:50
3 W00455 2012-03-26 14:30 2012-03-26 17:40
4 W00455 2012-03-25 12:30 2012-03-25 13:40
5 W00576 2012-03-25 08:10 2012-03-25 09:10
6 W00576 2012-03-28 02:30 2012-03-28 02:50

Step by step:

  1. Group by bike_number:
bike_number end_time
W00576 2012-03-28 02:50
W00300 2012-03-25 10:50
W00455 2012-03-26 17:40
  1. Sort by end_time descending:
bike_number end_time
W00576 2012-03-28 02:50
W00455 2012-03-26 17:40
W00300 2012-03-25 10:50

Complexity Analysis

Measure Complexity Explanation
Time O(m log m) Grouping by bike_number is O(m), sorting by end_time is O(k log k) where k <= m
Space O(k) Storing one entry per bike for aggregation

The complexity is dominated by sorting the aggregated results by end_time.

Test Cases

import pandas as pd

# provided example
bikes = pd.DataFrame([
    [1, "W00576", "2012-03-25 11:30:00", "2012-03-25 12:40:00"],
    [2, "W00300", "2012-03-25 10:30:00", "2012-03-25 10:50:00"],
    [3, "W00455", "2012-03-26 14:30:00", "2012-03-26 17:40:00"],
    [4, "W00455", "2012-03-25 12:30:00", "2012-03-25 13:40:00"],
    [5, "W00576", "2012-03-25 08:10:00", "2012-03-25 09:10:00"],
    [6, "W00576", "2012-03-28 02:30:00", "2012-03-28 02:50:00"]
], columns=["ride_id", "bike_number", "start_time", "end_time"])

result = bikes_last_time_used(bikes)
assert result.iloc[0]["bike_number"] == "W00576"  # most recent
assert result.iloc[1]["bike_number"] == "W00455"
assert result.iloc[2]["bike_number"] == "W00300"

# edge case: single ride
bikes_single = pd.DataFrame([[1, "B001", "2022-01-01 00:00:00", "2022-01-01 01:00:00"]],
                            columns=["ride_id", "bike_number", "start_time", "end_time"])
result_single = bikes_last_time_used(bikes_single)
assert result_single.iloc[0]["bike_number"] == "B001"

# edge case: same end_time
bikes_same_time = pd.DataFrame([
    [1, "B001", "2022-01-01 00:00:00", "2022-01-01 01:00:00"],
    [2, "B002", "2022-01-01 00:10:00", "2022-01-01 01:00:00"]
], columns=["ride_id", "bike_number", "start_time", "end_time"])
result_same = bikes_last_time_used(bikes_same_time)
assert set(result_same["bike_number"]) == {"B001", "B002"}
Test Why
provided example validates multiple rides per bike and correct ordering
single ride ensures bikes with only one ride are handled
same end_time ensures sorting handles ties without dropping any bikes