LeetCode 1396 - Design Underground System
This problem asks us to design a small tracking system for an underground railway network. The system must support three
Difficulty: 🟡 Medium
Topics: Hash Table, String, Design
Solution
Problem Understanding
This problem asks us to design a small tracking system for an underground railway network. The system must support three operations:
- Recording when a customer checks into a station.
- Recording when a customer checks out from a station.
- Calculating the average travel time between two stations.
Each customer is identified by a unique integer id. When a customer checks in, we are given the station name and the check-in time. Later, when that same customer checks out, we are given the destination station and checkout time.
A trip is defined as a direct journey from one station to another. For example, if someone checks in at "Leyton" and checks out at "Waterloo", that counts as one "Leyton" -> "Waterloo" trip.
The key requirement is that getAverageTime(startStation, endStation) must return the average duration across all completed trips for that exact route.
The problem guarantees that all operations are valid and occur in chronological order. This means:
- A user will never check out before checking in.
- A user will never be checked into multiple stations simultaneously.
- Every
getAverageTimequery refers to at least one completed trip.
These guarantees simplify the implementation because we do not need defensive validation logic.
The constraints are relatively small, with at most 2 * 10^4 total operations. Even though the limits are not extremely large, the problem is fundamentally about designing efficient state management using appropriate data structures.
The most important observation is that we do not need to store every historical trip individually. To compute averages, we only need:
- The total accumulated travel time for a route.
- The number of completed trips for that route.
Important edge cases include:
- Multiple users traveling simultaneously.
- Multiple trips between the same pair of stations.
- Routes that exist in both directions, since
"A" -> "B"and"B" -> "A"are different routes. - Calling
getAverageTimerepeatedly before new trips occur. - Users checking in and out interleaved with other users.
A naive implementation can easily become inefficient if it repeatedly scans all trips whenever an average is requested.
Approaches
Brute Force Approach
A straightforward solution is to store every completed trip individually.
We can maintain:
- A map from customer ID to active check-in information.
- Another map from
(startStation, endStation)to a list of trip durations.
When a customer checks out:
- Retrieve their check-in data.
- Compute the travel duration.
- Append the duration to the route's list.
When getAverageTime is called:
- Retrieve the list of durations for the route.
- Sum all durations.
- Divide by the number of trips.
This approach is correct because every trip duration is explicitly stored, so the average can always be computed accurately.
However, it is inefficient because every average query requires iterating through all trips for that route. If many trips exist, repeated average calculations become expensive.
Optimal Approach
The key insight is that averages do not require storing every individual trip.
To compute an average, we only need:
- The total sum of all travel times.
- The total number of trips.
Using this observation, we can maintain:
-
A hash map of active check-ins:
-
id -> (stationName, checkInTime) -
A hash map of route statistics:
-
(startStation, endStation) -> (totalTime, tripCount)
When a customer checks out:
- Retrieve their check-in information.
- Compute the trip duration.
- Update the route's cumulative total time.
- Increment the route's trip count.
Now getAverageTime becomes extremely efficient because the answer is simply:
average = totalTime / tripCount
No iteration over historical trips is necessary.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) per average query | O(n) | Stores every trip duration individually |
| Optimal | O(1) per operation | O(n) | Stores only aggregate statistics |
Algorithm Walkthrough
- Create a hash map called
check_ins.
This map stores currently active passengers. The key is the passenger ID, and the value contains:
- The station where they checked in
- The check-in time
Example:
45 -> ("Leyton", 3)
- Create another hash map called
route_stats.
This map stores aggregate statistics for every route.
The key is a pair:
(startStation, endStation)
The value contains:
- Total accumulated travel time
- Total number of completed trips
Example:
("Leyton", "Waterloo") -> (22, 2)
- When
checkIn(id, stationName, t)is called:
Store the passenger's current trip information in check_ins.
This allows us to later match the checkout event with the corresponding check-in event.
4. When checkOut(id, stationName, t) is called:
Retrieve the passenger's check-in data from check_ins.
Compute the trip duration:
duration = checkoutTime - checkinTime
Construct the route key:
(startStation, endStation)
Update the cumulative route statistics:
- Add the duration to total travel time
- Increment the trip count
Remove the passenger from check_ins because their journey is complete.
5. When getAverageTime(startStation, endStation) is called:
Retrieve the stored statistics for the route.
Compute:
totalTime / tripCount
Return the result as a floating-point value.
Why it works
The algorithm works because every completed trip contributes exactly once to the corresponding route statistics.
For every route, we maintain two invariant values:
- The sum of all completed trip durations
- The count of completed trips
Since:
average = totalSum / totalCount
the returned value is always the correct average travel time.
The check_ins map guarantees that each checkout operation can correctly recover the matching check-in information.
Python Solution
class UndergroundSystem:
def __init__(self):
# id -> (station_name, check_in_time)
self.check_ins = {}
# (start_station, end_station) -> [total_time, trip_count]
self.route_stats = {}
def checkIn(self, id: int, stationName: str, t: int) -> None:
self.check_ins[id] = (stationName, t)
def checkOut(self, id: int, stationName: str, t: int) -> None:
start_station, check_in_time = self.check_ins[id]
duration = t - check_in_time
route = (start_station, stationName)
if route not in self.route_stats:
self.route_stats[route] = [0, 0]
self.route_stats[route][0] += duration
self.route_stats[route][1] += 1
del self.check_ins[id]
def getAverageTime(self, startStation: str, endStation: str) -> float:
total_time, trip_count = self.route_stats[(startStation, endStation)]
return total_time / trip_count
# Your UndergroundSystem object will be instantiated and called as such:
# obj = UndergroundSystem()
# obj.checkIn(id, stationName, t)
# obj.checkOut(id, stationName, t)
# param_3 = obj.getAverageTime(startStation, endStation)
The implementation uses two dictionaries.
The check_ins dictionary tracks all active passengers currently traveling in the system. Each passenger ID maps to the station and timestamp where the trip began.
The route_stats dictionary stores aggregate information for each route. Instead of storing every trip individually, we only keep the cumulative travel time and trip count.
During checkout, the code retrieves the original check-in information, computes the duration, and updates the corresponding route statistics.
The average query becomes extremely efficient because the result is directly computed from stored aggregates without scanning historical trips.
Removing the passenger from check_ins after checkout ensures that the active-trip state remains accurate.
Go Solution
package main
type CheckInData struct {
station string
time int
}
type RouteStats struct {
totalTime int
tripCount int
}
type UndergroundSystem struct {
checkIns map[int]CheckInData
routeData map[string]RouteStats
}
func Constructor() UndergroundSystem {
return UndergroundSystem{
checkIns: make(map[int]CheckInData),
routeData: make(map[string]RouteStats),
}
}
func (this *UndergroundSystem) CheckIn(id int, stationName string, t int) {
this.checkIns[id] = CheckInData{
station: stationName,
time: t,
}
}
func (this *UndergroundSystem) CheckOut(id int, stationName string, t int) {
checkIn := this.checkIns[id]
duration := t - checkIn.time
routeKey := checkIn.station + "->" + stationName
stats := this.routeData[routeKey]
stats.totalTime += duration
stats.tripCount++
this.routeData[routeKey] = stats
delete(this.checkIns, id)
}
func (this *UndergroundSystem) GetAverageTime(startStation string, endStation string) float64 {
routeKey := startStation + "->" + endStation
stats := this.routeData[routeKey]
return float64(stats.totalTime) / float64(stats.tripCount)
}
/**
* Your UndergroundSystem object will be instantiated and called as such:
* obj := Constructor()
* obj.CheckIn(id, stationName, t)
* obj.CheckOut(id, stationName, t)
* param_3 := obj.GetAverageTime(startStation, endStation)
*/
The Go implementation follows the same logic as the Python version but uses structs and typed maps.
Since Go does not support tuple keys as conveniently as Python, the route key is represented as a concatenated string:
startStation + "->" + endStation
Go maps return zero-value structs automatically for missing keys, which simplifies updating route statistics.
The average calculation explicitly converts integers to float64 to avoid integer division.
Worked Examples
Example 1
Operations:
checkIn(45, "Leyton", 3)
checkIn(32, "Paradise", 8)
checkIn(27, "Leyton", 10)
Current check_ins:
| ID | Station | Time |
|---|---|---|
| 45 | Leyton | 3 |
| 32 | Paradise | 8 |
| 27 | Leyton | 10 |
route_stats is still empty because nobody has checked out yet.
Now:
checkOut(45, "Waterloo", 15)
Duration:
15 - 3 = 12
Updated route_stats:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Waterloo | 12 | 1 |
Next:
checkOut(27, "Waterloo", 20)
Duration:
20 - 10 = 10
Updated statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Waterloo | 22 | 2 |
Next:
checkOut(32, "Cambridge", 22)
Duration:
22 - 8 = 14
Updated statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Waterloo | 22 | 2 |
| Paradise -> Cambridge | 14 | 1 |
Now:
getAverageTime("Paradise", "Cambridge")
Calculation:
14 / 1 = 14
Next:
getAverageTime("Leyton", "Waterloo")
Calculation:
22 / 2 = 11
Now another trip:
checkIn(10, "Leyton", 24)
checkOut(10, "Waterloo", 38)
Duration:
38 - 24 = 14
Updated statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Waterloo | 36 | 3 |
Final query:
36 / 3 = 12
Example 2
Initial trip:
checkIn(10, "Leyton", 3)
checkOut(10, "Paradise", 8)
Duration:
5
Statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Paradise | 5 | 1 |
Average:
5 / 1 = 5
Second trip:
checkIn(5, "Leyton", 10)
checkOut(5, "Paradise", 16)
Duration:
6
Updated statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Paradise | 11 | 2 |
Average:
11 / 2 = 5.5
Third trip:
checkIn(2, "Leyton", 21)
checkOut(2, "Paradise", 30)
Duration:
9
Updated statistics:
| Route | Total Time | Trip Count |
|---|---|---|
| Leyton -> Paradise | 20 | 3 |
Average:
20 / 3 = 6.66667
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(1) | Every operation performs only constant-time hash map operations |
| Space | O(n) | Stores active passengers and route statistics |
The time complexity is constant for all three operations because hash map insertion, lookup, and update operations are average-case O(1).
The space complexity grows with:
- The number of active passengers currently checked in.
- The number of distinct station-to-station routes.
In the worst case, both can grow proportionally to the number of operations.
Test Cases
# Example 1
system = UndergroundSystem()
system.checkIn(45, "Leyton", 3)
system.checkIn(32, "Paradise", 8)
system.checkIn(27, "Leyton", 10)
system.checkOut(45, "Waterloo", 15)
system.checkOut(27, "Waterloo", 20)
system.checkOut(32, "Cambridge", 22)
assert system.getAverageTime("Paradise", "Cambridge") == 14.0 # single trip
assert system.getAverageTime("Leyton", "Waterloo") == 11.0 # average of 12 and 10
system.checkIn(10, "Leyton", 24)
assert system.getAverageTime("Leyton", "Waterloo") == 11.0 # unchanged before checkout
system.checkOut(10, "Waterloo", 38)
assert system.getAverageTime("Leyton", "Waterloo") == 12.0 # average of 12, 10, 14
# Example 2
system = UndergroundSystem()
system.checkIn(10, "Leyton", 3)
system.checkOut(10, "Paradise", 8)
assert system.getAverageTime("Leyton", "Paradise") == 5.0 # one trip
system.checkIn(5, "Leyton", 10)
system.checkOut(5, "Paradise", 16)
assert system.getAverageTime("Leyton", "Paradise") == 5.5 # average of 5 and 6
system.checkIn(2, "Leyton", 21)
system.checkOut(2, "Paradise", 30)
assert abs(system.getAverageTime("Leyton", "Paradise") - 6.66667) < 1e-5 # floating-point average
# Different directions treated separately
system = UndergroundSystem()
system.checkIn(1, "A", 1)
system.checkOut(1, "B", 6)
system.checkIn(2, "B", 10)
system.checkOut(2, "A", 20)
assert system.getAverageTime("A", "B") == 5.0 # A -> B
assert system.getAverageTime("B", "A") == 10.0 # B -> A
# Multiple simultaneous passengers
system = UndergroundSystem()
system.checkIn(1, "X", 1)
system.checkIn(2, "X", 2)
system.checkIn(3, "X", 3)
system.checkOut(1, "Y", 11)
system.checkOut(2, "Y", 12)
system.checkOut(3, "Y", 13)
assert system.getAverageTime("X", "Y") == 10.0 # all trips duration 10
# Single passenger multiple trips
system = UndergroundSystem()
system.checkIn(1, "A", 5)
system.checkOut(1, "B", 15)
system.checkIn(1, "A", 20)
system.checkOut(1, "B", 40)
assert system.getAverageTime("A", "B") == 15.0 # average of 10 and 20
# Minimum travel duration
system = UndergroundSystem()
system.checkIn(1, "S", 1)
system.checkOut(1, "T", 2)
assert system.getAverageTime("S", "T") == 1.0 # smallest valid duration
| Test | Why |
|---|---|
| Example 1 | Verifies the full workflow and cumulative averaging |
| Example 2 | Validates floating-point averages |
| Different directions | Confirms routes are directional |
| Multiple simultaneous passengers | Ensures active trips do not interfere |
| Single passenger multiple trips | Verifies repeated reuse of customer IDs |
| Minimum travel duration | Tests smallest valid time difference |
Edge Cases
One important edge case is when multiple users are traveling simultaneously. A flawed implementation might overwrite or confuse check-in records between passengers. This solution avoids that problem because every active trip is keyed by unique customer ID in the check_ins dictionary.
Another critical edge case is directional routes. The average travel time from "A" to "B" is not necessarily the same as "B" to "A". A buggy implementation might accidentally combine both directions into the same statistics bucket. This implementation correctly treats (startStation, endStation) as an ordered pair, so both routes remain independent.
A third important edge case involves repeated average queries before new trips occur. The result should remain unchanged until additional completed trips are added. Since the implementation stores cumulative totals rather than recomputing from raw data each time, repeated calls return stable and correct results efficiently.
Another subtle case is reusing the same passenger ID for multiple trips. After checkout, the old check-in record must be removed. Otherwise, future trips could accidentally use stale information. The implementation explicitly deletes the passenger from check_ins after processing checkout, ensuring that each trip is handled independently.