LeetCode 2153 - The Number of Passengers in Each Bus II
The problem describes a simulation of passengers arriving at a bus station and buses arriving to pick them up. Each bus has a unique ID, an arrival time, and a limited capacity. Each passenger has a unique ID and an arrival time.
Difficulty: 🔴 Hard
Topics: Database
Solution
Problem Understanding
The problem describes a simulation of passengers arriving at a bus station and buses arriving to pick them up. Each bus has a unique ID, an arrival time, and a limited capacity. Each passenger has a unique ID and an arrival time. Passengers who have arrived at or before the arrival of a bus, and who have not already boarded a previous bus, will attempt to board the bus. If the number of waiting passengers exceeds the bus's capacity, only the first capacity passengers in chronological order of arrival will board.
The input consists of two tables: Buses and Passengers. The Buses table provides bus IDs, arrival times, and capacities. The Passengers table provides passenger IDs and arrival times. The output should be a table listing each bus_id with the number of passengers that successfully boarded it, sorted in ascending order by bus_id.
Key points to notice: all bus IDs and passenger IDs are unique, bus arrival times are unique, capacities are positive integers, and no two buses arrive at the same time. Important edge cases include buses with zero waiting passengers, passengers arriving after all buses have departed, and buses with smaller capacity than the number of waiting passengers.
Approaches
The brute-force approach would simulate the scenario directly. For each bus, we would iterate over all passengers and select those who have arrived by the bus's arrival time and have not yet boarded a bus. After selecting the passengers, we would limit the boarding to the bus's capacity. This works but is inefficient, especially if there are thousands of passengers and buses, because for each bus we iterate through all passengers, leading to a potential O(n*m) complexity, where n is the number of buses and m is the number of passengers.
A more optimal solution relies on sorting both the buses and passengers by their arrival times. By using two pointers, one for buses and one for passengers, we can iterate through the passengers in chronological order and match them to buses efficiently. This ensures each passenger is considered exactly once and each bus processes only the relevant subset of passengers, reducing unnecessary comparisons.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n*m) | O(n) | Iterate each bus and filter all passengers who arrived before it |
| Optimal | O(n log n + m log m) | O(n + m) | Sort buses and passengers, then use two pointers to simulate boarding |
Algorithm Walkthrough
-
Sort the
Busestable byarrival_timeand thePassengerstable byarrival_time. -
Initialize a pointer
pto iterate through passengers and an empty result list. -
For each bus in chronological order:
-
Initialize a counter
boardedto 0. -
While
ppoints to a passenger who has arrived at or before the current bus's arrival andboardedis less than the bus capacity: -
Increment
boardedby 1. -
Move the passenger pointer
pto the next passenger. -
Record the
bus_idandboardedcount in the result list. -
After processing all buses, sort the result list by
bus_idto produce the final output.
Why it works: The algorithm guarantees passengers board the earliest bus possible because we process buses in chronological order and only move the passenger pointer forward. Each bus respects its capacity limit, and each passenger boards at most one bus.
Python Solution
import pandas as pd
class Solution:
def numberOfPassengers(self, Buses: pd.DataFrame, Passengers: pd.DataFrame) -> pd.DataFrame:
# Sort buses and passengers by arrival time
buses_sorted = Buses.sort_values('arrival_time').reset_index(drop=True)
passengers_sorted = Passengers.sort_values('arrival_time').reset_index(drop=True)
p = 0
total_passengers = len(passengers_sorted)
result = []
for _, bus in buses_sorted.iterrows():
capacity = bus['capacity']
boarded = 0
# Board passengers until capacity is filled or no eligible passengers left
while p < total_passengers and passengers_sorted.at[p, 'arrival_time'] <= bus['arrival_time'] and boarded < capacity:
boarded += 1
p += 1
result.append({'bus_id': bus['bus_id'], 'passengers_cnt': boarded})
# Return result sorted by bus_id
return pd.DataFrame(result).sort_values('bus_id').reset_index(drop=True)
This implementation first sorts both tables by arrival times to make chronological boarding straightforward. We use a pointer p to track the next passenger to consider, incrementing it as passengers board buses. For each bus, we count how many passengers board without exceeding its capacity. The final result is sorted by bus_id as required.
Go Solution
package main
import (
"sort"
)
type Bus struct {
BusID int
ArrivalTime int
Capacity int
}
type Passenger struct {
PassengerID int
ArrivalTime int
}
type BusResult struct {
BusID int
PassengersCnt int
}
func NumberOfPassengers(buses []Bus, passengers []Passenger) []BusResult {
sort.Slice(buses, func(i, j int) bool { return buses[i].ArrivalTime < buses[j].ArrivalTime })
sort.Slice(passengers, func(i, j int) bool { return passengers[i].ArrivalTime < passengers[j].ArrivalTime })
result := []BusResult{}
p := 0
totalPassengers := len(passengers)
for _, bus := range buses {
boarded := 0
for p < totalPassengers && passengers[p].ArrivalTime <= bus.ArrivalTime && boarded < bus.Capacity {
boarded++
p++
}
result = append(result, BusResult{BusID: bus.BusID, PassengersCnt: boarded})
}
// Sort by BusID to match the required output order
sort.Slice(result, func(i, j int) bool { return result[i].BusID < result[j].BusID })
return result
}
The Go implementation mirrors the Python version. We sort the slices of buses and passengers by arrival time, iterate with a pointer for passengers, and count boarded passengers for each bus. Go requires explicit slice manipulation and struct definitions for clarity and type safety.
Worked Examples
Using the example from the problem statement:
| Bus arrival_time | Bus capacity | Passenger arrival_time | Boarded |
|---|---|---|---|
| 2 | 1 | 1 | 1 |
| 4 | 10 | 1, 12 | 1 |
| 7 | 2 | 5, 6, 7 | 2 |
Step-by-step:
- Sort buses: [2, 4, 7], passengers: [1, 1, 5, 6, 7].
- Bus 2: capacity 1, board passenger 11 → boarded=1, p=1.
- Bus 4: capacity 10, board passenger 12 → boarded=1, p=2.
- Bus 7: capacity 2, board passengers 13 and 14 → boarded=2, p=4.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n + m log m) | Sorting buses and passengers dominates the time complexity |
| Space | O(n + m) | We store sorted copies and results |
Sorting is the main cost, while the two-pointer iteration runs in O(n + m), which is linear in the total number of buses and passengers.
Test Cases
# Provided example
Buses = pd.DataFrame({'bus_id':[1,2,3],'arrival_time':[2,4,7],'capacity':[1,10,2]})
Passengers = pd.DataFrame({'passenger_id':[11,12,13,14,15],'arrival_time':[1,1,5,6,7]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,1],[2,1],[3,2]]
# No passengers
Buses = pd.DataFrame({'bus_id':[1],'arrival_time':[5],'capacity':[2]})
Passengers = pd.DataFrame({'passenger_id':[],'arrival_time':[]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,0]]
# No buses
Buses = pd.DataFrame({'bus_id':[],'arrival_time':[],'capacity':[]})
Passengers = pd.DataFrame({'passenger_id':[1],'arrival_time':[1]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == []
# Bus capacity smaller than waiting passengers
Buses = pd.DataFrame({'bus_id':[1],'arrival_time':[5],'capacity':[1]})
Passengers = pd.DataFrame({'passenger_id':[1,2,3],'arrival_time':[1,2,3]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,1]]
# Multiple buses, passengers arriving exactly at bus arrival
Buses = pd.DataFrame({'bus_id':[1,2],'arrival_time':[2,4],'capacity':[2,1]})
Passengers = pd.DataFrame({'passenger_id':[1,2,3],'arrival_time':[2,3,4]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,2],[2,1