LeetCode 2153 - The Number of Passengers in Each Bus II

The problem describes a simulation of passengers arriving at a bus station and buses arriving to pick them up. Each bus has a unique ID, an arrival time, and a limited capacity. Each passenger has a unique ID and an arrival time.

LeetCode Problem 2153

Difficulty: 🔴 Hard
Topics: Database

Solution

Problem Understanding

The problem describes a simulation of passengers arriving at a bus station and buses arriving to pick them up. Each bus has a unique ID, an arrival time, and a limited capacity. Each passenger has a unique ID and an arrival time. Passengers who have arrived at or before the arrival of a bus, and who have not already boarded a previous bus, will attempt to board the bus. If the number of waiting passengers exceeds the bus's capacity, only the first capacity passengers in chronological order of arrival will board.

The input consists of two tables: Buses and Passengers. The Buses table provides bus IDs, arrival times, and capacities. The Passengers table provides passenger IDs and arrival times. The output should be a table listing each bus_id with the number of passengers that successfully boarded it, sorted in ascending order by bus_id.

Key points to notice: all bus IDs and passenger IDs are unique, bus arrival times are unique, capacities are positive integers, and no two buses arrive at the same time. Important edge cases include buses with zero waiting passengers, passengers arriving after all buses have departed, and buses with smaller capacity than the number of waiting passengers.

Approaches

The brute-force approach would simulate the scenario directly. For each bus, we would iterate over all passengers and select those who have arrived by the bus's arrival time and have not yet boarded a bus. After selecting the passengers, we would limit the boarding to the bus's capacity. This works but is inefficient, especially if there are thousands of passengers and buses, because for each bus we iterate through all passengers, leading to a potential O(n*m) complexity, where n is the number of buses and m is the number of passengers.

A more optimal solution relies on sorting both the buses and passengers by their arrival times. By using two pointers, one for buses and one for passengers, we can iterate through the passengers in chronological order and match them to buses efficiently. This ensures each passenger is considered exactly once and each bus processes only the relevant subset of passengers, reducing unnecessary comparisons.

Approach Time Complexity Space Complexity Notes
Brute Force O(n*m) O(n) Iterate each bus and filter all passengers who arrived before it
Optimal O(n log n + m log m) O(n + m) Sort buses and passengers, then use two pointers to simulate boarding

Algorithm Walkthrough

  1. Sort the Buses table by arrival_time and the Passengers table by arrival_time.

  2. Initialize a pointer p to iterate through passengers and an empty result list.

  3. For each bus in chronological order:

  4. Initialize a counter boarded to 0.

  5. While p points to a passenger who has arrived at or before the current bus's arrival and boarded is less than the bus capacity:

  6. Increment boarded by 1.

  7. Move the passenger pointer p to the next passenger.

  8. Record the bus_id and boarded count in the result list.

  9. After processing all buses, sort the result list by bus_id to produce the final output.

Why it works: The algorithm guarantees passengers board the earliest bus possible because we process buses in chronological order and only move the passenger pointer forward. Each bus respects its capacity limit, and each passenger boards at most one bus.

Python Solution

import pandas as pd

class Solution:
    def numberOfPassengers(self, Buses: pd.DataFrame, Passengers: pd.DataFrame) -> pd.DataFrame:
        # Sort buses and passengers by arrival time
        buses_sorted = Buses.sort_values('arrival_time').reset_index(drop=True)
        passengers_sorted = Passengers.sort_values('arrival_time').reset_index(drop=True)
        
        p = 0
        total_passengers = len(passengers_sorted)
        result = []

        for _, bus in buses_sorted.iterrows():
            capacity = bus['capacity']
            boarded = 0
            # Board passengers until capacity is filled or no eligible passengers left
            while p < total_passengers and passengers_sorted.at[p, 'arrival_time'] <= bus['arrival_time'] and boarded < capacity:
                boarded += 1
                p += 1
            result.append({'bus_id': bus['bus_id'], 'passengers_cnt': boarded})

        # Return result sorted by bus_id
        return pd.DataFrame(result).sort_values('bus_id').reset_index(drop=True)

This implementation first sorts both tables by arrival times to make chronological boarding straightforward. We use a pointer p to track the next passenger to consider, incrementing it as passengers board buses. For each bus, we count how many passengers board without exceeding its capacity. The final result is sorted by bus_id as required.

Go Solution

package main

import (
    "sort"
)

type Bus struct {
    BusID       int
    ArrivalTime int
    Capacity    int
}

type Passenger struct {
    PassengerID int
    ArrivalTime int
}

type BusResult struct {
    BusID          int
    PassengersCnt  int
}

func NumberOfPassengers(buses []Bus, passengers []Passenger) []BusResult {
    sort.Slice(buses, func(i, j int) bool { return buses[i].ArrivalTime < buses[j].ArrivalTime })
    sort.Slice(passengers, func(i, j int) bool { return passengers[i].ArrivalTime < passengers[j].ArrivalTime })

    result := []BusResult{}
    p := 0
    totalPassengers := len(passengers)

    for _, bus := range buses {
        boarded := 0
        for p < totalPassengers && passengers[p].ArrivalTime <= bus.ArrivalTime && boarded < bus.Capacity {
            boarded++
            p++
        }
        result = append(result, BusResult{BusID: bus.BusID, PassengersCnt: boarded})
    }

    // Sort by BusID to match the required output order
    sort.Slice(result, func(i, j int) bool { return result[i].BusID < result[j].BusID })
    return result
}

The Go implementation mirrors the Python version. We sort the slices of buses and passengers by arrival time, iterate with a pointer for passengers, and count boarded passengers for each bus. Go requires explicit slice manipulation and struct definitions for clarity and type safety.

Worked Examples

Using the example from the problem statement:

Bus arrival_time Bus capacity Passenger arrival_time Boarded
2 1 1 1
4 10 1, 12 1
7 2 5, 6, 7 2

Step-by-step:

  1. Sort buses: [2, 4, 7], passengers: [1, 1, 5, 6, 7].
  2. Bus 2: capacity 1, board passenger 11 → boarded=1, p=1.
  3. Bus 4: capacity 10, board passenger 12 → boarded=1, p=2.
  4. Bus 7: capacity 2, board passengers 13 and 14 → boarded=2, p=4.

Complexity Analysis

Measure Complexity Explanation
Time O(n log n + m log m) Sorting buses and passengers dominates the time complexity
Space O(n + m) We store sorted copies and results

Sorting is the main cost, while the two-pointer iteration runs in O(n + m), which is linear in the total number of buses and passengers.

Test Cases

# Provided example
Buses = pd.DataFrame({'bus_id':[1,2,3],'arrival_time':[2,4,7],'capacity':[1,10,2]})
Passengers = pd.DataFrame({'passenger_id':[11,12,13,14,15],'arrival_time':[1,1,5,6,7]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,1],[2,1],[3,2]]

# No passengers
Buses = pd.DataFrame({'bus_id':[1],'arrival_time':[5],'capacity':[2]})
Passengers = pd.DataFrame({'passenger_id':[],'arrival_time':[]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,0]]

# No buses
Buses = pd.DataFrame({'bus_id':[],'arrival_time':[],'capacity':[]})
Passengers = pd.DataFrame({'passenger_id':[1],'arrival_time':[1]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == []

# Bus capacity smaller than waiting passengers
Buses = pd.DataFrame({'bus_id':[1],'arrival_time':[5],'capacity':[1]})
Passengers = pd.DataFrame({'passenger_id':[1,2,3],'arrival_time':[1,2,3]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,1]]

# Multiple buses, passengers arriving exactly at bus arrival
Buses = pd.DataFrame({'bus_id':[1,2],'arrival_time':[2,4],'capacity':[2,1]})
Passengers = pd.DataFrame({'passenger_id':[1,2,3],'arrival_time':[2,3,4]})
assert Solution().numberOfPassengers(Buses, Passengers).values.tolist() == [[1,2],[2,1