LeetCode 2388 - Change Null Values in a Table to the Previous Value
The problem asks us to process a database table, CoffeeShop, which contains two columns: id and drink. Each row represents a drink order. Some drink values may be NULL.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
The problem asks us to process a database table, CoffeeShop, which contains two columns: id and drink. Each row represents a drink order. Some drink values may be NULL. The task is to replace every NULL in the drink column with the most recent non-NULL value from the previous rows, preserving the order of rows as given by the id sequence in the table.
In other words, we are performing a forward-fill operation on the drink column, similar to what is often done in time series data to fill missing values. The problem guarantees that the first row always has a non-NULL value, so we never have to handle a NULL without a previous value to reference.
The expected output is the same table, with all NULL values replaced by the most recent previous non-NULL drink, keeping the original row order intact.
Key constraints and considerations are that the table can have arbitrary ordering of ids, so a simple sequential scan is required, and we must handle consecutive NULLs correctly.
Important edge cases include consecutive NULLs, the last row being NULL, and minimal tables (one or two rows). The guarantee that the first row is not NULL simplifies our handling.
Approaches
A brute-force approach would iterate over each row, and for every NULL encountered, perform another backward scan to find the nearest previous non-NULL drink. This approach is correct because it ensures every NULL is replaced with the immediate prior value, but it is inefficient because it could require scanning almost the entire table for every NULL, resulting in a time complexity of O(n²) in the worst case.
The optimal approach leverages a forward pass using a simple variable to keep track of the last non-NULL drink encountered. As we iterate over the rows in order, if a row's drink is not NULL, we update this variable. If it is NULL, we replace it immediately with the stored last non-NULL drink. This avoids repeated backward scanning and reduces the time complexity to O(n) with constant extra space.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | For each NULL, scan backwards to find previous value |
| Optimal | O(n) | O(1) | Forward scan storing last non-NULL value |
Algorithm Walkthrough
- Initialize a variable
last_drinktoNULL. This variable will track the most recent non-NULLdrink as we scan. - Iterate over each row of the
CoffeeShoptable in order of appearance (id order is preserved in input). - For each row, check if the
drinkvalue isNULL. - If the
drinkis notNULL, updatelast_drinkto this value. - If the
drinkisNULL, replace it with the value stored inlast_drink. - Continue until all rows have been processed.
- Return the modified table, preserving the original order.
Why it works: The forward-scan invariant ensures that at every step, last_drink contains the most recent non-NULL value seen so far. This guarantees that every NULL is replaced with the correct preceding value without needing any backward scanning.
Python Solution
import pandas as pd
def change_nulls(coffee_shop: pd.DataFrame) -> pd.DataFrame:
last_drink = None
drinks = coffee_shop['drink'].tolist()
for i in range(len(drinks)):
if drinks[i] is not None:
last_drink = drinks[i]
else:
drinks[i] = last_drink
coffee_shop['drink'] = drinks
return coffee_shop
In this implementation, we first convert the drink column to a list to allow in-place modification. We iterate through the list, updating last_drink whenever we encounter a non-NULL value. When a NULL is found, it is replaced immediately. Finally, we update the DataFrame column and return the result.
Go Solution
package main
type CoffeeShop struct {
Id int
Drink *string
}
func changeNulls(coffeeShop []CoffeeShop) []CoffeeShop {
var lastDrink string
for i := 0; i < len(coffeeShop); i++ {
if coffeeShop[i].Drink != nil {
lastDrink = *coffeeShop[i].Drink
} else {
coffeeShop[i].Drink = &lastDrink
}
}
return coffeeShop
}
In Go, we use a slice of CoffeeShop structs. The Drink field is a pointer to string to accommodate NULL values. As we iterate, we update the lastDrink variable when a non-nil value is encountered and replace nil pointers with a reference to lastDrink for NULL entries.
Worked Examples
For the input:
+----+-------------------+
| id | drink |
+----+-------------------+
| 9 | Rum and Coke |
| 6 | null |
| 7 | null |
| 3 | St Germain Spritz |
| 1 | Orange Margarita |
| 2 | null |
+----+-------------------+
Step by step state of last_drink:
| Row | Original drink | last_drink | Updated drink |
|---|---|---|---|
| 9 | Rum and Coke | Rum and Coke | Rum and Coke |
| 6 | null | Rum and Coke | Rum and Coke |
| 7 | null | Rum and Coke | Rum and Coke |
| 3 | St Germain Spritz | St Germain Spritz | St Germain Spritz |
| 1 | Orange Margarita | Orange Margarita | Orange Margarita |
| 2 | null | Orange Margarita | Orange Margarita |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Single pass through all rows, each row processed once |
| Space | O(1) | Only one variable last_drink is used for tracking |
The time complexity is linear because each row is visited exactly once. Space complexity is constant as we only store a single variable for the last non-NULL drink.
Test Cases
import pandas as pd
# Basic provided example
df = pd.DataFrame({
'id': [9,6,7,3,1,2],
'drink': ['Rum and Coke', None, None, 'St Germain Spritz', 'Orange Margarita', None]
})
result = change_nulls(df)
assert result['drink'].tolist() == ['Rum and Coke', 'Rum and Coke', 'Rum and Coke', 'St Germain Spritz', 'Orange Margarita', 'Orange Margarita']
# Single row
df = pd.DataFrame({'id':[1], 'drink':['Latte']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Latte']
# All drinks non-null
df = pd.DataFrame({'id':[1,2,3], 'drink':['Espresso','Cappuccino','Mocha']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Espresso','Cappuccino','Mocha']
# Consecutive nulls after first
df = pd.DataFrame({'id':[1,2,3,4],'drink':['Tea', None, None, 'Coffee']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Tea', 'Tea', 'Tea', 'Coffee']
| Test | Why |
|---|---|
| Provided example | Checks forward-fill with multiple NULLs |
| Single row | Edge case with minimal input |
| All non-null | No replacements needed |
| Consecutive nulls | Validates multiple consecutive NULLs are handled |
Edge Cases
One edge case is when the first row is non-NULL but all subsequent rows are NULL. Our algorithm handles this by continuously filling last_drink, producing the correct output.
Another edge case is a table with only one row. This is valid because the first row is guaranteed non-NULL. The algorithm will leave the row unchanged, which is correct.
A third edge case involves alternating NULL and non-NULL values. The algorithm correctly updates last_drink whenever a non-NULL appears, ensuring all NULLs in between are replaced with the appropriate preceding value. This prevents errors that a naive implementation scanning backward each time could miss.