LeetCode 2388 - Change Null Values in a Table to the Previous Value

The problem asks us to process a database table, CoffeeShop, which contains two columns: id and drink. Each row represents a drink order. Some drink values may be NULL.

LeetCode Problem 2388

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

The problem asks us to process a database table, CoffeeShop, which contains two columns: id and drink. Each row represents a drink order. Some drink values may be NULL. The task is to replace every NULL in the drink column with the most recent non-NULL value from the previous rows, preserving the order of rows as given by the id sequence in the table.

In other words, we are performing a forward-fill operation on the drink column, similar to what is often done in time series data to fill missing values. The problem guarantees that the first row always has a non-NULL value, so we never have to handle a NULL without a previous value to reference.

The expected output is the same table, with all NULL values replaced by the most recent previous non-NULL drink, keeping the original row order intact.

Key constraints and considerations are that the table can have arbitrary ordering of ids, so a simple sequential scan is required, and we must handle consecutive NULLs correctly.

Important edge cases include consecutive NULLs, the last row being NULL, and minimal tables (one or two rows). The guarantee that the first row is not NULL simplifies our handling.

Approaches

A brute-force approach would iterate over each row, and for every NULL encountered, perform another backward scan to find the nearest previous non-NULL drink. This approach is correct because it ensures every NULL is replaced with the immediate prior value, but it is inefficient because it could require scanning almost the entire table for every NULL, resulting in a time complexity of O(n²) in the worst case.

The optimal approach leverages a forward pass using a simple variable to keep track of the last non-NULL drink encountered. As we iterate over the rows in order, if a row's drink is not NULL, we update this variable. If it is NULL, we replace it immediately with the stored last non-NULL drink. This avoids repeated backward scanning and reduces the time complexity to O(n) with constant extra space.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) For each NULL, scan backwards to find previous value
Optimal O(n) O(1) Forward scan storing last non-NULL value

Algorithm Walkthrough

  1. Initialize a variable last_drink to NULL. This variable will track the most recent non-NULL drink as we scan.
  2. Iterate over each row of the CoffeeShop table in order of appearance (id order is preserved in input).
  3. For each row, check if the drink value is NULL.
  4. If the drink is not NULL, update last_drink to this value.
  5. If the drink is NULL, replace it with the value stored in last_drink.
  6. Continue until all rows have been processed.
  7. Return the modified table, preserving the original order.

Why it works: The forward-scan invariant ensures that at every step, last_drink contains the most recent non-NULL value seen so far. This guarantees that every NULL is replaced with the correct preceding value without needing any backward scanning.

Python Solution

import pandas as pd

def change_nulls(coffee_shop: pd.DataFrame) -> pd.DataFrame:
    last_drink = None
    drinks = coffee_shop['drink'].tolist()
    for i in range(len(drinks)):
        if drinks[i] is not None:
            last_drink = drinks[i]
        else:
            drinks[i] = last_drink
    coffee_shop['drink'] = drinks
    return coffee_shop

In this implementation, we first convert the drink column to a list to allow in-place modification. We iterate through the list, updating last_drink whenever we encounter a non-NULL value. When a NULL is found, it is replaced immediately. Finally, we update the DataFrame column and return the result.

Go Solution

package main

type CoffeeShop struct {
    Id    int
    Drink *string
}

func changeNulls(coffeeShop []CoffeeShop) []CoffeeShop {
    var lastDrink string
    for i := 0; i < len(coffeeShop); i++ {
        if coffeeShop[i].Drink != nil {
            lastDrink = *coffeeShop[i].Drink
        } else {
            coffeeShop[i].Drink = &lastDrink
        }
    }
    return coffeeShop
}

In Go, we use a slice of CoffeeShop structs. The Drink field is a pointer to string to accommodate NULL values. As we iterate, we update the lastDrink variable when a non-nil value is encountered and replace nil pointers with a reference to lastDrink for NULL entries.

Worked Examples

For the input:

+----+-------------------+
| id | drink             |
+----+-------------------+
| 9  | Rum and Coke      |
| 6  | null              |
| 7  | null              |
| 3  | St Germain Spritz |
| 1  | Orange Margarita  |
| 2  | null              |
+----+-------------------+

Step by step state of last_drink:

Row Original drink last_drink Updated drink
9 Rum and Coke Rum and Coke Rum and Coke
6 null Rum and Coke Rum and Coke
7 null Rum and Coke Rum and Coke
3 St Germain Spritz St Germain Spritz St Germain Spritz
1 Orange Margarita Orange Margarita Orange Margarita
2 null Orange Margarita Orange Margarita

Complexity Analysis

Measure Complexity Explanation
Time O(n) Single pass through all rows, each row processed once
Space O(1) Only one variable last_drink is used for tracking

The time complexity is linear because each row is visited exactly once. Space complexity is constant as we only store a single variable for the last non-NULL drink.

Test Cases

import pandas as pd

# Basic provided example
df = pd.DataFrame({
    'id': [9,6,7,3,1,2],
    'drink': ['Rum and Coke', None, None, 'St Germain Spritz', 'Orange Margarita', None]
})
result = change_nulls(df)
assert result['drink'].tolist() == ['Rum and Coke', 'Rum and Coke', 'Rum and Coke', 'St Germain Spritz', 'Orange Margarita', 'Orange Margarita']

# Single row
df = pd.DataFrame({'id':[1], 'drink':['Latte']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Latte']

# All drinks non-null
df = pd.DataFrame({'id':[1,2,3], 'drink':['Espresso','Cappuccino','Mocha']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Espresso','Cappuccino','Mocha']

# Consecutive nulls after first
df = pd.DataFrame({'id':[1,2,3,4],'drink':['Tea', None, None, 'Coffee']})
result = change_nulls(df)
assert result['drink'].tolist() == ['Tea', 'Tea', 'Tea', 'Coffee']
Test Why
Provided example Checks forward-fill with multiple NULLs
Single row Edge case with minimal input
All non-null No replacements needed
Consecutive nulls Validates multiple consecutive NULLs are handled

Edge Cases

One edge case is when the first row is non-NULL but all subsequent rows are NULL. Our algorithm handles this by continuously filling last_drink, producing the correct output.

Another edge case is a table with only one row. This is valid because the first row is guaranteed non-NULL. The algorithm will leave the row unchanged, which is correct.

A third edge case involves alternating NULL and non-NULL values. The algorithm correctly updates last_drink whenever a non-NULL appears, ensuring all NULLs in between are replaced with the appropriate preceding value. This prevents errors that a naive implementation scanning backward each time could miss.