LeetCode 2887 - Fill Missing Data

Difficulty: 🟢 Easy
Topics: —

Solution

Problem Understanding

The problem provides a Pandas DataFrame named products with three columns:

Column	Type
`name`	object
`quantity`	int
`price`	int

The task is to replace all missing values in the quantity column with 0.

In other words, some products may not have a recorded quantity. The problem shows these as None; Pandas stores a missing value in a numeric column as NaN. Each missing value must be replaced with 0, while all existing valid quantities stay unchanged.
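To make the None/NaN relationship concrete, here is a minimal sketch. The two-row frame is illustrative only (rows borrowed from the problem's example), and it assumes default Pandas dtype inference.

```python
import pandas as pd

# Illustrative two-row frame; not the full problem input.
products = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, 9319],
})

# The None is stored as NaN, which makes the column a float column.
quantity_dtype = products["quantity"].dtype

# isna() flags the missing entry regardless of how it was written.
missing_mask = products["quantity"].isna().tolist()
print(missing_mask)  # [True, False]
```

Because the column holds NaN rather than a Python None, checks such as `value is None` would miss it, while `pd.isna` and `isna()` detect it.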

The input is a Pandas DataFrame where each row represents a product and contains its name, quantity, and price. The output should be the same DataFrame structure, except that every missing value in the quantity column has been filled with 0.

The problem is classified as Easy because it mainly tests familiarity with Pandas operations for handling missing data rather than requiring an advanced algorithmic technique.

An important observation is that only the quantity column should be modified. Other columns, such as name and price, must remain untouched. Additionally, rows with valid quantities should not be changed.

Several edge cases are worth considering. The DataFrame might contain no missing values at all, in which case the output should remain unchanged. The quantity column might consist entirely of missing values, requiring every row to become 0. The DataFrame could also contain only a single row, which should still be processed correctly.

Approaches

Brute Force Approach

A straightforward approach is to iterate through every row in the DataFrame and manually inspect the quantity column. For each row, we check whether the quantity is missing. If it is, we explicitly assign 0 to that row.

This approach works because every row is examined exactly once, ensuring that all missing values are detected and replaced. However, it is unnecessarily verbose and slow in practice: each row is checked and updated individually in the Python interpreter, whereas Pandas already provides optimized vectorized operations for this exact task.

For example, we could loop over indices and write conditional logic:

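# Assumes `import pandas as pd` and an existing `products` DataFrame.
for index in products.index:
    # pd.isna is True for both None and NaN.
    if pd.isna(products.loc[index, "quantity"]):
        products.loc[index, "quantity"] = 0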

Although correct, this solution is not idiomatic Pandas code.

Optimal Approach

The key observation is that Pandas includes a built-in method, fillna(), that efficiently replaces missing values in a column or in a whole DataFrame.

Instead of iterating row by row, we can directly target the quantity column and replace every missing value with 0 in one vectorized operation:

products["quantity"] = products["quantity"].fillna(0)

This approach is cleaner, easier to read, and internally optimized by Pandas.
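As an aside, the same restriction to a single column can be sketched with DataFrame.fillna and a dict that maps column names to fill values. This is an alternative idiom, not the solution used in this write-up. The frame below is hypothetical and deliberately puts a gap in price, which the problem itself does not do, to show that columns outside the dict are left alone.

```python
import pandas as pd

# Hypothetical frame with a gap in both quantity and price.
products = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, None],
})

# Only the columns named in the dict are filled; price keeps its NaN.
filled = products.fillna({"quantity": 0})

print(filled["quantity"].tolist())      # [0.0, 779.0]
print(filled["price"].isna().tolist())  # [False, True]
```

Unlike the column assignment above, this call returns a new DataFrame and leaves `products` itself unchanged.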

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n)	O(1)	Iterates through every row manually and updates missing values
Optimal	O(n)	O(n)	Uses Pandas `fillna()`, which builds a new filled column in one vectorized pass

Algorithm Walkthrough

1. Access the quantity column from the products DataFrame, because this is the only column that requires modification.
2. Call fillna(0) on the column. It returns a new column in which every missing value (None or NaN) is replaced with 0; the original column is not modified by the call itself.
3. Assign the filled column back to products["quantity"] so the DataFrame reflects the updates.
4. Return the updated products DataFrame.
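The steps above can be traced one at a time in a short sketch. The frame is a two-row illustration, and the intermediate variable names (`filled`, `still_missing`, `remaining`) are hypothetical, introduced only to expose each step.

```python
import pandas as pd

products = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, 9319],
})

# Select the column and fill it; fillna returns a new Series.
filled = products["quantity"].fillna(0)

# Until the result is assigned back, the DataFrame still holds the NaN.
still_missing = int(products["quantity"].isna().sum())

# Assign the filled column back so the DataFrame reflects the change.
products["quantity"] = filled
remaining = int(products["quantity"].isna().sum())

print(still_missing, remaining)  # 1 0
```

This is why the assignment in the third step matters: calling fillna(0) and discarding the result would leave the DataFrame unchanged.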

Why it works

The algorithm works because fillna(0) replaces every missing value in the selected column with 0 while preserving all non-missing values exactly as they are. Since we only apply the operation to the quantity column, the rest of the DataFrame remains unchanged.

Python Solution

import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    products["quantity"] = products["quantity"].fillna(0)
    return products

The implementation directly follows the optimal algorithm. First, we select the quantity column using products["quantity"]. Then, we call fillna(0) to replace missing values with 0. The updated column is reassigned to the DataFrame, ensuring the modification persists. Finally, the modified DataFrame is returned.

This solution is concise and leverages Pandas vectorized operations, which are preferred over manual iteration for performance and readability.
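One detail worth knowing: a numeric column that contained NaN is a float column, so it stays float after filling (0.0 rather than 0). If integer quantities are wanted, a cast can follow the fill. The sketch below is an optional variant under that assumption; the name `fill_missing_as_int` is hypothetical, and the problem statement quoted here does not say whether the cast is required.

```python
import pandas as pd

def fill_missing_as_int(products: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical variant: fill the gaps, then cast the column to integers.
    # The cast is safe here because no NaN remains after fillna(0).
    products["quantity"] = products["quantity"].fillna(0).astype(int)
    return products

df = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, 9319],
})

result = fill_missing_as_int(df)
print(result["quantity"].tolist())  # [0, 779]
```

The order matters: calling astype(int) before fillna(0) would raise an error, because NaN cannot be converted to an integer.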

Go Solution

LeetCode Pandas problems are Python-specific and do not have a Go submission environment. However, if we conceptually translate the same logic into Go, the implementation would involve iterating through records and replacing missing quantities with 0.

package main

type Product struct {
	Name     string
	Quantity *int
	Price    int
}

func fillMissingValues(products []Product) []Product {
	for i := range products {
		if products[i].Quantity == nil {
			zero := 0
			products[i].Quantity = &zero
		}
	}

	return products
}

The main Go-specific difference is that Go does not have built-in DataFrame support like Pandas. Missing values are typically represented using pointers, where nil indicates absence. Instead of fillna(), we manually iterate through the slice and replace nil quantities with a pointer to 0.

Worked Examples

Example 1

Input DataFrame:

name	quantity	price
Wristwatch	None	135
WirelessEarbuds	None	821
GolfClubs	779	9319
Printer	849	3051

Step-by-Step Execution

We process the quantity column using:

products["quantity"] = products["quantity"].fillna(0)

State of the quantity column:

Row	Original Value	After `fillna(0)`
Wristwatch	None	0
WirelessEarbuds	None	0
GolfClubs	779	779
Printer	849	849

Final output:

name	quantity	price
Wristwatch	0	135
WirelessEarbuds	0	821
GolfClubs	779	9319
Printer	849	3051

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Pandas scans the `quantity` column once
Space	O(n)	`fillna(0)` builds a new column before it is assigned back

The time complexity is O(n) because each row in the quantity column must be inspected once to determine whether it contains a missing value. The extra space is O(n) because fillna(0) returns a new column of the same length rather than modifying the existing one in place; that new column then replaces the old one in the DataFrame. Only the row-by-row loop from the brute force approach updates values in place with O(1) extra space.

Test Cases

import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    products["quantity"] = products["quantity"].fillna(0)
    return products

# Test 1: Provided example
df = pd.DataFrame({
    "name": ["Wristwatch", "WirelessEarbuds", "GolfClubs", "Printer"],
    "quantity": [None, None, 779, 849],
    "price": [135, 821, 9319, 3051]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0, 0, 779, 849]  # Example case

# Test 2: No missing values
df = pd.DataFrame({
    "name": ["A", "B"],
    "quantity": [10, 20],
    "price": [100, 200]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [10, 20]  # No modification needed

# Test 3: All values missing
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "quantity": [None, None, None],
    "price": [10, 20, 30]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0, 0, 0]  # All replaced

# Test 4: Single row with missing value
df = pd.DataFrame({
    "name": ["Laptop"],
    "quantity": [None],
    "price": [999]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0]  # Single element case

# Test 5: Single row with valid quantity
df = pd.DataFrame({
    "name": ["Laptop"],
    "quantity": [15],
    "price": [999]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [15]  # Existing value preserved

Test	Why
Provided example	Verifies correctness against the official example
No missing values	Ensures valid quantities remain unchanged
All values missing	Confirms every missing value becomes `0`
Single row with missing value	Tests minimum-sized input with replacement
Single row with valid quantity	Ensures non-missing values are preserved

Edge Cases

One important edge case is when the quantity column contains no missing values. A careless implementation might accidentally alter valid entries or perform unnecessary transformations. Using fillna(0) safely preserves all non-missing values, so the DataFrame remains unchanged.

Another important case occurs when every quantity is missing. A manual implementation could fail if it assumes at least one valid number exists. The current implementation handles this naturally because fillna(0) independently replaces every missing value.

A final edge case is a DataFrame with only one row. Small inputs often expose indexing mistakes or assumptions about iteration. Since the solution operates on the whole column at once rather than relying on index ranges, single-row inputs work correctly without any special handling.
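A related case, not listed in the problem text quoted here, is a DataFrame with no rows at all. The sketch below assumes such an input is possible and builds the empty frame with explicit, assumed dtypes to check that the same column-wise solution passes it through without special handling.

```python
import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    products["quantity"] = products["quantity"].fillna(0)
    return products

# Hypothetical empty input: the three expected columns, zero rows.
empty = pd.DataFrame({
    "name": pd.Series(dtype="object"),
    "quantity": pd.Series(dtype="float64"),
    "price": pd.Series(dtype="int64"),
})

result = fillMissingValues(empty)

print(len(result))           # 0
print(list(result.columns))  # ['name', 'quantity', 'price']
```

Because fillna(0) operates on the whole column, there is no loop bound or first-element access that could fail on zero rows.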