LeetCode 2887 - Fill Missing Data

The problem provides a Pandas DataFrame named products with three columns: name (object), quantity (int), and price (int). The task is to replace all missing values in the quantity column with 0.

Difficulty: 🟢 Easy

Solution

Problem Understanding

The problem provides a Pandas DataFrame named products with three columns:

| Column | Type |
| --- | --- |
| name | object |
| quantity | int |
| price | int |

The task is to replace all missing values in the quantity column with 0.

In other words, some products may not have a recorded quantity, represented as None (or equivalently NaN in a Pandas DataFrame). Whenever this happens, we must substitute that missing value with 0, while leaving all existing valid quantities unchanged.
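To make the None versus NaN point concrete, here is a minimal sketch on a hypothetical two-row frame (the rows are borrowed from the example further down, not from any official test data). It assumes the column mixes None with ordinary integers, in which case pandas stores it as float64 and represents the gap as NaN.

```python
import pandas as pd

# Hypothetical two-row frame: one missing quantity, one valid quantity.
products = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, 9319],
})

# A column mixing None with integers is stored as float64, so None becomes NaN.
quantity_dtype = str(products["quantity"].dtype)

# isna() flags the missing entry whether it started out as None or NaN.
missing_mask = products["quantity"].isna().tolist()

print(quantity_dtype)
print(missing_mask)
```

Because isna() treats both representations the same way, the rest of the solution does not need to distinguish between them.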

The input is a Pandas DataFrame where each row represents a product and contains its name, quantity, and price. The output should be the same DataFrame structure, except that every missing value in the quantity column has been filled with 0.

The problem is classified as Easy because it mainly tests familiarity with Pandas operations for handling missing data rather than requiring an advanced algorithmic technique.

An important observation is that only the quantity column should be modified. Other columns, such as name and price, must remain untouched. Additionally, rows with valid quantities should not be changed.

Several edge cases are worth considering. The DataFrame might contain no missing values at all, in which case the output should remain unchanged. The quantity column might consist entirely of missing values, requiring every row to become 0. The DataFrame could also contain only a single row, which should still be processed correctly.

Approaches

Brute Force Approach

A straightforward approach is to iterate through every row in the DataFrame and manually inspect the quantity column. For each row, we check whether the quantity is missing. If it is, we explicitly assign 0 to that row.

This approach works because every row is examined exactly once, ensuring that all missing values are detected and replaced. However, it is unnecessarily verbose and less efficient in practice because Pandas already provides optimized vectorized operations for this exact task.

For example, we could loop over indices and write conditional logic:

for index in products.index:
    if pd.isna(products.loc[index, "quantity"]):
        products.loc[index, "quantity"] = 0

Although correct, this solution is not idiomatic Pandas code.

Optimal Approach

The key observation is that Pandas includes a built-in method called fillna() that replaces every missing value in a column in a single call.

Instead of iterating row by row, we can directly target the quantity column and replace every missing value with 0 in one vectorized operation:

products["quantity"] = products["quantity"].fillna(0)

This approach is cleaner, easier to read, and runs as a single vectorized operation rather than a Python-level loop.

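| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute Force | O(n) | O(1) | Iterates through every row in a Python-level loop and updates missing values in place |
| Optimal | O(n) | O(n) | Uses Pandas fillna() for vectorized replacement; fillna() returns a new column that is assigned back |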
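As a quick sanity check that the two approaches are interchangeable, the sketch below runs both on the same sample data and compares the results. The helper names fill_with_loop, fill_vectorized, and make_sample are hypothetical and exist only for this illustration.

```python
import pandas as pd

def fill_with_loop(products: pd.DataFrame) -> pd.DataFrame:
    # Row-by-row version: inspect each quantity and overwrite missing ones.
    for index in products.index:
        if pd.isna(products.loc[index, "quantity"]):
            products.loc[index, "quantity"] = 0
    return products

def fill_vectorized(products: pd.DataFrame) -> pd.DataFrame:
    # Vectorized version: one fillna() call on the whole column.
    products["quantity"] = products["quantity"].fillna(0)
    return products

def make_sample() -> pd.DataFrame:
    # Fresh copy of the sample data so each approach starts from the same input.
    return pd.DataFrame({
        "name": ["Wristwatch", "WirelessEarbuds", "GolfClubs", "Printer"],
        "quantity": [None, None, 779, 849],
        "price": [135, 821, 9319, 3051],
    })

loop_result = fill_with_loop(make_sample())
vectorized_result = fill_vectorized(make_sample())

# DataFrame.equals compares both values and dtypes.
same = loop_result.equals(vectorized_result)
print(same)
```

Both versions produce the same frame; the difference lies in readability and in how much work happens inside the Python interpreter.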

Algorithm Walkthrough

  1. Access the quantity column from the products DataFrame because this is the only column that requires modification.
  2. Use the fillna(0) method on the column. This method scans through the column and replaces every missing value (None or NaN) with 0.
  3. Assign the modified column back to products["quantity"] so the DataFrame reflects the updates.
  4. Return the updated products DataFrame.

Why it works

The algorithm works because fillna(0) guarantees that every missing value in the selected column is replaced with 0, while preserving all non-missing values exactly as they are. Since we only apply the operation to the quantity column, the rest of the DataFrame remains unchanged.
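The isolation claim can be checked with a small sketch. The frame below is hypothetical: a missing price is added on purpose, even though the original problem only describes gaps in quantity, to show that column-level fillna(0) does not touch other columns.

```python
import pandas as pd

# Hypothetical frame: the missing price is an assumption made for this demo only.
products = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs", "Printer"],
    "quantity": [None, 779, 849],
    "price": [135, None, 3051],
})

# Fill only the quantity column.
products["quantity"] = products["quantity"].fillna(0)

quantity_after = products["quantity"].tolist()
price_still_missing = products["price"].isna().tolist()

print(quantity_after)
print(price_still_missing)
```

The valid quantities 779 and 849 survive unchanged, and the gap in price stays a gap because fillna() was never called on that column.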

Python Solution

import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    products["quantity"] = products["quantity"].fillna(0)
    return products

The implementation directly follows the optimal algorithm. First, we select the quantity column using products["quantity"]. Then, we call fillna(0) to replace missing values with 0. The updated column is reassigned to the DataFrame, ensuring the modification persists. Finally, the modified DataFrame is returned.

This solution is concise and leverages Pandas vectorized operations, which are preferred over manual iteration for performance and readability.
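One design detail worth noting: the function above assigns into the DataFrame it receives, so the caller's object is modified as well. If that side effect is unwanted, a possible variant uses DataFrame.assign, which returns a new frame. The name fill_missing_values_copy is hypothetical and is not part of the LeetCode template.

```python
import pandas as pd

def fill_missing_values_copy(products: pd.DataFrame) -> pd.DataFrame:
    # assign() builds and returns a new DataFrame; the input frame is left as-is.
    return products.assign(quantity=products["quantity"].fillna(0))

# Hypothetical input used only to show the difference in side effects.
original = pd.DataFrame({
    "name": ["Wristwatch", "GolfClubs"],
    "quantity": [None, 779],
    "price": [135, 9319],
})

result = fill_missing_values_copy(original)

print(result["quantity"].tolist())
print(original["quantity"].isna().tolist())
```

For the LeetCode judge either form should be acceptable, since only the returned DataFrame is inspected; the non-mutating variant mainly matters when the input is reused elsewhere.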

Go Solution

LeetCode Pandas problems are Python-specific and do not have a Go submission environment. However, if we conceptually translate the same logic into Go, the implementation would involve iterating through records and replacing missing quantities with 0.

package main

type Product struct {
	Name     string
	Quantity *int
	Price    int
}

func fillMissingValues(products []Product) []Product {
	for i := range products {
		if products[i].Quantity == nil {
			zero := 0
			products[i].Quantity = &zero
		}
	}

	return products
}

The main Go-specific difference is that Go does not have built-in DataFrame support like Pandas. Missing values are typically represented using pointers, where nil indicates absence. Instead of fillna(), we manually iterate through the slice and replace nil quantities with a pointer to 0.

Worked Examples

Example 1

Input DataFrame:

| name | quantity | price |
| --- | --- | --- |
| Wristwatch | None | 135 |
| WirelessEarbuds | None | 821 |
| GolfClubs | 779 | 9319 |
| Printer | 849 | 3051 |

Step-by-Step Execution

We process the quantity column using:

products["quantity"] = products["quantity"].fillna(0)

State of the quantity column:

| Row | Original Value | After fillna(0) |
| --- | --- | --- |
| Wristwatch | None | 0 |
| WirelessEarbuds | None | 0 |
| GolfClubs | 779 | 779 |
| Printer | 849 | 849 |

Final output:

| name | quantity | price |
| --- | --- | --- |
| Wristwatch | 0 | 135 |
| WirelessEarbuds | 0 | 821 |
| GolfClubs | 779 | 9319 |
| Printer | 849 | 3051 |
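The table above shows whole numbers, but when this example is built locally from a Python list containing None, pandas stores the quantity column as float64, and fillna(0) keeps that dtype. The sketch below shows one way to get an integer column if that is wanted; whether the conversion is needed depends on how the input was constructed, so treat it as an optional step rather than part of the required solution.

```python
import pandas as pd

products = pd.DataFrame({
    "name": ["Wristwatch", "WirelessEarbuds", "GolfClubs", "Printer"],
    "quantity": [None, None, 779, 849],
    "price": [135, 821, 9319, 3051],
})

# After filling, the column still has the float64 dtype it was created with.
filled = products["quantity"].fillna(0)

# Once no NaN remains, the column can be cast to a plain integer dtype.
as_int = filled.astype("int64")

print(filled.dtype)
print(as_int.tolist())
```

The cast is only safe after the fill, because a NumPy integer column cannot hold NaN.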

Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(n) | Pandas scans the quantity column once |
| Space | O(n) | fillna(0) returns a new column before it is assigned back |

The time complexity is O(n) because each row in the quantity column must be inspected once to determine whether it contains a missing value. The space complexity is O(n) because fillna(0), called without inplace=True, builds a new column of the same length, which is then assigned back to the DataFrame. No other auxiliary data structure is needed.

Test Cases

import pandas as pd

def fillMissingValues(products: pd.DataFrame) -> pd.DataFrame:
    products["quantity"] = products["quantity"].fillna(0)
    return products

# Test 1: Provided example
df = pd.DataFrame({
    "name": ["Wristwatch", "WirelessEarbuds", "GolfClubs", "Printer"],
    "quantity": [None, None, 779, 849],
    "price": [135, 821, 9319, 3051]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0, 0, 779, 849]  # Example case

# Test 2: No missing values
df = pd.DataFrame({
    "name": ["A", "B"],
    "quantity": [10, 20],
    "price": [100, 200]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [10, 20]  # No modification needed

# Test 3: All values missing
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "quantity": [None, None, None],
    "price": [10, 20, 30]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0, 0, 0]  # All replaced

# Test 4: Single row with missing value
df = pd.DataFrame({
    "name": ["Laptop"],
    "quantity": [None],
    "price": [999]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [0]  # Single element case

# Test 5: Single row with valid quantity
df = pd.DataFrame({
    "name": ["Laptop"],
    "quantity": [15],
    "price": [999]
})

result = fillMissingValues(df)

assert result["quantity"].tolist() == [15]  # Existing value preserved

| Test | Why |
| --- | --- |
| Provided example | Verifies correctness against the official example |
| No missing values | Ensures valid quantities remain unchanged |
| All values missing | Confirms every missing value becomes 0 |
| Single row with missing value | Tests minimum-sized input with replacement |
| Single row with valid quantity | Ensures non-missing values are preserved |

Edge Cases

One important edge case is when the quantity column contains no missing values. A careless implementation might accidentally alter valid entries or perform unnecessary transformations. Using fillna(0) safely preserves all non-missing values, so the DataFrame remains unchanged.

Another important case occurs when every quantity is missing. A manual implementation could fail if it assumes at least one valid number exists. The current implementation handles this naturally because fillna(0) independently replaces every missing value.

A final edge case is a DataFrame with only one row. Small inputs often expose indexing mistakes or assumptions about iteration. Since the solution operates directly on the column rather than relying on index ranges, single-row inputs work correctly without any special handling.