LeetCode 2891 - Method Chaining

Difficulty: 🟢 Easy
Topics: —

Solution

Problem Understanding

This problem provides a Pandas DataFrame named animals with four columns:

Column	Type
`name`	object
`species`	object
`age`	int
`weight`	int

The task is to return a new DataFrame containing only the names of animals whose weight is strictly greater than 100.

In addition to filtering the rows, the result must also be sorted by the weight column in descending order. That means the heaviest qualifying animal should appear first.

The problem specifically emphasizes method chaining in Pandas. Method chaining means combining multiple DataFrame operations together into a single expression instead of storing intermediate results in temporary variables.

In practical terms, we need to:

Filter rows where weight > 100
Sort those rows by weight descending
Return only the name column

The output must remain a Pandas DataFrame, not a Python list or Series.

The input size is not explicitly stated, but this is an Easy Pandas problem and the operations involved are standard DataFrame transformations. Pandas handles filtering and sorting efficiently, so the solution is straightforward.

Several edge cases are important:

If no animals weigh more than 100, the returned DataFrame should be empty.
Animals with weight exactly equal to 100 should not be included because the condition is strictly greater than 100.
Multiple animals may have the same weight, which is acceptable because the problem only requires descending order.
The DataFrame may already be sorted or completely unsorted, so the algorithm must not assume any initial ordering.

Approaches

Brute Force Approach

A brute force solution would manually iterate through every row in the DataFrame, check whether the weight exceeds 100, store matching rows in a temporary structure, then sort the collected rows afterward.

Conceptually, the process would work like this:

Traverse each row one by one.
Check if weight > 100.
If true, append the animal name and weight to a temporary list.
Sort the temporary list by weight descending.
Build the final DataFrame from the sorted results.

This approach is correct because every row is examined exactly once, and the final sorting guarantees the required ordering.

However, this solution is unnecessarily verbose and does not take advantage of Pandas' optimized vectorized operations. It also violates the spirit of the problem, which explicitly asks for method chaining in a single line.

Optimal Approach

The optimal solution uses Pandas method chaining to combine filtering, sorting, and column selection into one concise expression.

The key insight is that Pandas operations return new DataFrames, allowing multiple transformations to be chained together sequentially.

The operations naturally compose in this order:

Filter rows using boolean indexing.
Sort the filtered DataFrame.
Select only the name column.

This solution is both concise and efficient because Pandas internally performs these operations using optimized implementations.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n log n)	O(n)	Manual iteration and sorting
Optimal	O(n log n)	O(n)	Uses Pandas filtering and sorting with method chaining

Algorithm Walkthrough

Start with the input DataFrame animals.
Apply a filter condition using boolean indexing:

animals["weight"] > 100

This creates a boolean mask where rows satisfying the condition are marked True. 3. Use the boolean mask to keep only qualifying rows:

animals[animals["weight"] > 100]

At this stage, only animals heavier than 100 kilograms remain. 4. Sort the filtered DataFrame by the weight column in descending order:

.sort_values(by="weight", ascending=False)

This guarantees the heaviest animals appear first. 5. Select only the name column:

[["name"]]

Double brackets are used so the result remains a DataFrame instead of becoming a Series.

Why it works

The algorithm works because each transformation preserves exactly the information needed for the next step. The filtering step guarantees only valid animals remain. The sorting step guarantees descending weight order. The final column selection guarantees the output format matches the problem specification. Since every requirement is enforced directly through a dedicated DataFrame operation, the final result is correct.

Python Solution

import pandas as pd

class Solution:
    def findHeavyAnimals(self, animals: pd.DataFrame) -> pd.DataFrame:
        return (
            animals[animals["weight"] > 100]
	}

	sort.Slice(filtered, func(i, j int) bool {
		return filtered[i].Weight > filtered[j].Weight
	})

	result := make([]string, 0, len(filtered))

	for _, animal := range filtered {
		result = append(result, animal.Name)
	}

	return result
}

The Go version does not use Pandas because Go does not provide a built in DataFrame library comparable to Pandas.

Instead, the implementation uses a struct to represent each animal. The algorithm manually filters animals with Weight > 100, sorts them using sort.Slice, and then extracts the names into the final result slice.

Unlike Python Pandas, Go requires explicit iteration and manual memory management for slices. However, the overall logic remains identical.

Worked Examples

Example 1

Input DataFrame:

name	species	age	weight
Tatiana	Snake	98	464
Khaled	Giraffe	50	41
Alex	Leopard	6	328
Jonathan	Monkey	45	463
Stefan	Bear	100	50
Tommy	Panda	26	349

Step 1: Filter `weight > 100`

name	weight	Keep?
Tatiana	464	Yes
Khaled	41	No
Alex	328	Yes
Jonathan	463	Yes
Stefan	50	No
Tommy	349	Yes

Filtered DataFrame:

name	weight
Tatiana	464
Alex	328
Jonathan	463
Tommy	349

Step 2: Sort by weight descending

name	weight
Tatiana	464
Jonathan	463
Tommy	349
Alex	328

Step 3: Select only `name`

name
Tatiana
Jonathan
Tommy
Alex

This matches the expected output.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Filtering is O(n), sorting dominates with O(n log n)
Space	O(n)	Pandas creates filtered and sorted DataFrames

The filtering operation scans every row once, which costs O(n). The sorting operation requires O(n log n), which becomes the dominant factor. Additional space is used because Pandas returns new DataFrames instead of modifying the original in place.

Test Cases

import pandas as pd

solution = Solution()

# Example case
animals = pd.DataFrame({
    "name": ["Tatiana", "Khaled", "Alex", "Jonathan", "Stefan", "Tommy"],
    "species": ["Snake", "Giraffe", "Leopard", "Monkey", "Bear", "Panda"],
    "age": [98, 50, 6, 45, 100, 26],
    "weight": [464, 41, 328, 463, 50, 349]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == [
    "Tatiana",
    "Jonathan",
    "Tommy",
    "Alex"
]  # Standard example case

# No animals heavier than 100
animals = pd.DataFrame({
    "name": ["A", "B"],
    "species": ["Cat", "Dog"],
    "age": [2, 3],
    "weight": [50, 100]
})

result = solution.findHeavyAnimals(animals)

assert result.empty  # Should return empty DataFrame

# Exactly 100 should not qualify
animals = pd.DataFrame({
    "name": ["A", "B"],
    "species": ["Cat", "Dog"],
    "age": [1, 2],
    "weight": [100, 101]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == ["B"]  # Strict inequality test

# Multiple animals with same weight
animals = pd.DataFrame({
    "name": ["A", "B", "C"],
    "species": ["X", "Y", "Z"],
    "age": [1, 2, 3],
    "weight": [200, 200, 150]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == ["A", "B", "C"]  # Equal weights

# Single qualifying animal
animals = pd.DataFrame({
    "name": ["Solo"],
    "species": ["Tiger"],
    "age": [5],
    "weight": [300]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == ["Solo"]  # Single row case

Test	Why
Standard example	Validates normal filtering and sorting
No qualifying animals	Ensures empty DataFrame handling
Weight exactly 100	Verifies strict inequality
Duplicate weights	Confirms sorting still behaves correctly
Single qualifying animal	Tests minimal valid output

Edge Cases

One important edge case occurs when no animals weigh more than 100 kilograms. A naive implementation might accidentally return None, a list, or raise an error when processing an empty result. This implementation handles the case naturally because Pandas filtering simply returns an empty DataFrame.

Another important edge case involves animals with weight exactly equal to 100. The problem explicitly requires weights to be strictly greater than 100. Using >= 100 instead of > 100 would produce incorrect results. The implementation avoids this mistake by using the exact condition animals["weight"] > 100.

A third edge case occurs when multiple animals share the same weight. Some sorting implementations may behave inconsistently or unexpectedly when duplicate values exist. Pandas sorting handles duplicate weights correctly and still maintains a valid descending order.

A final edge case is a DataFrame containing only one row. Some implementations accidentally convert single row outputs into scalars or Series objects. Using [["name"]] instead of ["name"] guarantees the output always remains a DataFrame, even for one row or zero rows.