LeetCode 2891 - Method Chaining


LeetCode Problem 2891

Difficulty: 🟢 Easy

Solution

Problem Understanding

This problem provides a Pandas DataFrame named animals with four columns:

| Column | Type |
| --- | --- |
| name | object |
| species | object |
| age | int |
| weight | int |

The task is to return a new DataFrame containing only the names of animals whose weight is strictly greater than 100.

In addition to filtering the rows, the result must also be sorted by the weight column in descending order. That means the heaviest qualifying animal should appear first.

The problem specifically emphasizes method chaining in Pandas. Method chaining means combining multiple DataFrame operations together into a single expression instead of storing intermediate results in temporary variables.

In practical terms, we need to:

  1. Filter rows where weight > 100
  2. Sort those rows by weight descending
  3. Return only the name column

The output must remain a Pandas DataFrame, not a Python list or Series.
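As a point of comparison, here is a minimal sketch of the same three steps written without chaining, using temporary variables. The function name `find_heavy_animals_stepwise`, the variable names, and the sample data are illustrative assumptions, not part of the problem statement.

```python
import pandas as pd

def find_heavy_animals_stepwise(animals: pd.DataFrame) -> pd.DataFrame:
    # Step 1: keep rows with weight strictly greater than 100.
    heavy = animals[animals["weight"] > 100]
    # Step 2: order the remaining rows from heaviest to lightest.
    ordered = heavy.sort_values(by="weight", ascending=False)
    # Step 3: keep only the name column, as a one-column DataFrame.
    return ordered[["name"]]

# Hypothetical sample data for illustration only.
sample = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "species": ["Cat", "Bear", "Horse"],
    "age": [3, 7, 5],
    "weight": [40, 250, 400],
})
print(find_heavy_animals_stepwise(sample))
```

Method chaining collapses these three assignments into a single expression that produces the same result.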

The input size is not explicitly stated, but this is an Easy Pandas problem and the operations involved are standard DataFrame transformations. Pandas handles filtering and sorting efficiently, so the solution is straightforward.

Several edge cases are important:

  • If no animals weigh more than 100, the returned DataFrame should be empty.
  • Animals with weight exactly equal to 100 should not be included because the condition is strictly greater than 100.
  • Multiple animals may have the same weight, which is acceptable because the problem only requires descending order.
  • The DataFrame may already be sorted or completely unsorted, so the algorithm must not assume any initial ordering.

Approaches

Brute Force Approach

A brute force solution would manually iterate through every row in the DataFrame, check whether the weight exceeds 100, store matching rows in a temporary structure, then sort the collected rows afterward.

Conceptually, the process would work like this:

  1. Traverse each row one by one.
  2. Check if weight > 100.
  3. If true, append the animal name and weight to a temporary list.
  4. Sort the temporary list by weight descending.
  5. Build the final DataFrame from the sorted results.

This approach is correct because every row is examined exactly once, and the final sorting guarantees the required ordering.

However, this solution is unnecessarily verbose and does not take advantage of Pandas' optimized vectorized operations. It also violates the spirit of the problem, which explicitly asks for method chaining in a single line.
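For reference, the brute force steps above could be sketched roughly as follows. The function name `find_heavy_animals_brute_force` is an assumption made for illustration, and the sketch assumes the DataFrame has `name` and `weight` columns as described in the problem.

```python
import pandas as pd

def find_heavy_animals_brute_force(animals: pd.DataFrame) -> pd.DataFrame:
    matches = []
    # Visit every row and collect (name, weight) pairs that qualify.
    for row in animals.itertuples(index=False):
        if row.weight > 100:
            matches.append((row.name, row.weight))
    # Sort the collected pairs by weight, heaviest first.
    matches.sort(key=lambda pair: pair[1], reverse=True)
    # Rebuild a single-column DataFrame from the sorted names.
    return pd.DataFrame({"name": [name for name, _ in matches]})
```

The row-by-row loop runs in the Python interpreter rather than in vectorized Pandas code, which is the main reason this style is discouraged.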

Optimal Approach

The optimal solution uses Pandas method chaining to combine filtering, sorting, and column selection into one concise expression.

The key insight is that most Pandas operations return a new DataFrame rather than modifying the original, so each transformation can be chained directly onto the result of the previous one.

The operations naturally compose in this order:

  1. Filter rows using boolean indexing.
  2. Sort the filtered DataFrame.
  3. Select only the name column.

This solution is both concise and efficient because Pandas internally performs these operations using optimized implementations.

| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute Force | O(n log n) | O(n) | Manual iteration and sorting |
| Optimal | O(n log n) | O(n) | Uses Pandas filtering and sorting with method chaining |
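One possible way to write the chain so that the input is named only once is to pass a callable to `.loc`. This is a variant sketch, not the form the problem prescribes; the function name `find_heavy_animals_loc` is an assumption for illustration.

```python
import pandas as pd

def find_heavy_animals_loc(animals: pd.DataFrame) -> pd.DataFrame:
    return (
        animals
        .loc[lambda df: df["weight"] > 100]         # filter with a callable mask
        .sort_values(by="weight", ascending=False)  # heaviest first
        .loc[:, ["name"]]                           # list selector keeps a DataFrame
    )
```

Because the callable receives the DataFrame at that point in the chain, this form keeps working if more steps are inserted before the filter.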

Algorithm Walkthrough

  1. Start with the input DataFrame animals.
  2. Build the filter condition using boolean indexing:

     animals["weight"] > 100

     This creates a boolean mask where rows satisfying the condition are marked True.
  3. Use the boolean mask to keep only qualifying rows:

     animals[animals["weight"] > 100]

     At this stage, only animals heavier than 100 kilograms remain.
  4. Sort the filtered DataFrame by the weight column in descending order:

     .sort_values(by="weight", ascending=False)

     This guarantees the heaviest animals appear first.
  5. Select only the name column:

     [["name"]]

     Double brackets are used so the result remains a DataFrame instead of becoming a Series.
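The difference between single and double brackets in the last step can be checked directly. The tiny DataFrame below is hypothetical sample data used only for this illustration.

```python
import pandas as pd

# Hypothetical two-row sample.
animals = pd.DataFrame({"name": ["Ann", "Bob"], "weight": [120, 300]})

as_series = animals["name"]    # single brackets select one column as a Series
as_frame = animals[["name"]]   # a list of labels selects a one-column DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
```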

Why it works

The algorithm works because each transformation preserves exactly the information needed for the next step. The filtering step guarantees only valid animals remain. The sorting step guarantees descending weight order. The final column selection guarantees the output format matches the problem specification. Since every requirement is enforced directly through a dedicated DataFrame operation, the final result is correct.

Python Solution

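import pandas as pd

class Solution:
    def findHeavyAnimals(self, animals: pd.DataFrame) -> pd.DataFrame:
        return (
            animals[animals["weight"] > 100]
            .sort_values(by="weight", ascending=False)
            [["name"]]
        )

Go Solution

package main

import "sort"

// Animal and findHeavyAnimals are illustrative names; LeetCode defines no Go signature for this problem.
type Animal struct {
	Name   string
	Weight int
}

func findHeavyAnimals(animals []Animal) []string {
	// Keep only animals strictly heavier than 100.
	filtered := make([]Animal, 0, len(animals))
	for _, animal := range animals {
		if animal.Weight > 100 {
			filtered = append(filtered, animal)
		}
	}

	// Sort the remaining animals by weight, heaviest first.
	sort.Slice(filtered, func(i, j int) bool {
		return filtered[i].Weight > filtered[j].Weight
	})

	// Collect the names in sorted order.
	result := make([]string, 0, len(filtered))
	for _, animal := range filtered {
		result = append(result, animal.Name)
	}

	return result
}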

The Go version does not use Pandas because Go's standard library has no DataFrame type comparable to Pandas.

Instead, the implementation uses a struct to represent each animal. The algorithm manually filters animals with Weight > 100, sorts them using sort.Slice, and then extracts the names into the final result slice.

Unlike Python with Pandas, Go requires explicit iteration and explicit slice allocation and appending, although memory itself is still managed by the garbage collector. The overall logic remains identical.

Worked Examples

Example 1

Input DataFrame:

| name | species | age | weight |
| --- | --- | --- | --- |
| Tatiana | Snake | 98 | 464 |
| Khaled | Giraffe | 50 | 41 |
| Alex | Leopard | 6 | 328 |
| Jonathan | Monkey | 45 | 463 |
| Stefan | Bear | 100 | 50 |
| Tommy | Panda | 26 | 349 |

Step 1: Filter weight > 100

| name | weight | Keep? |
| --- | --- | --- |
| Tatiana | 464 | Yes |
| Khaled | 41 | No |
| Alex | 328 | Yes |
| Jonathan | 463 | Yes |
| Stefan | 50 | No |
| Tommy | 349 | Yes |

Filtered DataFrame:

| name | weight |
| --- | --- |
| Tatiana | 464 |
| Alex | 328 |
| Jonathan | 463 |
| Tommy | 349 |

Step 2: Sort by weight descending

| name | weight |
| --- | --- |
| Tatiana | 464 |
| Jonathan | 463 |
| Tommy | 349 |
| Alex | 328 |

Step 3: Select only name

| name |
| --- |
| Tatiana |
| Jonathan |
| Tommy |
| Alex |

This matches the expected output.

Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(n log n) | Filtering is O(n), sorting dominates with O(n log n) |
| Space | O(n) | Pandas creates filtered and sorted DataFrames |

The filtering operation scans every row once, which costs O(n). The sorting operation requires O(n log n), which becomes the dominant factor. Additional space is used because Pandas returns new DataFrames instead of modifying the original in place.
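The claim that the input is left untouched can be checked with a small sketch. The sample data and the `snapshot` variable are assumptions introduced for this illustration.

```python
import pandas as pd

# Hypothetical sample data.
animals = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "weight": [40, 250, 400]})
snapshot = animals.copy()  # independent copy taken before the chain runs

result = (
    animals[animals["weight"] > 100]
    .sort_values(by="weight", ascending=False)
    [["name"]]
)

# The input still holds all of its rows and columns in the original order.
print(animals.equals(snapshot))  # True
```

The extra O(n) space is the price of this behavior: each step allocates a new object instead of mutating `animals`.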

Test Cases

import pandas as pd

solution = Solution()

# Example case
animals = pd.DataFrame({
    "name": ["Tatiana", "Khaled", "Alex", "Jonathan", "Stefan", "Tommy"],
    "species": ["Snake", "Giraffe", "Leopard", "Monkey", "Bear", "Panda"],
    "age": [98, 50, 6, 45, 100, 26],
    "weight": [464, 41, 328, 463, 50, 349]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == [
    "Tatiana",
    "Jonathan",
    "Tommy",
    "Alex"
]  # Standard example case

# No animals heavier than 100
animals = pd.DataFrame({
    "name": ["A", "B"],
    "species": ["Cat", "Dog"],
    "age": [2, 3],
    "weight": [50, 100]
})

result = solution.findHeavyAnimals(animals)

assert result.empty  # Should return empty DataFrame

# Exactly 100 should not qualify
animals = pd.DataFrame({
    "name": ["A", "B"],
    "species": ["Cat", "Dog"],
    "age": [1, 2],
    "weight": [100, 101]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == ["B"]  # Strict inequality test

# Multiple animals with same weight
animals = pd.DataFrame({
    "name": ["A", "B", "C"],
    "species": ["X", "Y", "Z"],
    "age": [1, 2, 3],
    "weight": [200, 200, 150]
})

result = solution.findHeavyAnimals(animals)

names = result["name"].tolist()
assert sorted(names[:2]) == ["A", "B"] and names[2] == "C"  # Equal weights; tie order is not guaranteed by the default sort

# Single qualifying animal
animals = pd.DataFrame({
    "name": ["Solo"],
    "species": ["Tiger"],
    "age": [5],
    "weight": [300]
})

result = solution.findHeavyAnimals(animals)

assert result["name"].tolist() == ["Solo"]  # Single row case

| Test | Why |
| --- | --- |
| Standard example | Validates normal filtering and sorting |
| No qualifying animals | Ensures empty DataFrame handling |
| Weight exactly 100 | Verifies strict inequality |
| Duplicate weights | Confirms sorting still behaves correctly |
| Single qualifying animal | Tests minimal valid output |

Edge Cases

One important edge case occurs when no animals weigh more than 100 kilograms. A naive implementation might accidentally return None, a list, or raise an error when processing an empty result. This implementation handles the case naturally because Pandas filtering simply returns an empty DataFrame.
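A short sketch of the empty case, using hypothetical sample data in which no weight exceeds 100:

```python
import pandas as pd

# Hypothetical sample: neither animal weighs strictly more than 100.
animals = pd.DataFrame({"name": ["A", "B"], "weight": [50, 100]})

result = (
    animals[animals["weight"] > 100]
    .sort_values(by="weight", ascending=False)
    [["name"]]
)

print(type(result).__name__)  # DataFrame
print(result.empty)           # True
print(list(result.columns))   # ['name']
```

The result has zero rows but still carries the name column, so callers can treat it like any other output of the function.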

Another important edge case involves animals with weight exactly equal to 100. The problem explicitly requires weights to be strictly greater than 100. Using >= 100 instead of > 100 would produce incorrect results. The implementation avoids this mistake by using the exact condition animals["weight"] > 100.

A third edge case occurs when multiple animals share the same weight. The problem only requires descending order by weight, so any arrangement of the tied animals is a valid answer. Note that sort_values defaults to kind="quicksort", which does not guarantee that ties keep their original relative order; pass kind="stable" if a deterministic tie order is needed.
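If a deterministic order among tied weights is wanted, one option is the `kind` parameter of `sort_values`. This is an optional refinement that the problem does not require, shown here on hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample with two animals tied at 200.
animals = pd.DataFrame({
    "name": ["A", "B", "C"],
    "weight": [200, 200, 150],
})

result = (
    animals[animals["weight"] > 100]
    # kind="stable" keeps tied rows in their original relative order.
    .sort_values(by="weight", ascending=False, kind="stable")
    [["name"]]
)

print(result["name"].tolist())  # ['A', 'B', 'C']
```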

A final edge case is a DataFrame containing only one row. Some implementations accidentally convert single row outputs into scalars or Series objects. Using [["name"]] instead of ["name"] guarantees the output always remains a DataFrame, even for one row or zero rows.