LeetCode 2889 - Reshape Data: Pivot

Difficulty: 🟢 Easy

Solution

Problem Understanding

This problem gives us a Pandas DataFrame named weather with three columns:

city, the name of a city
month, the name of a month
temperature, the recorded temperature for that city during that month

The task is to reshape the table using a pivot operation. Instead of storing each (city, month, temperature) combination as a separate row, we want:

each row to represent a single month
each column to represent a city
each cell to contain the temperature for that city and month

In other words, we are transforming the data from a "long" format into a "wide" format.

For example, the original table may look like this:

city	month	temperature
Jacksonville	January	13
ElPaso	January	20

After pivoting, the result becomes:

month	ElPaso	Jacksonville
January	20	13

The month values become the row index, the city values become column names, and the temperature values fill the table cells.

The problem guarantees that the input data is valid for pivoting. That means there is at most one temperature value for each (month, city) pair. If duplicates existed, pivot() would raise a ValueError because Pandas would not know which value to place into the cell.
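To make the duplicate-key failure concrete, here is a minimal sketch. The duplicated rows and the try_pivot helper are invented for illustration and are not part of the problem's input.

```python
import pandas as pd

# Hypothetical input that violates the guarantee: two January rows for ElPaso.
weather_dup = pd.DataFrame({
    "city": ["ElPaso", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [20, 25],
})


def try_pivot(df: pd.DataFrame) -> str:
    """Return 'ok' if pivot succeeds, otherwise the name of the exception raised."""
    try:
        df.pivot(index="month", columns="city", values="temperature")
        return "ok"
    except ValueError as exc:
        return type(exc).__name__


print(try_pivot(weather_dup))  # ValueError
```

Since the problem rules this situation out, the solution itself does not need any duplicate handling.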

Because this is an Easy-difficulty Pandas problem, the focus is not on algorithmic optimization but on using DataFrame reshaping operations correctly.

An important detail is that the output format shown in the example keeps month as a regular column rather than as the DataFrame index. Since pivot() typically places the pivot key into the index, we must reset the index afterward.

Potential edge cases include:

only one city in the dataset
only one month in the dataset
unordered input rows
missing combinations of (month, city)

Pandas handles missing combinations automatically by inserting NaN values in the resulting table.
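The following sketch shows the NaN behavior on a small made-up dataset in which ElPaso has no February record. The data is an assumption chosen for illustration only.

```python
import pandas as pd

# Illustrative data: there is no (February, ElPaso) row.
sparse = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso"],
    "month": ["January", "February", "January"],
    "temperature": [13, 23, 20],
})

wide = sparse.pivot(index="month", columns="city", values="temperature")

# The absent combination shows up as a missing value in the wide table.
print(pd.isna(wide.loc["February", "ElPaso"]))  # True

# Combinations that do exist keep their values.
print(wide.loc["January", "ElPaso"] == 20)  # True
```

Because NaN is a floating-point value, integer temperatures are usually upcast to float when a gap is present.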

Approaches

Brute Force Approach

A brute force solution would manually construct the pivot table.

We could first collect all unique months and all unique cities. Then, for every month, we would iterate through all rows of the DataFrame and search for matching city records. Whenever we find a row whose month matches the current month, we place its temperature into the appropriate city column.

This approach works because it explicitly rebuilds the target table cell by cell. However, it does redundant work: the entire dataset is scanned again for every month.
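The brute force idea described above could be sketched as follows. The function name pivot_brute_force is hypothetical, and sorting the months and cities is an assumption made so the output is deterministic.

```python
import pandas as pd


def pivot_brute_force(weather: pd.DataFrame) -> pd.DataFrame:
    """Rebuild the wide table cell by cell (illustrative helper only)."""
    months = sorted(weather["month"].unique())
    cities = sorted(weather["city"].unique())

    rows = []
    for month in months:
        # Start every city cell as missing, then fill in what we find.
        row = {"month": month, **{city: float("nan") for city in cities}}
        # Full scan of the input for each month: this is the redundant work.
        for record in weather.itertuples(index=False):
            if record.month == month:
                row[record.city] = record.temperature
        rows.append(row)

    return pd.DataFrame(rows, columns=["month", *cities])
```

The nested loop makes the O(m × n) cost visible: the inner scan over all n rows runs once per unique month.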

Optimal Approach

The optimal solution uses Pandas' built-in pivot() operation.

The key insight is that the problem is exactly a standard pivot-table transformation:

rows should be indexed by month
columns should be grouped by city
values should come from temperature

Pandas already provides a highly optimized function for this transformation. Instead of manually reconstructing the table, we can directly express the reshaping operation declaratively.

After pivoting, we call reset_index() so that month becomes a normal column again.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(m × n)	O(m × c)	Repeatedly scans rows for each month
Optimal	O(n)	O(m × c)	Uses the built-in Pandas pivot operation

Here:

n is the number of rows
m is the number of unique months
c is the number of unique cities

Algorithm Walkthrough

Start with the original weather DataFrame containing city, month, and temperature.
Call the pivot() method on the DataFrame.

We use:

index="month" because each row should represent a month
columns="city" because each city should become its own column
values="temperature" because temperatures are the values stored inside the table

The result of pivot() creates a DataFrame where:

the index is the month
the columns are city names
the cells contain temperatures

Call reset_index().

This step converts the month index back into a normal DataFrame column so the output matches the required format.

Return the transformed DataFrame.

Why it works

The pivot operation guarantees that each unique (month, city) pair maps to exactly one cell in the output table. Since the problem guarantees valid input data, every temperature value is placed into the correct position. The transformation preserves all original information while reorganizing the layout into the required wide-table format.
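One way to check the claim that no information is lost is to reverse the pivot with melt() and compare the rows with the original. This is only a sanity-check sketch on a small sample, not part of the required solution.

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6],
})

wide = weather.pivot(index="month", columns="city", values="temperature").reset_index()

# Undo the pivot: every city column goes back into (city, temperature) rows.
long_again = wide.melt(id_vars="month", var_name="city", value_name="temperature")

# Compare the two tables as sorted lists of rows, ignoring row order.
cols = ["city", "month", "temperature"]
original_rows = sorted(weather[cols].values.tolist())
recovered_rows = sorted(long_again[cols].values.tolist())
print(original_rows == recovered_rows)  # True
```

The round trip is exact here because every (month, city) pair is present. With missing combinations, melt() would also return the NaN cells as extra rows.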

Python Solution

import pandas as pd

class Solution:
    def pivotTable(self, weather: pd.DataFrame) -> pd.DataFrame:
        return weather.pivot(
            index="month",
            columns="city",
            values="temperature"
        ).reset_index()

The implementation is very compact because Pandas already provides the exact transformation we need.

The pivot() function performs the reshaping operation:

index="month" groups rows by month
columns="city" creates one column per city
values="temperature" fills the table cells with temperatures

After pivoting, month becomes the DataFrame index. The problem expects it to remain a normal column, so we call reset_index() before returning the result.

This solution directly mirrors the conceptual transformation described in the algorithm walkthrough.
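A small side effect worth knowing: after pivot() and reset_index(), the columns axis still carries the name "city". The sketch below shows this and one optional way to clear it. Whether the judge cares about this label is not stated here, so treat the cleanup as optional.

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [13, 20],
})

result = weather.pivot(index="month", columns="city", values="temperature").reset_index()

# The columns axis keeps the label of the column that was pivoted.
print(result.columns.name)  # city

# Optional cleanup: drop the axis label without touching the data.
clean = result.rename_axis(None, axis=1)
print(clean.columns.name)  # None
```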

Go Solution

package main

import "fmt"

type Record struct {
	City        string
	Month       string
	Temperature int
}

func pivotTable(weather []Record) []map[string]interface{} {
	monthMap := make(map[string]map[string]interface{})

	for _, record := range weather {
		if _, exists := monthMap[record.Month]; !exists {
			monthMap[record.Month] = map[string]interface{}{
				"month": record.Month,
			}
		}

		monthMap[record.Month][record.City] = record.Temperature
	}

	result := make([]map[string]interface{}, 0)

	for _, row := range monthMap {
		result = append(result, row)
	}

	return result
}

func main() {
	weather := []Record{
		{"Jacksonville", "January", 13},
		{"ElPaso", "January", 20},
	}

	fmt.Println(pivotTable(weather))
}

Unlike Python Pandas, Go does not provide a built-in DataFrame pivot operation. Therefore, we manually construct the reshaped table using nested maps.

The outer map groups rows by month, while the inner map stores city temperature pairs for that month.

One implementation detail is that Go does not specify the iteration order of maps, so the order of the returned rows may differ between runs unless additional sorting logic is added. Pandas pivot(), by contrast, returns the months and city columns in sorted label order.

Worked Examples

Example 1

Input:

city	month	temperature
Jacksonville	January	13
Jacksonville	February	23
Jacksonville	March	38
Jacksonville	April	5
Jacksonville	May	34
ElPaso	January	20
ElPaso	February	6
ElPaso	March	26
ElPaso	April	2
ElPaso	May	43

Step 1, Original DataFrame

city	month	temperature
Jacksonville	January	13
Jacksonville	February	23
...	...	...

Step 2, Apply Pivot

We execute:

weather.pivot(
    index="month",
    columns="city",
    values="temperature"
)

Pandas internally groups rows by month and creates city columns. Both the month labels and the city names come back in sorted (alphabetical) order.

Intermediate structure:

month	ElPaso	Jacksonville
April	2	5
February	6	23
January	20	13
March	26	38
May	43	34

At this point, month is the DataFrame index.
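The intermediate state can be inspected directly. This sketch rebuilds the Example 1 input and checks where month lives before reset_index() is called.

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville"] * 5 + ["ElPaso"] * 5,
    "month": ["January", "February", "March", "April", "May"] * 2,
    "temperature": [13, 23, 38, 5, 34, 20, 6, 26, 2, 43],
})

pivoted = weather.pivot(index="month", columns="city", values="temperature")

print(pivoted.index.name)          # month
print(list(pivoted.index))         # ['April', 'February', 'January', 'March', 'May']
print(list(pivoted.columns))       # ['ElPaso', 'Jacksonville']
print("month" in pivoted.columns)  # False
```

The index labels are already sorted at this stage, which is why the final result lists April first.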

Step 3, Reset Index

After calling:

.reset_index()

Final result:

month	ElPaso	Jacksonville
April	2	5
February	6	23
January	20	13
March	26	38
May	43	34

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each row is processed once during pivot construction
Space	O(m × c)	The pivot table stores one cell per month-city combination

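The pivot operation processes each input row once to determine its destination cell in the reshaped table. The resulting DataFrame requires storage proportional to the number of unique months multiplied by the number of unique cities. Note that O(n) is a simplified estimate: allocating the m × c output and sorting the unique labels also take time, which matters mainly when many (month, city) combinations are absent.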

Test Cases

import pandas as pd

solution = Solution()

# Basic example (a subset of the data from the prompt)
weather1 = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6]
})

result1 = solution.pivotTable(weather1)

assert "Jacksonville" in result1.columns  # city column created
assert "ElPaso" in result1.columns        # another city column created

# Single city
weather2 = pd.DataFrame({
    "city": ["A", "A"],
    "month": ["Jan", "Feb"],
    "temperature": [1, 2]
})

result2 = solution.pivotTable(weather2)

assert result2.shape == (2, 2)  # month + one city column

# Single month
weather3 = pd.DataFrame({
    "city": ["A", "B"],
    "month": ["Jan", "Jan"],
    "temperature": [5, 10]
})

result3 = solution.pivotTable(weather3)

assert result3.shape == (1, 3)  # one row, two city columns

# Unordered input rows
weather4 = pd.DataFrame({
    "city": ["B", "A"],
    "month": ["Feb", "Jan"],
    "temperature": [8, 3]
})

result4 = solution.pivotTable(weather4)

assert set(result4["month"]) == {"Jan", "Feb"}  # months preserved

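# Missing city-month combinations: city B has no Feb record
weather5 = pd.DataFrame({
    "city": ["A", "A", "B"],
    "month": ["Jan", "Feb", "Jan"],
    "temperature": [100, 50, 75]
})

result5 = solution.pivotTable(weather5)

feb = result5[result5["month"] == "Feb"].iloc[0]
assert feb["A"] == 50     # existing value placed correctly
assert pd.isna(feb["B"])  # absent (Feb, B) pair becomes NaN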

Test Case Summary

Test	Why
Basic example	Validates standard pivot transformation
Single city	Ensures one-column pivot works correctly
Single month	Ensures one-row pivot works correctly
Unordered rows	Confirms input order does not matter
Missing combinations	Verifies sparse data is handled properly

Edge Cases

Single City Input

If the dataset contains only one city, the pivoted result should still work correctly and produce exactly one city column. A buggy implementation might incorrectly assume multiple city columns exist or mishandle dimensions during reshaping. Using Pandas pivot() naturally handles this case without additional logic.

Single Month Input

When all temperature records belong to the same month, the output should contain only one row. Some manual implementations may accidentally duplicate rows or fail to group correctly. The pivot operation guarantees that all cities for the same month are merged into a single row.

Missing Month-City Combinations

Not every city must necessarily appear for every month. In such cases, the pivot table contains missing values (NaN) for absent combinations. A naive manual solution might crash or incorrectly initialize missing cells. Pandas handles missing entries automatically and preserves the table structure correctly.

Unordered Input Data

The input rows may appear in arbitrary order. A manual implementation that depends on sorted input could produce incorrect grouping behavior. The pivot operation groups by keys rather than relying on row ordering, so the implementation remains correct regardless of input arrangement.
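A quick way to see that row order does not matter is to pivot the same data in two different orders and compare the results. The pivot_weather helper below is a hypothetical stand-in for the solution method.

```python
import pandas as pd


def pivot_weather(df: pd.DataFrame) -> pd.DataFrame:
    """Same transformation as the solution, as a plain function for this demo."""
    return df.pivot(index="month", columns="city", values="temperature").reset_index()


weather = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6],
})

# Reverse the row order to simulate differently ordered input.
reordered = weather.iloc[::-1]

print(pivot_weather(weather).equals(pivot_weather(reordered)))  # True
```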