LeetCode 2889 - Reshape Data: Pivot

Difficulty: 🟢 Easy

Solution

Problem Understanding

This problem gives us a Pandas DataFrame named weather with three columns:

  • city, the name of a city
  • month, the name of a month
  • temperature, the recorded temperature for that city during that month

The task is to reshape the table using a pivot operation. Instead of storing each (city, month, temperature) combination as a separate row, we want:

  • each row to represent a single month
  • each column to represent a city
  • each cell to contain the temperature for that city and month

In other words, we are transforming the data from a "long" format into a "wide" format.

For example, the original table may look like this:

city         | month   | temperature
-------------|---------|------------
Jacksonville | January | 13
ElPaso       | January | 20

After pivoting, the result becomes:

month   | ElPaso | Jacksonville
--------|--------|-------------
January | 20     | 13

The month values become the row index, the city values become column names, and the temperature values fill the table cells.

The problem guarantees that the input data is valid for pivoting. That means there is at most one temperature value for each (month, city) pair. If duplicates existed, pivot() would raise a ValueError, because Pandas would not know which value to place into the cell.
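To see why the uniqueness guarantee matters, here is a minimal sketch that deliberately breaks it. The two-row input is a hypothetical example, not part of the problem's test data.

```python
import pandas as pd

# Hypothetical input that violates the guarantee:
# two different temperatures for the same (January, ElPaso) pair.
weather = pd.DataFrame({
    "city": ["ElPaso", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [20, 25],
})

try:
    weather.pivot(index="month", columns="city", values="temperature")
    raised = False
except ValueError as err:
    # pandas refuses to choose between the two candidate values.
    raised = True
    print(err)
```

Since the problem rules this situation out, the solution itself does not need any duplicate handling.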

Because this is an Easy Pandas problem, input size is not a practical concern. The focus is not on algorithmic optimization, but on understanding how to correctly use DataFrame reshaping operations.

An important detail is that the output format shown in the example keeps month as a regular column rather than as the DataFrame index. Since pivot() typically places the pivot key into the index, we must reset the index afterward.

Potential edge cases include:

  • only one city in the dataset
  • only one month in the dataset
  • unordered input rows
  • missing combinations of (month, city)

Pandas handles missing combinations automatically by inserting NaN values in the resulting table.
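The following sketch illustrates the missing-combination case with a small made-up dataset (cities "A" and "B" are placeholders). It also shows a side effect worth knowing: a column that receives a NaN is stored as floating point.

```python
import pandas as pd

# City "B" has no record for "Feb", so that cell has nothing to fill it.
weather = pd.DataFrame({
    "city": ["A", "A", "B"],
    "month": ["Jan", "Feb", "Jan"],
    "temperature": [100, 90, 80],
})

wide = weather.pivot(
    index="month", columns="city", values="temperature"
).reset_index()

feb_row = wide[wide["month"] == "Feb"].iloc[0]
missing = pd.isna(feb_row["B"])   # True: the absent pair became NaN
b_dtype = str(wide["B"].dtype)    # NaN forces the column to a float dtype
print(wide)
```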

Approaches

Brute Force Approach

A brute force solution would manually construct the pivot table.

We could first collect all unique months and all unique cities. Then, for every month, we would iterate through all rows of the DataFrame and search for matching city records. Whenever we find a row whose month matches the current month, we place its temperature into the appropriate city column.

This approach works because it explicitly rebuilds the target table cell by cell. However, it is unnecessarily inefficient because it repeatedly scans the entire dataset for every month.

The main issue is that repeated filtering or scanning creates redundant work.
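As a rough sketch of this brute force idea, the function below works on a plain list of (city, month, temperature) tuples instead of a DataFrame. The function name and the tuple representation are assumptions made for illustration only.

```python
def brute_force_pivot(rows):
    """Pivot a list of (city, month, temperature) tuples by rescanning it per month."""
    months = sorted({month for _, month, _ in rows})
    cities = sorted({city for city, _, _ in rows})

    table = []
    for month in months:
        # Start every city cell as None, then rescan all rows for this month.
        row = {"month": month, **{city: None for city in cities}}
        for city, row_month, temperature in rows:
            if row_month == month:
                row[city] = temperature
        table.append(row)
    return table


sample = [("Jacksonville", "January", 13), ("ElPaso", "January", 20)]
print(brute_force_pivot(sample))
```

The inner loop visits every input row once per month, which is where the redundant work comes from.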

Optimal Approach

The optimal solution uses Pandas' built-in pivot() operation.

The key insight is that the problem is exactly a standard pivot-table transformation:

  • rows should be indexed by month
  • columns should be grouped by city
  • values should come from temperature

Pandas already provides a highly optimized function for this transformation. Instead of manually reconstructing the table, we can directly express the reshaping operation declaratively.

After pivoting, we call reset_index() so that month becomes a normal column again.

Approach Comparison

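Approach    | Time Complexity | Space Complexity | Notes
------------|-----------------|------------------|------
Brute Force | O(m × n)        | O(m × c)         | Repeatedly scans rows for each month
Optimal     | O(n)            | O(m × c)         | Uses Pandas' built-in pivot operation; sorting the unique labels adds a small extra cost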

Here:

  • n is the number of rows
  • m is the number of unique months
  • c is the number of unique cities

Algorithm Walkthrough

  1. Start with the original weather DataFrame containing city, month, and temperature.
  2. Call the pivot() method on the DataFrame, using:
     • index="month" because each row should represent a month
     • columns="city" because each city should become its own column
     • values="temperature" because temperatures are the values stored inside the table
  3. The result of pivot() is a DataFrame where:
     • the index is the month
     • the columns are city names
     • the cells contain temperatures
  4. Call reset_index(). This step converts the month index back into a normal DataFrame column so the output matches the required format.
  5. Return the transformed DataFrame.
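The walkthrough can be traced step by step with the two-row example from the problem description. This is only an illustrative sketch; the variable names are chosen here and are not part of the required solution.

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [13, 20],
})

# Steps 2-3: month becomes the index, city names become the column labels.
pivoted = weather.pivot(index="month", columns="city", values="temperature")
index_name = pivoted.index.name
city_columns = list(pivoted.columns)

# Step 4: month is turned back into an ordinary column.
result = pivoted.reset_index()
result_columns = list(result.columns)

print(index_name, city_columns, result_columns)
```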

Why it works

The pivot operation guarantees that each unique (month, city) pair maps to exactly one cell in the output table. Since the problem guarantees valid input data, every temperature value is placed into the correct position. The transformation preserves all original information while reorganizing the layout into the required wide-table format.

Python Solution

import pandas as pd

class Solution:
    def pivotTable(self, weather: pd.DataFrame) -> pd.DataFrame:
        return weather.pivot(
            index="month",
            columns="city",
            values="temperature"
        ).reset_index()

The implementation is very compact because Pandas already provides the exact transformation we need.

The pivot() function performs the reshaping operation:

  • index="month" groups rows by month
  • columns="city" creates one column per city
  • values="temperature" fills the table cells with temperatures

After pivoting, month becomes the DataFrame index. The problem expects it to remain a normal column, so we call reset_index() before returning the result.

This solution directly mirrors the conceptual transformation described in the algorithm walkthrough.

Go Solution

package main

import "fmt"

type Record struct {
	City        string
	Month       string
	Temperature int
}

func pivotTable(weather []Record) []map[string]interface{} {
	monthMap := make(map[string]map[string]interface{})

	for _, record := range weather {
		if _, exists := monthMap[record.Month]; !exists {
			monthMap[record.Month] = map[string]interface{}{
				"month": record.Month,
			}
		}

		monthMap[record.Month][record.City] = record.Temperature
	}

	result := make([]map[string]interface{}, 0)

	for _, row := range monthMap {
		result = append(result, row)
	}

	return result
}

func main() {
	weather := []Record{
		{"Jacksonville", "January", 13},
		{"ElPaso", "January", 20},
	}

	fmt.Println(pivotTable(weather))
}

Unlike Pandas, Go's standard library has no DataFrame type or pivot operation, so we construct the reshaped table manually using nested maps.

The outer map groups rows by month, while the inner map stores city-temperature pairs for that month.

One implementation detail is that Go does not guarantee map iteration order, so the rows of the result may come back in a different order on each run unless the month keys are sorted first. Pandas' pivot(), by contrast, returns rows sorted by month label.

Worked Examples

Example 1

Input:

city         | month    | temperature
-------------|----------|------------
Jacksonville | January  | 13
Jacksonville | February | 23
Jacksonville | March    | 38
Jacksonville | April    | 5
Jacksonville | May      | 34
ElPaso       | January  | 20
ElPaso       | February | 6
ElPaso       | March    | 26
ElPaso       | April    | 2
ElPaso       | May      | 43

Step 1, Original DataFrame

city         | month    | temperature
-------------|----------|------------
Jacksonville | January  | 13
Jacksonville | February | 23
...          | ...      | ...

Step 2, Apply Pivot

We execute:

weather.pivot(
    index="month",
    columns="city",
    values="temperature"
)

Pandas internally groups rows by month and creates city columns.

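Intermediate structure (pivot() sorts the index labels, so the months already appear in alphabetical rather than calendar order):

month    | ElPaso | Jacksonville
---------|--------|-------------
April    | 2      | 5
February | 6      | 23
January  | 20     | 13
March    | 26     | 38
May      | 43     | 34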

At this point, month is the DataFrame index.

Step 3, Reset Index

After calling:

.reset_index()

Final result:

month    | ElPaso | Jacksonville
---------|--------|-------------
April    | 2      | 5
February | 6      | 23
January  | 20     | 13
March    | 26     | 38
May      | 43     | 34
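The final table lists months alphabetically rather than in calendar order, because pivot() sorts the index labels. The sketch below reproduces Example 1 so the ordering and a sample row can be checked directly; the variable names are illustrative.

```python
import pandas as pd

months = ["January", "February", "March", "April", "May"]
weather = pd.DataFrame({
    "city": ["Jacksonville"] * 5 + ["ElPaso"] * 5,
    "month": months * 2,
    "temperature": [13, 23, 38, 5, 34, 20, 6, 26, 2, 43],
})

result = weather.pivot(
    index="month", columns="city", values="temperature"
).reset_index()

month_order = result["month"].tolist()            # alphabetical, not calendar order
april = result[result["month"] == "April"].iloc[0]
print(result)
```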

Complexity Analysis

Measure | Complexity | Explanation
--------|------------|------------
Time    | O(n)       | Each row is processed once during pivot construction
Space   | O(m × c)   | The pivot table stores one cell per month-city combination

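The pivot operation processes each input row once to determine its destination cell in the reshaped table. Pandas also sorts the unique month and city labels, which adds a cost that is negligible when m and c are small relative to n. The resulting DataFrame requires storage proportional to the number of unique months multiplied by the number of unique cities.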

Test Cases

import pandas as pd

solution = Solution()

# Basic example from the prompt
weather1 = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6]
})

result1 = solution.pivotTable(weather1)

assert "Jacksonville" in result1.columns  # city column created
assert "ElPaso" in result1.columns        # another city column created

# Single city
weather2 = pd.DataFrame({
    "city": ["A", "A"],
    "month": ["Jan", "Feb"],
    "temperature": [1, 2]
})

result2 = solution.pivotTable(weather2)

assert result2.shape == (2, 2)  # month + one city column

# Single month
weather3 = pd.DataFrame({
    "city": ["A", "B"],
    "month": ["Jan", "Jan"],
    "temperature": [5, 10]
})

result3 = solution.pivotTable(weather3)

assert result3.shape == (1, 3)  # one row, two city columns

# Unordered input rows
weather4 = pd.DataFrame({
    "city": ["B", "A"],
    "month": ["Feb", "Jan"],
    "temperature": [8, 3]
})

result4 = solution.pivotTable(weather4)

assert set(result4["month"]) == {"Jan", "Feb"}  # months preserved

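# Missing city-month combinations: city "B" has no record for "Feb"
weather5 = pd.DataFrame({
    "city": ["A", "A", "B"],
    "month": ["Jan", "Feb", "Jan"],
    "temperature": [100, 90, 80]
})

result5 = solution.pivotTable(weather5)

feb = result5[result5["month"] == "Feb"].iloc[0]
assert feb["A"] == 90     # existing value placed correctly
assert pd.isna(feb["B"])  # absent (Feb, B) pair becomes NaN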

Test Case Summary

Test                 | Why
---------------------|----
Basic example        | Validates standard pivot transformation
Single city          | Ensures one-column pivot works correctly
Single month         | Ensures one-row pivot works correctly
Unordered rows       | Confirms input order does not matter
Missing combinations | Verifies sparse data is handled properly

Edge Cases

Single City Input

If the dataset contains only one city, the pivoted result should still work correctly and produce exactly one city column. A buggy implementation might incorrectly assume multiple city columns exist or mishandle dimensions during reshaping. Using Pandas pivot() naturally handles this case without additional logic.

Single Month Input

When all temperature records belong to the same month, the output should contain only one row. Some manual implementations may accidentally duplicate rows or fail to group correctly. The pivot operation guarantees that all cities for the same month are merged into a single row.

Missing Month-City Combinations

Not every city must necessarily appear for every month. In such cases, the pivot table contains missing values (NaN) for absent combinations, and because NaN is a floating-point value, an integer temperature column that receives one is stored as float64. A naive manual solution might crash or incorrectly initialize missing cells. Pandas handles missing entries automatically and preserves the table structure correctly.

Unordered Input Data

The input rows may appear in arbitrary order. A manual implementation that depends on sorted input could produce incorrect grouping behavior. The pivot operation groups by keys rather than relying on row ordering, so the implementation remains correct regardless of input arrangement.
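One way to check the order-independence claim is to pivot the same records in two different row orders and compare the results. The helper name pivot_weather and the sample data are assumptions made for this sketch.

```python
import pandas as pd


def pivot_weather(weather):
    # Same transformation as the solution, wrapped for reuse in this check.
    return weather.pivot(
        index="month", columns="city", values="temperature"
    ).reset_index()


ordered = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "month": ["Feb", "Jan", "Feb", "Jan"],
    "temperature": [1, 2, 3, 4],
})
# Reverse the rows to simulate an arbitrary input order.
shuffled = ordered.iloc[::-1].reset_index(drop=True)

same = pivot_weather(ordered).equals(pivot_weather(shuffled))
print(same)
```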