LeetCode 2889 - Reshape Data: Pivot
Difficulty: 🟢 Easy
Solution
Problem Understanding
This problem gives us a Pandas DataFrame named weather with three columns:
- city, the name of a city
- month, the name of a month
- temperature, the recorded temperature for that city during that month
The task is to reshape the table using a pivot operation. Instead of storing each (city, month, temperature) combination as a separate row, we want:
- each row to represent a single month
- each column to represent a city
- each cell to contain the temperature for that city and month
In other words, we are transforming the data from a "long" format into a "wide" format.
For example, the original table may look like this:
| city | month | temperature |
|---|---|---|
| Jacksonville | January | 13 |
| ElPaso | January | 20 |
After pivoting, the result becomes:
| month | ElPaso | Jacksonville |
|---|---|---|
| January | 20 | 13 |
The month values become the row index, the city values become column names, and the temperature values fill the table cells.
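As a quick illustration of this mapping, here is a minimal sketch that rebuilds the two-row example above and checks where each value ends up. The variable names are illustrative only.

```python
import pandas as pd

# The two-row example from above, in long format.
weather = pd.DataFrame({
    "city": ["Jacksonville", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [13, 20],
})

wide = weather.pivot(index="month", columns="city", values="temperature")

print(wide.index.tolist())             # row labels come from the month column
print(wide.columns.tolist())           # column labels come from the city column
print(wide.loc["January", "ElPaso"])   # each cell holds a temperature
```

Note that the city columns come out in sorted label order, which is why ElPaso precedes Jacksonville.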
The problem guarantees that the input data is valid for pivoting: there is at most one temperature value for each (month, city) pair. If duplicates existed, pivot() would raise a ValueError, because Pandas would not know which value to place into the cell.
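To see what goes wrong with duplicates, the sketch below builds a hypothetical invalid input (not part of the problem's test data) with two readings for the same pair. It also shows pivot_table() as one possible fallback; averaging the duplicates is an arbitrary choice made for this illustration, not something the problem asks for.

```python
import pandas as pd

# Hypothetical invalid input: two readings for the same (month, city) pair.
duplicated = pd.DataFrame({
    "city": ["ElPaso", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [20, 25],
})

try:
    duplicated.pivot(index="month", columns="city", values="temperature")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f"pivot refused the input: {exc}")

# pivot_table aggregates duplicates instead of failing.
# The mean is an arbitrary aggregation chosen for this sketch.
averaged = duplicated.pivot_table(
    index="month", columns="city", values="temperature", aggfunc="mean"
).reset_index()
print(averaged)
```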
The input size is small because this is an Easy difficulty Pandas problem. The focus is not on algorithmic optimization, but on understanding how to correctly use DataFrame reshaping operations.
An important detail is that the output format shown in the example keeps month as a regular column rather than as the DataFrame index. Since pivot() uses the column passed as index for the row labels of the result, we must reset the index afterward.
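The difference is easy to inspect directly. This small sketch, using the same two-row sample, compares the frame before and after reset_index().

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville", "ElPaso"],
    "month": ["January", "January"],
    "temperature": [13, 20],
})

pivoted = weather.pivot(index="month", columns="city", values="temperature")
print(pivoted.index.name)            # month is the index name here
print("month" in pivoted.columns)    # and it is not among the columns

flat = pivoted.reset_index()
print(flat.columns.tolist())         # month is now an ordinary column
print(flat.index.tolist())           # replaced by a default integer index
```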
Potential edge cases include:
- only one city in the dataset
- only one month in the dataset
- unordered input rows
- missing combinations of (month, city)
Pandas handles missing combinations automatically by inserting NaN values in the resulting table.
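A minimal sketch of the sparse case, using a hypothetical three-row input in which city "B" has no reading for "Jan". One side effect worth knowing: a column that receives NaN is stored as floating point, even if the input temperatures were integers.

```python
import pandas as pd

# Hypothetical sparse input: there is no (Jan, B) row.
sparse = pd.DataFrame({
    "city": ["A", "A", "B"],
    "month": ["Jan", "Feb", "Feb"],
    "temperature": [3, 8, 5],
})

result = sparse.pivot(
    index="month", columns="city", values="temperature"
).reset_index()

jan_row = result[result["month"] == "Jan"].iloc[0]
print(pd.isna(jan_row["B"]))   # the absent (Jan, B) cell is NaN
print(result["B"].dtype)       # the column is float so that it can hold NaN
```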
Approaches
Brute Force Approach
A brute force solution would manually construct the pivot table.
We could first collect all unique months and all unique cities. Then, for every month, we would iterate through all rows of the DataFrame and search for matching city records. Whenever we find a row whose month matches the current month, we place its temperature into the appropriate city column.
This approach works because it explicitly rebuilds the target table cell by cell. However, it is unnecessarily inefficient because it repeatedly scans the entire dataset for every month.
The main issue is that repeated filtering or scanning creates redundant work.
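For comparison, the brute force idea could be sketched as follows. The function name pivot_brute_force and the choice to sort months and cities (to mimic the ordering pivot() produces) are assumptions made for this illustration.

```python
import pandas as pd

def pivot_brute_force(weather: pd.DataFrame) -> pd.DataFrame:
    """Rebuild the wide table cell by cell (illustrative, not efficient)."""
    months = sorted(weather["month"].unique())
    cities = sorted(weather["city"].unique())
    rows = []
    for month in months:
        row = {"month": month}
        for city in cities:
            row[city] = float("nan")  # default for absent (month, city) pairs
        # Full scan of the input for every month: this is the redundant work.
        for record in weather.itertuples(index=False):
            if record.month == month:
                row[record.city] = record.temperature
        rows.append(row)
    return pd.DataFrame(rows, columns=["month", *cities])

weather = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6],
})
brute = pivot_brute_force(weather)
print(brute)
```

The inner scan over all n rows runs once per unique month, which is where the O(m × n) cost comes from.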
Optimal Approach
The optimal solution uses Pandas' built-in pivot() operation.
The key insight is that the problem is exactly a standard pivot-table transformation:
- rows should be indexed by month
- columns should be grouped by city
- values should come from temperature
Pandas already provides a highly optimized function for this transformation. Instead of manually reconstructing the table, we can directly express the reshaping operation declaratively.
After pivoting, we call reset_index() so that month becomes a normal column again.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(m × n) | O(m × c) | Repeatedly scans rows for each month |
| Optimal | O(n) | O(m × c) | Uses Pandas' built-in pivot operation |
Here:
- n is the number of rows
- m is the number of unique months
- c is the number of unique cities
Algorithm Walkthrough
1. Start with the original weather DataFrame containing city, month, and temperature.
2. Call the pivot() method on the DataFrame. We use:
   - index="month" because each row should represent a month
   - columns="city" because each city should become its own column
   - values="temperature" because temperatures are the values stored inside the table
3. The result of pivot() is a DataFrame where:
   - the index is the month
   - the columns are city names
   - the cells contain temperatures
4. Call reset_index(). This step converts the month index back into a normal DataFrame column so the output matches the required format.
5. Return the transformed DataFrame.
Why it works
The pivot operation guarantees that each unique (month, city) pair maps to exactly one cell in the output table. Since the problem guarantees valid input data, every temperature value is placed into the correct position. The transformation preserves all original information while reorganizing the layout into the required wide-table format.
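One way to check the claim that no information is lost is a round trip: pivot to wide format, then use melt() to go back to long format and compare the rows. This is only a sanity-check sketch on a small complete sample; with missing combinations the wide table would contain extra NaN cells that the original did not have.

```python
import pandas as pd

weather = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6],
})

wide = weather.pivot(
    index="month", columns="city", values="temperature"
).reset_index()

# melt is the inverse reshape: one row per (month, city) pair again.
long_again = wide.melt(id_vars="month", var_name="city", value_name="temperature")

# Compare as sorted lists of plain tuples so that row order does not matter.
columns = ["city", "month", "temperature"]
original_rows = sorted(weather[columns].itertuples(index=False, name=None))
restored_rows = sorted(long_again[columns].itertuples(index=False, name=None))
print(original_rows == restored_rows)
```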
Python Solution
import pandas as pd

class Solution:
    def pivotTable(self, weather: pd.DataFrame) -> pd.DataFrame:
        return weather.pivot(
            index="month",
            columns="city",
            values="temperature"
        ).reset_index()
The implementation is very compact because Pandas already provides the exact transformation we need.
The pivot() function performs the reshaping operation:
- index="month" groups rows by month
- columns="city" creates one column per city
- values="temperature" fills the table cells with temperatures
After pivoting, month becomes the DataFrame index. The problem expects it to remain a normal column, so we call reset_index() before returning the result.
This solution directly mirrors the conceptual transformation described in the algorithm walkthrough.
Go Solution
package main

import "fmt"

type Record struct {
	City        string
	Month       string
	Temperature int
}

func pivotTable(weather []Record) []map[string]interface{} {
	// One inner map per month; each inner map holds "month" plus city columns.
	monthMap := make(map[string]map[string]interface{})
	for _, record := range weather {
		if _, exists := monthMap[record.Month]; !exists {
			monthMap[record.Month] = map[string]interface{}{
				"month": record.Month,
			}
		}
		monthMap[record.Month][record.City] = record.Temperature
	}
	// Map iteration order is not deterministic, so row order may vary.
	result := make([]map[string]interface{}, 0)
	for _, row := range monthMap {
		result = append(result, row)
	}
	return result
}

func main() {
	weather := []Record{
		{"Jacksonville", "January", 13},
		{"ElPaso", "January", 20},
	}
	fmt.Println(pivotTable(weather))
}
Unlike Python with Pandas, Go does not provide a built-in DataFrame pivot operation. Therefore, we manually construct the reshaped table using nested maps.
The outer map groups rows by month, while the inner map stores city temperature pairs for that month.
One implementation detail is that Go maps are unordered, so the order of the returned rows may differ between runs unless additional sorting logic is added. The Pandas version, by contrast, returns rows sorted by month label and columns sorted by city label.
Worked Examples
Example 1
Input:
| city | month | temperature |
|---|---|---|
| Jacksonville | January | 13 |
| Jacksonville | February | 23 |
| Jacksonville | March | 38 |
| Jacksonville | April | 5 |
| Jacksonville | May | 34 |
| ElPaso | January | 20 |
| ElPaso | February | 6 |
| ElPaso | March | 26 |
| ElPaso | April | 2 |
| ElPaso | May | 43 |
Step 1, Original DataFrame
| city | month | temperature |
|---|---|---|
| Jacksonville | January | 13 |
| Jacksonville | February | 23 |
| ... | ... | ... |
Step 2, Apply Pivot
We execute:
weather.pivot(
    index="month",
    columns="city",
    values="temperature"
)
Pandas internally groups rows by month and creates city columns.
Intermediate structure:
| month | ElPaso | Jacksonville |
|---|---|---|
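| April | 2 | 5 |
| February | 6 | 23 |
| January | 20 | 13 |
| March | 26 | 38 |
| May | 43 | 34 |
At this point, month is the DataFrame index. Its labels are sorted alphabetically rather than in calendar order, which is why April comes first.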
Step 3, Reset Index
After calling:
.reset_index()
Final result:
| month | ElPaso | Jacksonville |
|---|---|---|
| April | 2 | 5 |
| February | 6 | 23 |
| January | 20 | 13 |
| March | 26 | 38 |
| May | 43 | 34 |
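The worked example can be reproduced with a short script. This is a sketch that rebuilds the ten input rows from the table above and prints the pivoted result.

```python
import pandas as pd

# The ten rows from Example 1.
weather = pd.DataFrame({
    "city": ["Jacksonville"] * 5 + ["ElPaso"] * 5,
    "month": ["January", "February", "March", "April", "May"] * 2,
    "temperature": [13, 23, 38, 5, 34, 20, 6, 26, 2, 43],
})

result = weather.pivot(
    index="month", columns="city", values="temperature"
).reset_index()

print(result.to_string(index=False))
```

The months appear in alphabetical order because pivot() sorts the index labels as plain strings; it has no notion of calendar order.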
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each row is processed once during pivot construction |
| Space | O(m × c) | The pivot table stores one cell per month-city combination |
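The pivot operation processes each input row once to determine its destination cell in the reshaped table. Pandas also sorts the unique month and city labels, which adds a smaller term that is negligible when there are far fewer distinct labels than rows. The resulting DataFrame requires storage proportional to the number of unique months multiplied by the number of unique cities.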
Test Cases
import pandas as pd

solution = Solution()

# Basic example from the prompt
weather1 = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6]
})
result1 = solution.pivotTable(weather1)
assert "Jacksonville" in result1.columns  # city column created
assert "ElPaso" in result1.columns  # another city column created

# Single city
weather2 = pd.DataFrame({
    "city": ["A", "A"],
    "month": ["Jan", "Feb"],
    "temperature": [1, 2]
})
result2 = solution.pivotTable(weather2)
assert result2.shape == (2, 2)  # month + one city column

# Single month
weather3 = pd.DataFrame({
    "city": ["A", "B"],
    "month": ["Jan", "Jan"],
    "temperature": [5, 10]
})
result3 = solution.pivotTable(weather3)
assert result3.shape == (1, 3)  # one row, two city columns

# Unordered input rows
weather4 = pd.DataFrame({
    "city": ["B", "A"],
    "month": ["Feb", "Jan"],
    "temperature": [8, 3]
})
result4 = solution.pivotTable(weather4)
assert set(result4["month"]) == {"Jan", "Feb"}  # months preserved
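
# Missing city-month combinations: there is no (Jan, B) or (Feb, A) row
weather5 = pd.DataFrame({
    "city": ["A", "B"],
    "month": ["Jan", "Feb"],
    "temperature": [100, 50]
})
result5 = solution.pivotTable(weather5)
assert result5.loc[result5["month"] == "Jan", "A"].iloc[0] == 100  # value placed correctly
assert result5["B"].isna().sum() == 1  # absent (Jan, B) cell becomes NaN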
Test Case Summary
| Test | Why |
|---|---|
| Basic example | Validates standard pivot transformation |
| Single city | Ensures one-column pivot works correctly |
| Single month | Ensures one-row pivot works correctly |
| Unordered rows | Confirms input order does not matter |
| Missing combinations | Verifies sparse data is handled properly |
Edge Cases
Single City Input
If the dataset contains only one city, the pivoted result should still work correctly and produce exactly one city column. A buggy implementation might incorrectly assume multiple city columns exist or mishandle dimensions during reshaping. Using Pandas pivot() naturally handles this case without additional logic.
Single Month Input
When all temperature records belong to the same month, the output should contain only one row. Some manual implementations may accidentally duplicate rows or fail to group correctly. The pivot operation guarantees that all cities for the same month are merged into a single row.
Missing Month-City Combinations
Not every city must necessarily appear for every month. In such cases, the pivot table contains missing values (NaN) for absent combinations. A naive manual solution might crash or incorrectly initialize missing cells. Pandas handles missing entries automatically and preserves the table structure correctly.
Unordered Input Data
The input rows may appear in arbitrary order. A manual implementation that depends on sorted input could produce incorrect grouping behavior. The pivot operation groups by keys rather than relying on row ordering, so the implementation remains correct regardless of input arrangement.
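A quick way to convince yourself of this is to pivot the same rows in two different orders and compare the results. The helper name pivot_weather is an assumption for this sketch; it simply wraps the same pivot() and reset_index() calls used in the solution.

```python
import pandas as pd

def pivot_weather(df: pd.DataFrame) -> pd.DataFrame:
    # Same two calls as the solution, wrapped for reuse in this sketch.
    return df.pivot(index="month", columns="city", values="temperature").reset_index()

weather = pd.DataFrame({
    "city": ["Jacksonville", "Jacksonville", "ElPaso", "ElPaso"],
    "month": ["January", "February", "January", "February"],
    "temperature": [13, 23, 20, 6],
})
reversed_rows = weather.iloc[::-1]  # same rows, opposite order

same_result = pivot_weather(weather).equals(pivot_weather(reversed_rows))
print(same_result)
```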