LeetCode 2890 - Reshape Data: Melt

This problem asks us to reshape a dataset from a wide format to a long format, also known as “melting” in data analysis. In the input DataFrame, each row represents a product, and each column beyond the first represents quarterly sales (quarter1 to quarter4).

LeetCode Problem 2890

Difficulty: 🟢 Easy
Topics:

Solution

Problem Understanding

This problem asks us to reshape a dataset from a wide format to a long format, also known as “melting” in data analysis. In the input DataFrame, each row represents a product, and each column beyond the first represents quarterly sales (quarter_1 to quarter_4). The goal is to transform this data so that each row contains the product name, the quarter, and the corresponding sales figure. Essentially, we are converting multiple quarter columns into a single column (quarter) while keeping the sales values in another column (sales).

The input guarantees a well-formed table with one product per row and four quarters. Important constraints are that every product has exactly four quarters of sales data, and there are no missing or null values. Edge cases that could trip up naive implementations include an empty DataFrame, a DataFrame with only one product, or the need to maintain the order of products and quarters in the output.

The expected output has n × 4 rows if the input has n products. Each product repeats four times, once for each quarter, and the sales column matches the original value.

Approaches

A brute-force approach is to iterate through each row and manually construct the output by looping over each quarter column. For every product, you create a new row for each quarter and append it to a result array. This works because it explicitly enumerates all required rows, but it is verbose and repetitive, especially for large datasets with many columns. Its time complexity is proportional to the number of products times the number of quarters.

The optimal approach leverages a built-in data reshaping function. In Python, pandas provides the melt function, which handles this exact transformation efficiently. In Go, we can mimic this behavior by iterating over each row and column systematically, but the insight remains the same: each product generates multiple rows corresponding to its quarter columns.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m) O(n * m) Manually construct each output row by looping over rows and columns
Optimal O(n * m) O(n * m) Use pandas melt or systematic iteration to reshape the table efficiently

Algorithm Walkthrough

  1. Initialize an empty result list to hold the reshaped rows. Each row will contain three fields: product, quarter, and sales.
  2. Iterate over each row in the input DataFrame. Extract the product name from the first column.
  3. For each quarter column (quarter_1 to quarter_4), read the corresponding sales value.
  4. Append a new row to the result list, containing the product name, the quarter name, and the sales value.
  5. After processing all rows and quarters, convert the result list into the expected DataFrame format with columns [product, quarter, sales].
  6. Return the reshaped DataFrame.

Why it works: This algorithm ensures that each original row produces exactly one row per quarter, preserving all sales data and correctly mapping quarter names to sales values. The invariant is that every (product, quarter) pair in the input has a corresponding row in the output.

Python Solution

import pandas as pd

class Solution:
    def reshapeSalesData(self, report: pd.DataFrame) -> pd.DataFrame:
        # Use pandas melt function to reshape the DataFrame from wide to long
        reshaped = report.melt(id_vars=['product'], 
                               value_vars=['quarter_1', 'quarter_2', 'quarter_3', 'quarter_4'],
                               var_name='quarter',
                               value_name='sales')
        return reshaped

The Python solution uses pandas melt, specifying id_vars to keep product as the identifier, value_vars to indicate which columns to melt into a single column, var_name to name the new column containing quarter names, and value_name for the sales values. This abstracts away manual iteration and ensures correctness for any number of products.

Go Solution

package main

type ReportRow struct {
	Product string
	Quarter string
	Sales   int
}

func reshapeSalesData(report [][]interface{}) []ReportRow {
	var result []ReportRow
	for _, row := range report {
		product := row[0].(string)
		for i := 1; i < len(row); i++ {
			quarter := "quarter_" + string('0'+i)
			sales := row[i].(int)
			result = append(result, ReportRow{
				Product: product,
				Quarter: quarter,
				Sales:   sales,
			})
		}
	}
	return result
}

In Go, we handle the input as a 2D slice of interfaces to mimic a DataFrame. We iterate over each row and each quarter column, building a slice of ReportRow structs. The logic mirrors the Python version but is more explicit due to Go's lack of a built-in melt function.

Worked Examples

For the input:

+-------------+-----------+-----------+-----------+-----------+
| product     | quarter_1 | quarter_2 | quarter_3 | quarter_4 |
+-------------+-----------+-----------+-----------+-----------+
| Umbrella    | 417       | 224       | 379       | 611       |
| SleepingBag | 800       | 936       | 93        | 875       |
+-------------+-----------+-----------+-----------+-----------+

Step-by-step reshaping:

Step product quarter sales result list after step
1 Umbrella quarter_1 417 [(Umbrella, quarter_1, 417)]
2 Umbrella quarter_2 224 [(Umbrella, quarter_1, 417), (Umbrella, quarter_2, 224)]
3 Umbrella quarter_3 379 [..., (Umbrella, quarter_3, 379)]
4 Umbrella quarter_4 611 [..., (Umbrella, quarter_4, 611)]
5 SleepingBag quarter_1 800 [..., (SleepingBag, quarter_1, 800)]
6 SleepingBag quarter_2 936 [..., (SleepingBag, quarter_2, 936)]
7 SleepingBag quarter_3 93 [..., (SleepingBag, quarter_3, 93)]
8 SleepingBag quarter_4 875 [..., (SleepingBag, quarter_4, 875)]

Final DataFrame contains 8 rows, exactly as expected.

Complexity Analysis

Measure Complexity Explanation
Time O(n * m) Each product row and each quarter column is visited once, n rows × m quarters
Space O(n * m) A new reshaped structure is created with one row per product-quarter pair

The time complexity is linear in the total number of cells. Space complexity is proportional to the output size, which must store all reshaped rows.

Test Cases

import pandas as pd

# Provided example
df1 = pd.DataFrame({
    "product": ["Umbrella", "SleepingBag"],
    "quarter_1": [417, 800],
    "quarter_2": [224, 936],
    "quarter_3": [379, 93],
    "quarter_4": [611, 875]
})
expected1 = pd.DataFrame({
    "product": ["Umbrella","SleepingBag","Umbrella","SleepingBag","Umbrella","SleepingBag","Umbrella","SleepingBag"],
    "quarter": ["quarter_1","quarter_1","quarter_2","quarter_2","quarter_3","quarter_3","quarter_4","quarter_4"],
    "sales": [417,800,224,936,379,93,611,875]
})
assert Solution().reshapeSalesData(df1).equals(expected1)  # typical case

# Single product
df2 = pd.DataFrame({"product": ["Tent"], "quarter_1": [100], "quarter_2": [200], "quarter_3": [300], "quarter_4": [400]})
expected2 = pd.DataFrame({"product": ["Tent"]*4, "quarter": ["quarter_1","quarter_2","quarter_3","quarter_4"], "sales": [100,200,300,400]})
assert Solution().reshapeSalesData(df2).equals(expected2)  # single product

# Empty DataFrame
df3 = pd.DataFrame({"product": [], "quarter_1": [], "quarter_2": [], "quarter_3": [], "quarter_4": []})
expected3 = pd.DataFrame({"product": [], "quarter": [], "sales": []})
assert Solution().reshapeSalesData(df3).equals(expected3)  # empty input

# All zeros
df4 = pd.DataFrame({"product": ["ItemA","ItemB"], "quarter_1": [0,0], "quarter_2": [0,0], "quarter_3": [0,0], "quarter_4": [0,0]})
expected4 = pd.DataFrame({"product": ["ItemA","ItemB"]*4, "quarter": ["quarter_1","quarter_1","quarter_2","quarter_2","quarter_3","quarter_3","quarter_4