LeetCode 2877 - Create a DataFrame from List
Difficulty: 🟢 Easy
Solution
Problem Understanding
This problem asks us to create a Pandas DataFrame from a given two-dimensional list named student_data.
Each element of student_data is itself a list containing exactly two values:
- A student ID
- The student's age
For example:
[
[1, 15],
[2, 11],
[3, 11],
[4, 20]
]
represents four students. The first student has ID 1 and age 15, the second student has ID 2 and age 11, and so on.
The goal is to convert this raw list structure into a Pandas DataFrame with exactly two columns:
student_id
age
The rows must appear in the same order as they do in the original input list.
The output is therefore a tabular representation of the same data, where each inner list becomes a row and the column names are explicitly assigned.
Since this is a Pandas problem rather than a traditional algorithmic problem, there are no significant computational constraints. The task is primarily about understanding how to construct a DataFrame and assign column names correctly.
An important guarantee is that every row contains exactly two values corresponding to the required columns. Because of this guarantee, we do not need to validate row lengths or perform any error handling.
Potential edge cases include an empty input list, a single student record, or multiple students sharing the same age. All of these cases are naturally handled by DataFrame construction.
Approaches
Brute Force Approach
A manual approach would be to iterate through every row of student_data, extract the student ID and age, store them in separate collections, and then build a DataFrame from those collections.
For example, we could create two lists:
student_ids = [1, 2, 3, 4]
ages = [15, 11, 11, 20]
and then construct the DataFrame from a dictionary mapping column names to these lists.
This approach works because every row is processed exactly once and every value is copied into the appropriate column.
However, it performs unnecessary work. The input data is already organized in a structure that Pandas can directly convert into a DataFrame. Extracting values into separate lists only adds extra code and memory usage.
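The manual approach described above could be sketched roughly as follows. The function name create_dataframe_manual is hypothetical and used only for illustration; it is not part of the problem's required interface.

```python
import pandas as pd
from typing import List


def create_dataframe_manual(student_data: List[List[int]]) -> pd.DataFrame:
    # Hypothetical helper: splits each row into two separate column lists.
    student_ids = []
    ages = []
    for student_id, age in student_data:
        student_ids.append(student_id)
        ages.append(age)
    # Build the DataFrame from a dict mapping column names to the lists.
    return pd.DataFrame({"student_id": student_ids, "age": ages})


manual = create_dataframe_manual([[1, 15], [2, 11], [3, 11], [4, 20]])
```

The loop and the two intermediate lists are exactly the extra code and memory that the direct constructor call avoids.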
Optimal Approach
The key observation is that Pandas provides a built-in constructor that can directly convert a two-dimensional list into a DataFrame.
By passing the original data along with the desired column names:
pd.DataFrame(student_data, columns=["student_id", "age"])
Pandas automatically creates the correct table structure.
This is simpler because no manual processing of the rows is required. Both approaches are linear in the number of rows, but this one avoids building intermediate lists.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(n) | Copies values into separate lists before creating the DataFrame |
| Optimal | O(n) | O(n) | Directly constructs the DataFrame from the original 2D list |
Here, n represents the number of student records.
Algorithm Walkthrough
Optimal Algorithm
- Receive the input 2D list student_data.
- Call the Pandas DataFrame constructor using student_data as the row data.
- Specify the column names as ["student_id", "age"].
- Return the resulting DataFrame.
Why it works
Every inner list in student_data contains exactly two values. When Pandas receives the two-dimensional list and the corresponding two column names, it maps the first value of each row to student_id and the second value to age. Since rows are processed in order, the resulting DataFrame preserves the original ordering of the input data.
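As a small illustrative check of this positional mapping (a sketch, not part of the required solution), the columns can be compared against the values taken from each position of the inner lists. The variable names below are chosen only for this example.

```python
import pandas as pd

student_data = [[1, 15], [2, 11], [3, 11], [4, 20]]
df = pd.DataFrame(student_data, columns=["student_id", "age"])

# Each column should collect the values found at one position of every inner list.
first_values = [row[0] for row in student_data]
second_values = [row[1] for row in student_data]

ids_match = df["student_id"].tolist() == first_values
ages_match = df["age"].tolist() == second_values

# The default index runs 0..n-1 in input order, so row order is preserved.
index_in_order = df.index.tolist() == list(range(len(student_data)))
```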
Python Solution
import pandas as pd
from typing import List


def createDataframe(student_data: List[List[int]]) -> pd.DataFrame:
    return pd.DataFrame(student_data, columns=["student_id", "age"])
The implementation directly follows the algorithm described above.
The input list is passed to the pd.DataFrame constructor. The columns parameter assigns the required column names in the correct order. Pandas then creates one DataFrame row for each inner list in student_data.
Because the data is already structured appropriately, no iteration or preprocessing is necessary.
Go Solution
LeetCode 2877 is a Pandas specific problem. The platform provides a Python environment and expects a Pandas DataFrame as the return value. As a result, there is no official Go version of this problem because Go does not have a Pandas DataFrame equivalent in the LeetCode environment.
For completeness, the analogous logic in Go would simply store the data in a structure with named fields:
package main

type Student struct {
	StudentID int
	Age       int
}

func createDataframe(studentData [][]int) []Student {
	result := make([]Student, 0, len(studentData))
	for _, row := range studentData {
		result = append(result, Student{
			StudentID: row[0],
			Age:       row[1],
		})
	}
	return result
}
The Go version demonstrates the same transformation concept, converting each input row into a structured record. However, this is not a LeetCode-submittable solution for Problem 2877 because the actual problem specifically requires a Pandas DataFrame.
Worked Examples
Example 1
Input:
[
[1, 15],
[2, 11],
[3, 11],
[4, 20]
]
Pandas receives:
pd.DataFrame(
    student_data,
    columns=["student_id", "age"]
)
The rows are mapped as follows:
| Input Row | student_id | age |
|---|---|---|
| [1, 15] | 1 | 15 |
| [2, 11] | 2 | 11 |
| [3, 11] | 3 | 11 |
| [4, 20] | 4 | 20 |
Resulting DataFrame:
| student_id | age |
|---|---|
| 1 | 15 |
| 2 | 11 |
| 3 | 11 |
| 4 | 20 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Pandas processes each row once when creating the DataFrame |
| Space | O(n) | The resulting DataFrame stores all input rows |
The constructor must read every row from the input list in order to create the DataFrame structure. Therefore, the runtime grows linearly with the number of student records. The DataFrame itself stores all records, requiring linear space.
Test Cases
import pandas as pd

# Assumes createDataframe from the Python Solution above is already defined.

# Example from the problem statement
result = createDataframe([[1, 15], [2, 11], [3, 11], [4, 20]])
expected = pd.DataFrame(
    [[1, 15], [2, 11], [3, 11], [4, 20]],
    columns=["student_id", "age"]
)
assert result.equals(expected)  # Standard example

# Empty input
result = createDataframe([])
expected = pd.DataFrame([], columns=["student_id", "age"])
assert result.equals(expected)  # No students

# Single row
result = createDataframe([[5, 18]])
expected = pd.DataFrame([[5, 18]], columns=["student_id", "age"])
assert result.equals(expected)  # One student

# Duplicate ages
result = createDataframe([[1, 10], [2, 10], [3, 10]])
expected = pd.DataFrame(
    [[1, 10], [2, 10], [3, 10]],
    columns=["student_id", "age"]
)
assert result.equals(expected)  # Repeated age values

# Large IDs
result = createDataframe([[1000000, 21], [2000000, 22]])
expected = pd.DataFrame(
    [[1000000, 21], [2000000, 22]],
    columns=["student_id", "age"]
)
assert result.equals(expected)  # Large numeric values
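The checks above build the expected value with the same constructor call as the solution, so they are somewhat circular. One possible alternative, sketched here, builds the expected frame from a dictionary and compares with pandas.testing.assert_frame_equal, which raises an AssertionError describing the mismatch instead of returning a bare boolean. The function is redefined so the snippet runs on its own.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def createDataframe(student_data):
    return pd.DataFrame(student_data, columns=["student_id", "age"])


result = createDataframe([[1, 15], [2, 11]])

# Expected frame built column by column, independently of the solution's call.
expected = pd.DataFrame({"student_id": [1, 2], "age": [15, 11]})
assert_frame_equal(result, expected)

# A frame with a wrong column name should be rejected.
try:
    assert_frame_equal(result, expected.rename(columns={"age": "years"}))
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```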
Test Case Summary
| Test | Why |
|---|---|
| Standard example | Verifies normal behavior |
| Empty input | Ensures an empty DataFrame is created correctly |
| Single row | Validates the smallest non-empty input |
| Duplicate ages | Confirms repeated values are preserved |
| Large IDs | Ensures larger integers are handled correctly |
Edge Cases
Empty Input List
The input may contain no student records at all:
[]
A common mistake is assuming at least one row exists. Pandas handles this naturally. The implementation still creates a DataFrame with the required column names and zero rows.
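A quick sketch of what the empty case produces; the variable names are illustrative only.

```python
import pandas as pd

empty_df = pd.DataFrame([], columns=["student_id", "age"])

# Zero rows, but both required columns are still present.
shape = empty_df.shape
column_names = list(empty_df.columns)
is_empty = empty_df.empty
```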
Single Student Record
The input may contain exactly one student:
[[5, 18]]
Some implementations incorrectly treat a single row as a one dimensional structure. By directly passing the two dimensional list to the DataFrame constructor, the row is correctly preserved as a single record.
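To illustrate the difference, the sketch below contrasts the nested list [[5, 18]] with a flat list [5, 18]. With two column names, the flat list is interpreted as two rows of one value each, which does not fit the requested columns, so Pandas raises a ValueError.

```python
import pandas as pd

columns = ["student_id", "age"]

# Nested list: one row with two values.
single = pd.DataFrame([[5, 18]], columns=columns)
single_shape = single.shape

# Flat list: two rows of one value each, which conflicts with two column names.
try:
    pd.DataFrame([5, 18], columns=columns)
    flat_list_rejected = False
except ValueError:
    flat_list_rejected = True
```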
Duplicate Values
Multiple students may have the same age:
[
[1, 11],
[2, 11],
[3, 11]
]
A flawed implementation might accidentally aggregate or deduplicate data. The DataFrame constructor performs no such transformations and preserves every row exactly as provided.
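A brief sketch confirming that repeated ages do not collapse rows. Deduplication only happens when requested explicitly, for example with drop_duplicates, which is shown here purely for contrast.

```python
import pandas as pd

df = pd.DataFrame([[1, 11], [2, 11], [3, 11]], columns=["student_id", "age"])

# All three rows survive even though every age is identical.
row_count = len(df)
distinct_ages = df["age"].nunique()

# Explicit deduplication on age is a separate, opt-in step.
deduplicated_count = len(df.drop_duplicates(subset="age"))
```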
Large Numeric Values
Student IDs may be much larger than those shown in the examples:
[
[1000000, 21],
[2000000, 22]
]
Since the solution simply stores the values in a DataFrame without additional computation, large integer values are preserved correctly and require no special handling.
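As a rough check, the sketch below confirms that the values round-trip unchanged and that the column keeps an integer dtype. The exact dtype is typically int64, but the example only asserts that it is some integer type.

```python
import pandas as pd

large = pd.DataFrame(
    [[1000000, 21], [2000000, 22]],
    columns=["student_id", "age"],
)

# Values come back exactly as provided.
ids = large["student_id"].tolist()

# The column is stored with an integer dtype rather than float or object.
is_integer_column = pd.api.types.is_integer_dtype(large["student_id"])
```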