LeetCode 2886 - Change Data Type


Difficulty: 🟢 Easy

Solution

Problem Understanding

This problem provides a Pandas DataFrame named students with four columns:

| Column | Type |
| --- | --- |
| student_id | int |
| name | object |
| age | int |
| grade | float |

The task is straightforward: the grade column is incorrectly stored as floating-point values, and we must convert it into integer values.

In other words, the problem asks us to modify the DataFrame so that the grade column changes from a float type to an int type. The actual numeric values do not change conceptually, only the data type changes. For example, 73.0 should become 73, and 87.0 should become 87.

The input is a Pandas DataFrame where each row represents a student and contains their identifier, name, age, and grade. The expected output is the same DataFrame after correcting the grade column type.

Since this is a Pandas problem, the focus is not on algorithmic optimization but rather on correctly using DataFrame operations. The problem guarantees that the grade column contains float values that can safely be converted to integers. This means we do not need to worry about invalid values such as strings or missing data.

One edge case worth noting is an empty DataFrame: the conversion should run without errors and return an empty DataFrame whose grade column has an integer type. Another consideration is that the values are integer-valued floats like 73.0 rather than arbitrary decimals like 73.5. astype(int) truncates toward zero, so 73.5 would become 73, but because the problem guarantees integer-valued grades, no information is lost here.
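To make the assumptions above concrete, the following minimal sketch shows how astype(int) behaves on inputs the problem rules out. The Series names are illustrative only, and the exact error message may differ between Pandas versions.

```python
import numpy as np
import pandas as pd

# Integer-valued floats convert exactly.
clean = pd.Series([73.0, 87.0])
exact = clean.astype(int).tolist()

# Fractional values are truncated toward zero, not rounded.
fractional = pd.Series([73.5, 87.9, -1.5])
truncated = fractional.astype(int).tolist()

# A missing value cannot be stored in a NumPy integer column,
# so the conversion raises an error (a ValueError or a subclass of it).
with_missing = pd.Series([73.0, np.nan])
conversion_failed = False
try:
    with_missing.astype(int)
except ValueError:
    conversion_failed = True

print(exact, truncated, conversion_failed)
```

Because the problem guarantees clean, integer-valued grades, neither the truncation nor the error path is reached in the actual solution.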

Approaches

Brute Force Approach

A brute-force solution would iterate through every row of the DataFrame and manually convert each grade value into an integer. One way to do this would be using a loop or applying a function row by row.

For example, we could traverse each row, read the grade value, cast it using int(), and collect the results into a new list that then replaces the grade column. Writing each integer back into the existing float column cell by cell would not be enough, because Pandas keeps the column's float dtype and stores the values as floats again.

This approach is correct because every float value gets explicitly converted into an integer. However, it is inefficient in Pandas because Python-level row-by-row loops are much slower than vectorized operations, which process the entire column at once.
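A minimal sketch of the loop-based idea might look like the following. The function name change_datatype_brute_force is hypothetical and is not part of the LeetCode signature.

```python
import pandas as pd

def change_datatype_brute_force(students: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: convert one grade at a time in a Python loop.
    converted = []
    for value in students["grade"]:
        converted.append(int(value))
    # Assign the whole list at once so Pandas infers an integer dtype
    # for the replaced column.
    students["grade"] = converted
    return students

df = pd.DataFrame({
    "student_id": [1, 2],
    "name": ["Ava", "Kate"],
    "age": [6, 15],
    "grade": [73.0, 87.0],
})
result = change_datatype_brute_force(df)
print(result["grade"].dtype)
```

The loop produces the right values, but every conversion runs as an individual Python call, which is the overhead the vectorized approach avoids.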

Optimal Approach

The key insight is that Pandas provides a built-in vectorized method, astype(), for converting an entire column's data type efficiently.

Instead of processing rows individually, we can directly convert the entire grade column:

students["grade"] = students["grade"].astype(int)

This is both cleaner and more idiomatic in Pandas. Internally, Pandas performs the conversion efficiently across the whole column.

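| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Brute Force | O(n) | O(n) | Converts each grade with int() in a Python loop and builds a new column |
| Optimal | O(n) | O(n) | Uses the vectorized astype(int) conversion, which allocates a new integer column |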

Even though both approaches have the same asymptotic complexity, the vectorized Pandas solution is significantly faster in practice and much more concise.

Algorithm Walkthrough

  1. Access the grade column in the students DataFrame. This column currently contains floating-point values.
  2. Use Pandas' astype(int) method to convert the entire column into integers. This works column-wise, which is more efficient than converting values individually.
  3. Assign the converted column back to students["grade"]. This replaces the column on the DataFrame object that was passed in, so the caller's DataFrame is modified as well.
  4. Return the modified DataFrame.
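The steps above can be traced on a small DataFrame. This is only an illustrative sketch; the intermediate variable names are not part of the required solution.

```python
import pandas as pd

students = pd.DataFrame({
    "student_id": [1, 2],
    "name": ["Ava", "Kate"],
    "age": [6, 15],
    "grade": [73.0, 87.0],
})

# Step 1: access the grade column (still floating point).
grade = students["grade"]
dtype_before = grade.dtype

# Step 2: convert the whole column at once.
converted = grade.astype(int)

# Step 3: assign the converted column back to the DataFrame.
students["grade"] = converted
dtype_after = students["grade"].dtype

# Step 4: the DataFrame now carries an integer grade column.
print(dtype_before, "->", dtype_after)
```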

Why it works

The algorithm works because astype(int) guarantees that every value in the grade column is converted from a float to an integer type. Since the problem guarantees valid float representations of integers, the transformation preserves the intended grade values while correcting the column's data type.

Python Solution

import pandas as pd

def changeDatatype(students: pd.DataFrame) -> pd.DataFrame:
    students["grade"] = students["grade"].astype(int)
    return students

The implementation is intentionally simple because Pandas already provides the exact operation we need.

First, we access the grade column using students["grade"]. We then call .astype(int) to convert the entire column from floating-point values to integers. Finally, we assign the transformed column back into the DataFrame and return the modified result.

This directly follows the algorithm discussed earlier and leverages Pandas vectorized operations for efficiency and readability.

Go Solution

LeetCode Pandas problems are language-specific and are intended to be solved in Python using Pandas. There is no official Go function signature or Go submission format for this problem because the execution environment expects a Pandas DataFrame operation.

For reference, the conceptual equivalent in Go would involve iterating through records and converting floating-point grades to integers manually:

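package main

import "fmt"

// Student mirrors one input row, with the grade stored as a float.
type Student struct {
	StudentID int
	Name      string
	Age       int
	Grade     float64
}

// ResultStudent mirrors one output row, with the grade stored as an int.
type ResultStudent struct {
	StudentID int
	Name      string
	Age       int
	Grade     int
}

// changeDatatype copies each record and converts its grade to an int.
// Note: int() truncates toward zero, so 73.9 would become 73.
func changeDatatype(students []Student) []ResultStudent {
	result := make([]ResultStudent, len(students))

	for i, student := range students {
		result[i] = ResultStudent{
			StudentID: student.StudentID,
			Name:      student.Name,
			Age:       student.Age,
			Grade:     int(student.Grade),
		}
	}

	return result
}

// main runs the conversion on the example rows so the file builds and runs.
func main() {
	students := []Student{{1, "Ava", 6, 73.0}, {2, "Kate", 15, 87.0}}
	fmt.Println(changeDatatype(students))
}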

The main difference from Python is that Go does not have a built-in DataFrame abstraction like Pandas. Instead, we represent rows as structs and explicitly cast float64 values to int during iteration.

Worked Examples

Example 1

Input DataFrame

| student_id | name | age | grade |
| --- | --- | --- | --- |
| 1 | Ava | 6 | 73.0 |
| 2 | Kate | 15 | 87.0 |

Step 1: Access the grade column

Current values:

| Index | grade |
| --- | --- |
| 0 | 73.0 |
| 1 | 87.0 |

Step 2: Convert using astype(int)

After conversion:

| Index | grade |
| --- | --- |
| 0 | 73 |
| 1 | 87 |

Step 3: Assign back to the DataFrame

Final DataFrame:

| student_id | name | age | grade |
| --- | --- | --- | --- |
| 1 | Ava | 6 | 73 |
| 2 | Kate | 15 | 87 |

The grade column now has integer values instead of floats.

Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(n) | Every value in the grade column must be processed once |
| Space | O(n) | astype(int) allocates a new integer column of the same length |

The time complexity is O(n) because Pandas must examine and convert each value in the grade column exactly once. The space complexity is O(n) because float storage cannot be reinterpreted as integers: astype(int) builds a new integer array with one entry per row, which then replaces the old column. No other data structure proportional to the input size is created.
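One way to probe the memory behavior of the conversion is to check whether the converted column reuses the source buffer. The sketch below uses a standalone Series with illustrative names and NumPy's shares_memory helper.

```python
import numpy as np
import pandas as pd

grades = pd.Series([73.0, 87.0, 91.0])
converted = grades.astype(int)

# The source Series keeps its float dtype; astype returned a separate object.
source_dtype = str(grades.dtype)

# The float and integer buffers are distinct allocations.
shares_buffer = np.shares_memory(grades.to_numpy(), converted.to_numpy())

print(source_dtype, converted.dtype, shares_buffer)
```

In this sketch the converted data lives in its own array, so the conversion needs additional memory proportional to the number of rows.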

Test Cases

import pandas as pd

def changeDatatype(students: pd.DataFrame) -> pd.DataFrame:
    students["grade"] = students["grade"].astype(int)
    return students

# Example case
df = pd.DataFrame({
    "student_id": [1, 2],
    "name": ["Ava", "Kate"],
    "age": [6, 15],
    "grade": [73.0, 87.0]
})

result = changeDatatype(df)

assert result["grade"].tolist() == [73, 87]  # Example from problem

# Empty DataFrame
df = pd.DataFrame({
    "student_id": [],
    "name": [],
    "age": [],
    "grade": []
})

result = changeDatatype(df)

assert result.empty  # Empty input should remain empty

# Single student
df = pd.DataFrame({
    "student_id": [1],
    "name": ["Bob"],
    "age": [10],
    "grade": [100.0]
})

result = changeDatatype(df)

assert result["grade"].tolist() == [100]  # Single row conversion

# Multiple rows
df = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["A", "B", "C"],
    "age": [7, 8, 9],
    "grade": [50.0, 75.0, 99.0]
})

result = changeDatatype(df)

assert result["grade"].tolist() == [50, 75, 99]  # Multiple conversions

# Verify integer dtype
df = pd.DataFrame({
    "student_id": [1],
    "name": ["Alice"],
    "age": [12],
    "grade": [88.0]
})

result = changeDatatype(df)

assert str(result["grade"].dtype).startswith("int")  # Ensure dtype changed

| Test | Why |
| --- | --- |
| Example input | Validates correctness against the provided example |
| Empty DataFrame | Ensures no failure on empty input |
| Single student | Verifies minimal valid input |
| Multiple rows | Confirms conversion across several records |
| Integer dtype validation | Ensures the column type truly changes |

Edge Cases

Empty DataFrame

An empty DataFrame may contain the correct columns but no rows. Some implementations fail when assuming data exists. Using astype(int) on an empty column works correctly, so the function safely returns an empty DataFrame without errors.

Single Row Input

A DataFrame with only one student is a minimal valid case. Bugs sometimes occur when implementations incorrectly assume multiple rows exist. Since Pandas column operations work uniformly regardless of size, the conversion behaves identically for one row.

Already Integer-Like Float Values

The grade column contains values such as 73.0 and 87.0, which are floating-point representations of integers. A naive implementation might unnecessarily parse strings or introduce rounding logic. Our solution avoids this complexity by using astype(int), which directly converts valid numeric floats into integers exactly as required.
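As a related variation, DataFrame.astype also accepts a dictionary that maps column names to target types. The sketch below, using the hypothetical name change_datatype_copy, converts only the grade column and returns a new DataFrame instead of modifying the argument. It is shown as an alternative, not as the expected submission.

```python
import pandas as pd

def change_datatype_copy(students: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical variant: the dict form converts only the named column
    # and returns a new DataFrame, leaving the input untouched.
    return students.astype({"grade": int})

original = pd.DataFrame({
    "student_id": [1, 2],
    "name": ["Ava", "Kate"],
    "age": [6, 15],
    "grade": [73.0, 87.0],
})
converted = change_datatype_copy(original)
original_dtype = str(original["grade"].dtype)

print(original_dtype, converted["grade"].dtype)
```

This form may be preferable when the caller's DataFrame should keep its original types.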