LeetCode 618 - Students Report By Geography

The problem asks us to transform a table of student names and their continents into a pivoted report, where each continent becomes a column and the student names appear alphabetically under their respective continents.

LeetCode Problem 618

Difficulty: 🔴 Hard
Topics: Database

Solution

Problem Understanding

The problem asks us to transform a table of student names and their continents into a pivoted report, where each continent becomes a column and the student names appear alphabetically under their respective continents. The input is a Student table containing two columns: name and continent. This table may include duplicate entries, which should not affect the pivot result except that each row represents a student.

The expected output is a table with three columns: America, Asia, and Europe. Under each column, the names of students from that continent are listed in alphabetical order. Rows are aligned such that if one continent has more students than another, the shorter columns have NULL values for missing rows.

Constraints tell us that America will have at least as many students as the other continents, which simplifies row alignment, but the follow-up asks for a more general solution where continent counts are unknown. Edge cases include empty continents, duplicates, and varying numbers of students per continent.

Approaches

The brute-force approach is to manually query each continent, sort the students alphabetically, and then join the results row by row. While this works logically, it requires multiple self-joins or manual indexing to align the rows, which is cumbersome and difficult to scale when continent counts are unknown.

The optimal solution uses window functions with ROW_NUMBER() to assign a sequential index to each student per continent, sorted alphabetically. After indexing, a full outer join (or left joins if one continent is guaranteed to have the most students) aligns students by their row numbers. This guarantees correct alphabetical order and row alignment, while easily extending to unknown student counts for each continent.

Approach Time Complexity Space Complexity Notes
Brute Force O(n log n) per continent + O(n) joins O(n) Sort and manually align, cumbersome with unknown continent sizes
Optimal O(n log n) O(n) Use ROW_NUMBER() and pivot join for dynamic alignment

Algorithm Walkthrough

  1. Use a ROW_NUMBER() window function to assign an index to each student within their continent group, sorted alphabetically. This gives a unique row number for each student per continent.
  2. Separate the students into three derived tables (or CTEs) by continent. Each table will have the student's name and their row index.
  3. Perform a full outer join on the row index across all three continent tables. This aligns students in the same row index together.
  4. Select the name columns from each continent table and alias them as America, Asia, and Europe.
  5. Order the final result by the row index to maintain alignment.

Why it works: The ROW_NUMBER() guarantees that students are sorted alphabetically within their continent. Joining on the row index ensures that the first student in each continent aligns in the first row, the second student in the second row, and so on. Missing values are automatically filled with NULL during the join.

Python Solution

class Solution:
    def pivotStudents(self, conn) -> None:
        """
        Execute SQL on the given connection to pivot students by continent.
        """
        query = """
        WITH ranked AS (
            SELECT
                name,
                continent,
                ROW_NUMBER() OVER (PARTITION BY continent ORDER BY name) AS rn
            FROM Student
        )
        SELECT
            a.name AS America,
            b.name AS Asia,
            c.name AS Europe
        FROM
            (SELECT name, rn FROM ranked WHERE continent = 'America') a
        LEFT JOIN
            (SELECT name, rn FROM ranked WHERE continent = 'Asia') b
        ON a.rn = b.rn
        LEFT JOIN
            (SELECT name, rn FROM ranked WHERE continent = 'Europe') c
        ON a.rn = c.rn
        ORDER BY a.rn;
        """
        with conn.cursor() as cursor:
            cursor.execute(query)
            result = cursor.fetchall()
            for row in result:
                print(row)

In this Python implementation, a Common Table Expression (CTE) ranked is used to assign row numbers. Separate subqueries for each continent are left-joined on the row number. The ORDER BY a.rn ensures that the rows are correctly aligned.

Go Solution

package main

import (
    "database/sql"
    "fmt"
    _ "github.com/go-sql-driver/mysql"
)

func PivotStudents(db *sql.DB) error {
    query := `
    WITH ranked AS (
        SELECT
            name,
            continent,
            ROW_NUMBER() OVER (PARTITION BY continent ORDER BY name) AS rn
        FROM Student
    )
    SELECT
        a.name AS America,
        b.name AS Asia,
        c.name AS Europe
    FROM
        (SELECT name, rn FROM ranked WHERE continent = 'America') a
    LEFT JOIN
        (SELECT name, rn FROM ranked WHERE continent = 'Asia') b
    ON a.rn = b.rn
    LEFT JOIN
        (SELECT name, rn FROM ranked WHERE continent = 'Europe') c
    ON a.rn = c.rn
    ORDER BY a.rn;
    `
    rows, err := db.Query(query)
    if err != nil {
        return err
    }
    defer rows.Close()
    for rows.Next() {
        var america, asia, europe sql.NullString
        if err := rows.Scan(&america, &asia, &europe); err != nil {
            return err
        }
        fmt.Println(america.String, asia.String, europe.String)
    }
    return rows.Err()
}

In Go, sql.NullString is used to safely handle NULL values. The logic mirrors Python, and the WITH CTE combined with LEFT JOIN aligns rows correctly.

Worked Examples

Input Table

name continent
Jane America
Pascal Europe
Xi Asia
Jack America

Step 1: Assign row numbers

name continent rn
Jack America 1
Jane America 2
Xi Asia 1
Pascal Europe 1

Step 2: Separate by continent

America: (Jack, 1), (Jane, 2)

Asia: (Xi, 1)

Europe: (Pascal, 1)

Step 3: Join on rn

rn America Asia Europe
1 Jack Xi Pascal
2 Jane NULL NULL

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Sorting students alphabetically per continent dominates
Space O(n) Storing row numbers and intermediate tables per continent

The CTE with ROW_NUMBER() and joins requires memory proportional to the number of students, and the sort operation within each partition is O(n log n).

Test Cases

# Provided example
assert pivotStudents([("Jane","America"),("Pascal","Europe"),("Xi","Asia"),("Jack","America")]) == [("Jack","Xi","Pascal"),("Jane",None,None)]

# Empty table
assert pivotStudents([]) == []

# One student per continent
assert pivotStudents([("Alice","America"),("Bob","Asia"),("Charlie","Europe")]) == [("Alice","Bob","Charlie")]

# Multiple students, duplicates
assert pivotStudents([("Tom","America"),("Tom","America"),("Jerry","Asia")]) == [("Tom","Jerry",None),("Tom",None,None)]
Test Why
Provided example Validates basic pivot with unequal lengths
Empty table Ensures code handles no data
One student per continent Validates minimal non-empty case
Multiple students, duplicates Ensures duplicates are handled correctly

Edge Cases

One edge case is when a continent has no students. Using LEFT JOIN or FULL OUTER JOIN ensures missing rows are returned as NULL without error.

Another edge case is duplicate student names. Since ROW_NUMBER() operates on the combination of continent and alphabetical order, duplicates are assigned distinct row numbers, preventing collisions.

A third edge case is when America does not have the most students, as allowed in the follow-up. The optimal solution can extend to a dynamic pivot using FULL OUTER JOIN and a row number max over all continents, ensuring correct alignment regardless of the largest continent.