LeetCode 3338 - Second Highest Salary II

This problem asks us to analyze employee salary data within departments and extract all employees earning the second-highest salary in each department. The input is a table employees with three columns: empid, salary, and dept.

LeetCode Problem 3338

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

This problem asks us to analyze employee salary data within departments and extract all employees earning the second-highest salary in each department. The input is a table employees with three columns: emp_id, salary, and dept. Each emp_id is unique, but multiple employees may share the same salary in the same department.

The output should list all employees with the second-highest salary in their department, sorted by emp_id in ascending order. Departments with fewer than two distinct salaries will not contribute any rows to the output. The key constraints include handling duplicate salaries, ensuring all qualifying employees are included, and respecting the ordering requirement.

Important edge cases include: a department with only one employee, departments where multiple employees share the highest or second-highest salary, and departments with consecutive identical salaries that could complicate ranking.

Approaches

The brute-force approach is straightforward: for each department, extract all salaries, sort them in descending order, identify the second-highest salary, and then collect all employees who earn it. While correct, this approach requires multiple nested queries or iterations for every department, making it inefficient for large datasets.

The optimal approach leverages SQL window functions, specifically DENSE_RANK(). By partitioning the table by dept and ordering salaries in descending order, we can assign a rank to each salary within a department. Employees with rank 2 correspond to the second-highest salary. This approach handles duplicates naturally and avoids repeated scans of the data.

Approach Time Complexity Space Complexity Notes
Brute Force O(n log n) per department O(n) Sort salaries per department and select second-highest
Optimal O(n) O(n) Use SQL window functions (DENSE_RANK) to rank salaries efficiently

Algorithm Walkthrough

  1. Partition employees by department: We logically group the rows by dept so that salary comparisons happen only within the same department.
  2. Rank salaries using DENSE_RANK: Within each department, order employees by salary descending and assign a dense_rank. The highest salary gets rank 1, the second-highest gets rank 2. DENSE_RANK handles ties naturally, giving the same rank to equal salaries.
  3. Filter for rank 2: Once all employees have ranks, select only those whose rank equals 2.
  4. Order results by emp_id: Finally, sort the filtered results by emp_id ascending to meet the output requirement.

Why it works: DENSE_RANK ensures that all employees sharing the same salary receive the same rank. By selecting rank 2, we precisely capture all employees with the second-highest salary per department. This handles duplicates and single-employee departments correctly, because a department with fewer than two distinct salaries will have no rank 2 assigned.

Python Solution

# Python solution using SQLAlchemy for illustration
from typing import List, Tuple
import sqlite3

def second_highest_salary(conn: sqlite3.Connection) -> List[Tuple[int, str]]:
    query = """
    SELECT emp_id, dept
    FROM (
        SELECT emp_id, dept, DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary DESC) AS rnk
        FROM employees
    ) sub
    WHERE rnk = 2
    ORDER BY emp_id ASC;
    """
    cursor = conn.cursor()
    cursor.execute(query)
    return cursor.fetchall()

This implementation constructs a subquery that computes the DENSE_RANK of salaries per department. The outer query filters for the second-highest rank and orders the results. sqlite3 is used for database interaction, but the query works in any SQL-compliant system.

Go Solution

package main

import (
    "database/sql"
    _ "github.com/go-sql-driver/mysql"
    "log"
)

type Employee struct {
    EmpID int
    Dept  string
}

func SecondHighestSalary(db *sql.DB) ([]Employee, error) {
    query := `
        SELECT emp_id, dept
        FROM (
            SELECT emp_id, dept, DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary DESC) AS rnk
            FROM employees
        ) sub
        WHERE rnk = 2
        ORDER BY emp_id ASC;
    `
    rows, err := db.Query(query)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var results []Employee
    for rows.Next() {
        var e Employee
        if err := rows.Scan(&e.EmpID, &e.Dept); err != nil {
            return nil, err
        }
        results = append(results, e)
    }
    return results, nil
}

In Go, we handle the SQL query similarly, but we explicitly iterate over the rows, scanning each result into a struct. Error handling is necessary for database operations. Go requires more boilerplate than Python for database interaction but uses the same underlying SQL logic.

Worked Examples

Example: Sales Department

  1. Extract employees: emp_id 1-4 with salaries 70000, 80000, 80000, 90000
  2. Rank salaries descending: 90000 → 1, 80000 → 2, 80000 → 2, 70000 → 3
  3. Filter rank 2: emp_id 2, 3
  4. Result sorted by emp_id: 2, 3

Example: IT Department

  1. Employees: emp_id 5-7 with salaries 55000, 65000, 65000
  2. Rank salaries descending: 65000 → 1, 65000 → 1, 55000 → 2
  3. Filter rank 2: emp_id 5
  4. Sorted result: 5

The same logic applies for Marketing and HR.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each row is processed once for ranking and filtering; no nested loops over departments
Space O(n) The intermediate table storing ranks has the same size as the original table

The time complexity is effectively linear in the number of rows because window functions operate efficiently in modern SQL engines. Space complexity is proportional to the table size due to the additional rank column.

Test Cases

# Test cases
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
CREATE TABLE employees(
    emp_id INTEGER PRIMARY KEY,
    salary INTEGER,
    dept TEXT
)
""")

# Example input
conn.executemany("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", [
    (1, 70000, "Sales"),
    (2, 80000, "Sales"),
    (3, 80000, "Sales"),
    (4, 90000, "Sales"),
    (5, 55000, "IT"),
    (6, 65000, "IT"),
    (7, 65000, "IT"),
    (8, 50000, "Marketing"),
    (9, 55000, "Marketing"),
    (10, 55000, "HR"),
])

result = second_highest_salary(conn)
assert result == [(2, "Sales"), (3, "Sales"), (5, "IT"), (8, "Marketing")]  # Example output

# Single employee department
conn.execute("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", (11, 100000, "Legal"))
result = second_highest_salary(conn)
assert all(emp[1] != "Legal" for emp in result)  # Legal should not appear

# Multiple second-highest salaries
conn.executemany("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", [
    (12, 70000, "IT"),
    (13, 55000, "IT")
])
result = second_highest_salary(conn)
assert (5, "IT") in result and (13, "IT") in result  # Both employees with second-highest salary
Test Why
Example input Validates the basic functionality and ranking logic
Single employee department Ensures departments with fewer than two distinct salaries are excluded
Multiple second-highest salaries Checks handling of duplicate salaries at the second-highest rank

Edge Cases

One edge case is a department with only a single employee. This department has no second-highest salary, and the implementation correctly excludes it because no rank 2 exists.

Another edge case involves multiple employees sharing the highest salary. Without DENSE_RANK, naive ranking could mistakenly treat the second-highest incorrectly. Using DENSE_RANK ensures that all top salaries receive rank 1, and the next distinct salary receives rank 2.

A third edge case is when all employees in a department have the same salary. In this case, there is no second-highest salary, and the query correctly produces no rows for that department. This behavior prevents false positives and aligns with the problem requirement.