LeetCode 3338 - Second Highest Salary II
This problem asks us to analyze employee salary data within departments and extract all employees earning the second-highest salary in each department. The input is a table employees with three columns: empid, salary, and dept.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
This problem asks us to analyze employee salary data within departments and extract all employees earning the second-highest salary in each department. The input is a table employees with three columns: emp_id, salary, and dept. Each emp_id is unique, but multiple employees may share the same salary in the same department.
The output should list all employees with the second-highest salary in their department, sorted by emp_id in ascending order. Departments with fewer than two distinct salaries will not contribute any rows to the output. The key constraints include handling duplicate salaries, ensuring all qualifying employees are included, and respecting the ordering requirement.
Important edge cases include: a department with only one employee, departments where multiple employees share the highest or second-highest salary, and departments with consecutive identical salaries that could complicate ranking.
Approaches
The brute-force approach is straightforward: for each department, extract all salaries, sort them in descending order, identify the second-highest salary, and then collect all employees who earn it. While correct, this approach requires multiple nested queries or iterations for every department, making it inefficient for large datasets.
The optimal approach leverages SQL window functions, specifically DENSE_RANK(). By partitioning the table by dept and ordering salaries in descending order, we can assign a rank to each salary within a department. Employees with rank 2 correspond to the second-highest salary. This approach handles duplicates naturally and avoids repeated scans of the data.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n log n) per department | O(n) | Sort salaries per department and select second-highest |
| Optimal | O(n) | O(n) | Use SQL window functions (DENSE_RANK) to rank salaries efficiently |
Algorithm Walkthrough
- Partition employees by department: We logically group the rows by
deptso that salary comparisons happen only within the same department. - Rank salaries using DENSE_RANK: Within each department, order employees by
salarydescending and assign adense_rank. The highest salary gets rank 1, the second-highest gets rank 2.DENSE_RANKhandles ties naturally, giving the same rank to equal salaries. - Filter for rank 2: Once all employees have ranks, select only those whose rank equals 2.
- Order results by emp_id: Finally, sort the filtered results by
emp_idascending to meet the output requirement.
Why it works: DENSE_RANK ensures that all employees sharing the same salary receive the same rank. By selecting rank 2, we precisely capture all employees with the second-highest salary per department. This handles duplicates and single-employee departments correctly, because a department with fewer than two distinct salaries will have no rank 2 assigned.
Python Solution
# Python solution using SQLAlchemy for illustration
from typing import List, Tuple
import sqlite3
def second_highest_salary(conn: sqlite3.Connection) -> List[Tuple[int, str]]:
query = """
SELECT emp_id, dept
FROM (
SELECT emp_id, dept, DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary DESC) AS rnk
FROM employees
) sub
WHERE rnk = 2
ORDER BY emp_id ASC;
"""
cursor = conn.cursor()
cursor.execute(query)
return cursor.fetchall()
This implementation constructs a subquery that computes the DENSE_RANK of salaries per department. The outer query filters for the second-highest rank and orders the results. sqlite3 is used for database interaction, but the query works in any SQL-compliant system.
Go Solution
package main
import (
"database/sql"
_ "github.com/go-sql-driver/mysql"
"log"
)
type Employee struct {
EmpID int
Dept string
}
func SecondHighestSalary(db *sql.DB) ([]Employee, error) {
query := `
SELECT emp_id, dept
FROM (
SELECT emp_id, dept, DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary DESC) AS rnk
FROM employees
) sub
WHERE rnk = 2
ORDER BY emp_id ASC;
`
rows, err := db.Query(query)
if err != nil {
return nil, err
}
defer rows.Close()
var results []Employee
for rows.Next() {
var e Employee
if err := rows.Scan(&e.EmpID, &e.Dept); err != nil {
return nil, err
}
results = append(results, e)
}
return results, nil
}
In Go, we handle the SQL query similarly, but we explicitly iterate over the rows, scanning each result into a struct. Error handling is necessary for database operations. Go requires more boilerplate than Python for database interaction but uses the same underlying SQL logic.
Worked Examples
Example: Sales Department
- Extract employees: emp_id 1-4 with salaries 70000, 80000, 80000, 90000
- Rank salaries descending: 90000 → 1, 80000 → 2, 80000 → 2, 70000 → 3
- Filter rank 2: emp_id 2, 3
- Result sorted by emp_id: 2, 3
Example: IT Department
- Employees: emp_id 5-7 with salaries 55000, 65000, 65000
- Rank salaries descending: 65000 → 1, 65000 → 1, 55000 → 2
- Filter rank 2: emp_id 5
- Sorted result: 5
The same logic applies for Marketing and HR.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each row is processed once for ranking and filtering; no nested loops over departments |
| Space | O(n) | The intermediate table storing ranks has the same size as the original table |
The time complexity is effectively linear in the number of rows because window functions operate efficiently in modern SQL engines. Space complexity is proportional to the table size due to the additional rank column.
Test Cases
# Test cases
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute("""
CREATE TABLE employees(
emp_id INTEGER PRIMARY KEY,
salary INTEGER,
dept TEXT
)
""")
# Example input
conn.executemany("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", [
(1, 70000, "Sales"),
(2, 80000, "Sales"),
(3, 80000, "Sales"),
(4, 90000, "Sales"),
(5, 55000, "IT"),
(6, 65000, "IT"),
(7, 65000, "IT"),
(8, 50000, "Marketing"),
(9, 55000, "Marketing"),
(10, 55000, "HR"),
])
result = second_highest_salary(conn)
assert result == [(2, "Sales"), (3, "Sales"), (5, "IT"), (8, "Marketing")] # Example output
# Single employee department
conn.execute("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", (11, 100000, "Legal"))
result = second_highest_salary(conn)
assert all(emp[1] != "Legal" for emp in result) # Legal should not appear
# Multiple second-highest salaries
conn.executemany("INSERT INTO employees(emp_id, salary, dept) VALUES (?, ?, ?)", [
(12, 70000, "IT"),
(13, 55000, "IT")
])
result = second_highest_salary(conn)
assert (5, "IT") in result and (13, "IT") in result # Both employees with second-highest salary
| Test | Why |
|---|---|
| Example input | Validates the basic functionality and ranking logic |
| Single employee department | Ensures departments with fewer than two distinct salaries are excluded |
| Multiple second-highest salaries | Checks handling of duplicate salaries at the second-highest rank |
Edge Cases
One edge case is a department with only a single employee. This department has no second-highest salary, and the implementation correctly excludes it because no rank 2 exists.
Another edge case involves multiple employees sharing the highest salary. Without DENSE_RANK, naive ranking could mistakenly treat the second-highest incorrectly. Using DENSE_RANK ensures that all top salaries receive rank 1, and the next distinct salary receives rank 2.
A third edge case is when all employees in a department have the same salary. In this case, there is no second-highest salary, and the query correctly produces no rows for that department. This behavior prevents false positives and aligns with the problem requirement.