LeetCode 176 - Second Highest Salary
This problem asks us to retrieve the second highest distinct salary from the Employee table. The key word here is distinct. We are not looking for the second row after sorting salaries, we are looking for the second unique salary value.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
This problem asks us to retrieve the second highest distinct salary from the Employee table.
The key word here is distinct. We are not looking for the second row after sorting salaries, we are looking for the second unique salary value. This distinction matters because duplicate salaries should only count once.
For example, if the salaries are:
[100, 200, 200, 300]
The distinct salary values are:
[100, 200, 300]
The second highest distinct salary is 200, not the second row after sorting.
The input is a database table named Employee with two columns:
| Column | Meaning |
|---|---|
id |
Unique employee identifier |
salary |
Employee salary |
The expected output is a single-column table:
+---------------------+
| SecondHighestSalary |
+---------------------+
| value or null |
+---------------------+
If there is no second highest distinct salary, meaning there is only one unique salary value or the table is empty, we must return null.
A few important observations emerge immediately:
- We only care about distinct salary values
- We need the second highest, not the second row
- If fewer than two distinct salaries exist, we return
null - The output column name must be exactly
SecondHighestSalary
Because this is a database problem, the challenge is mostly about expressing the logic cleanly in SQL.
Important edge cases include duplicate salaries, tables with only one employee, and cases where all employees have the same salary. A naive implementation that simply sorts and picks the second row would fail in these situations.
Approaches
Brute Force Approach
A brute-force approach would first retrieve all salaries, remove duplicates manually, sort them in descending order, and then pick the second value if it exists.
In SQL terms, this could mean fetching all rows, performing application-side processing, or writing a query that unnecessarily materializes extra intermediate results.
This approach works because sorting all unique salaries guarantees that the second element corresponds to the second highest salary. However, it is inefficient because we process more information than necessary and rely on external logic.
Optimal Approach
The key observation is that SQL already provides mechanisms for sorting, deduplication, and limiting rows.
We can first extract distinct salaries, sort them in descending order, and then retrieve the second entry using LIMIT with an offset.
However, there is a subtle issue: if no second salary exists, the query must still return a row containing null.
To handle this elegantly, we wrap the query inside a scalar subquery. If the subquery finds no second value, SQL automatically returns NULL.
A clean solution looks like this:
SELECT (
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1
) AS SecondHighestSalary;
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n log n) | O(n) | Extracts salaries, removes duplicates, sorts manually |
| Optimal | O(n log n) | O(n) | Uses SQL sorting and distinct handling directly |
Although both approaches involve sorting and therefore have similar asymptotic complexity, the optimal solution is cleaner, database-native, and requires far less manual logic.
Algorithm Walkthrough
- Select all salary values from the
Employeetable. - Remove duplicate salary values using
DISTINCT. This ensures repeated salaries do not incorrectly affect ranking. - Sort the remaining unique salaries in descending order so the highest salary appears first.
- Skip the first result using
OFFSET 1. This removes the maximum salary from consideration. - Use
LIMIT 1to retrieve exactly one row, which represents the second highest salary. - Wrap the query in a scalar subquery and alias the result as
SecondHighestSalary. If no second salary exists, SQL automatically returnsNULL.
Why it works
The correctness relies on the fact that sorting distinct salaries in descending order produces a ranked ordering of unique salary values. Since duplicates are removed first, the second item in this ordering must be the second highest distinct salary. If the ordering contains fewer than two elements, the scalar subquery naturally evaluates to NULL, which matches the problem requirement.
Python Solution
LeetCode Database problems use SQL rather than executable Python code. Since the requested template includes Python, the closest equivalent implementation is shown below to demonstrate the same algorithmic logic.
from typing import List, Optional
class Solution:
def secondHighestSalary(self, salaries: List[int]) -> Optional[int]:
distinct_salaries = sorted(set(salaries), reverse=True)
if len(distinct_salaries) < 2:
return None
return distinct_salaries[1]
The implementation begins by converting the salary list into a set, which removes duplicates. We then sort the resulting unique values in descending order so the largest salaries appear first.
After sorting, we check whether at least two distinct salary values exist. If not, the function returns None, matching the problem requirement of null.
If two or more unique salaries exist, the second element of the sorted list is returned.
For the actual LeetCode submission, the SQL solution is:
SELECT (
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1
) AS SecondHighestSalary;
Go Solution
Similarly, Go is not used for LeetCode Database problems, but here is the equivalent algorithmic implementation.
package main
import "sort"
func secondHighestSalary(salaries []int) *int {
distinctSet := make(map[int]bool)
for _, salary := range salaries {
distinctSet[salary] = true
}
distinctSalaries := make([]int, 0, len(distinctSet))
for salary := range distinctSet {
distinctSalaries = append(distinctSalaries, salary)
}
sort.Sort(sort.Reverse(sort.IntSlice(distinctSalaries)))
if len(distinctSalaries) < 2 {
return nil
}
result := distinctSalaries[1]
return &result
}
The Go implementation mirrors the Python logic. A map is used to simulate a set because Go does not provide a built-in set type. After deduplication, the salaries are sorted in descending order.
The main Go-specific difference is handling nil versus values. Since the absence of a second salary corresponds to null, the function returns a pointer to an integer (*int). Returning nil represents the missing result cleanly.
Worked Examples
Example 1
Input:
[100, 200, 300]
Step-by-step execution
| Step | State |
|---|---|
| Original salaries | [100, 200, 300] |
| Remove duplicates | {100, 200, 300} |
| Sort descending | [300, 200, 100] |
| Take second value | 200 |
Output:
200
Example 2
Input:
[100]
Step-by-step execution
| Step | State |
|---|---|
| Original salaries | [100] |
| Remove duplicates | {100} |
| Sort descending | [100] |
| Check length | Only one salary exists |
| Return | None |
Output:
null
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Removing duplicates is O(n), sorting dominates the runtime |
| Space | O(n) | Extra space is used for distinct salary storage |
The dominant operation is sorting the unique salary values. Even though duplicate removal is linear, sorting requires O(n log n) time in the worst case. Additional space is needed to store the unique salary values.
For the SQL implementation, database indexing and execution engines may optimize performance internally, but conceptually the query still performs sorting over distinct salary values.
Test Cases
def second_highest_salary(salaries):
distinct_salaries = sorted(set(salaries), reverse=True)
if len(distinct_salaries) < 2:
return None
return distinct_salaries[1]
assert second_highest_salary([100, 200, 300]) == 200 # Basic example
assert second_highest_salary([100]) is None # Single salary
assert second_highest_salary([100, 100]) is None # Duplicate salaries only
assert second_highest_salary([100, 200, 200, 300]) == 200 # Duplicates ignored
assert second_highest_salary([50, 50, 40, 30]) == 40 # Repeated max salary
assert second_highest_salary([10, 20]) == 10 # Exactly two salaries
assert second_highest_salary([]) is None # Empty input
assert second_highest_salary([500, 400, 300, 200, 100]) == 400 # Multiple values
Test Summary
| Test | Why |
|---|---|
[100, 200, 300] |
Validates standard case |
[100] |
Tests missing second salary |
[100, 100] |
Ensures duplicates do not count |
[100, 200, 200, 300] |
Verifies distinct logic |
[50, 50, 40, 30] |
Checks repeated maximum salary |
[10, 20] |
Tests smallest valid case |
[] |
Tests empty input |
[500, 400, 300, 200, 100] |
Tests larger ordering |
Edge Cases
Only One Distinct Salary
A common mistake is assuming the table always contains at least two salary values. If there is only one unique salary, such as:
[100]
or
[100, 100]
there is no second highest salary. The implementation correctly returns NULL in SQL and None in Python.
Duplicate Salaries
Duplicates can easily break naive implementations. Consider:
[100, 200, 200, 300]
If we simply sort and take the second row, we would incorrectly return 200 for the wrong reason, or fail on other cases like [300, 300, 200]. By applying DISTINCT first, duplicates are removed before ranking.
Empty Table
Although not explicitly shown in the examples, an empty table is another important case. The scalar subquery returns NULL naturally because no second salary exists. This behavior aligns perfectly with the problem requirements.
All Employees Share the Same Salary
When every employee earns the same amount:
[500, 500, 500]
there is still only one distinct salary value. A naive solution that counts rows rather than unique values would fail. The DISTINCT keyword guarantees correctness here.