LeetCode 2668 - Find Latest Salaries
This problem gives us a database table named Salary that stores employee salary records. Each employee may appear multiple times because older salary records are still kept in the table.
Difficulty: 🟢 Easy
Topics: Database
Solution
LeetCode 2668 - Find Latest Salaries
Problem Understanding
This problem gives us a database table named Salary that stores employee salary records. Each employee may appear multiple times because older salary records are still kept in the table. The key observation is that salaries are guaranteed to increase every year, which means the largest salary value for an employee must represent their most recent salary.
The table contains the following columns:
| Column | Meaning |
|---|---|
emp_id |
Unique employee identifier |
firstname |
Employee first name |
lastname |
Employee last name |
salary |
Employee salary |
department_id |
Department identifier |
The primary key is (emp_id, salary), meaning the same employee cannot have duplicate salary values.
The goal is to return exactly one row per employee, specifically the row containing that employee's latest salary. Since salaries only increase over time, the latest salary is simply the maximum salary associated with each emp_id.
The output must include:
emp_idfirstnamelastnamesalarydepartment_id
The result must also be sorted by emp_id in ascending order.
The important detail is that the salary column is stored as a varchar, not an integer. In SQL, this can sometimes create issues because string comparison is lexicographical rather than numeric. For example:
"90000"is greater than"79632"- but
"100000"may compare incorrectly against smaller-length strings if not converted properly
Therefore, a robust solution should treat salaries numerically when finding the maximum value.
Several edge cases are important to consider:
- Employees with only one salary record
- Employees with multiple salary records
- Salaries stored as strings
- Large salary values
- Employees belonging to different departments
The problem guarantees that salaries strictly increase over time, so the maximum salary uniquely identifies the latest record.
Approaches
Brute Force Approach
A brute-force solution would examine every employee independently and scan the entire table to find their highest salary.
The algorithm would work like this:
- Collect all distinct employee IDs.
- For each employee ID, scan every row in the table.
- Keep track of the maximum salary encountered for that employee.
- Output the row containing that maximum salary.
This approach is correct because every employee's latest salary is the largest salary value associated with that employee. By checking all rows for every employee, we guarantee that we eventually find the maximum.
However, this solution is inefficient because the table is repeatedly scanned. If there are n rows and k employees, the complexity becomes O(n * k). In the worst case, where almost every row belongs to a different employee, this approaches O(n²).
Optimal Approach
The key insight is that SQL databases are already optimized for grouping and aggregation operations.
Instead of repeatedly scanning the table, we can:
- Group rows by
emp_id - Compute the maximum salary for each employee
- Join the result back to the original table to retrieve the remaining columns
This approach only processes the data once during aggregation, making it significantly more efficient and much cleaner.
Because the latest salary is always the maximum salary, the aggregation directly identifies the correct row for each employee.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Repeatedly scans the table for every employee |
| Optimal | O(n) | O(n) | Uses grouping and aggregation to find maximum salaries efficiently |
Algorithm Walkthrough
Optimal SQL Algorithm
- First, group all rows by
emp_id.
Grouping allows us to process each employee independently. Every employee may have multiple salary records, but we only want the latest one. 2. For each employee group, compute the maximum salary.
Since salaries increase over time, the highest salary must correspond to the newest record. 3. Convert salary values to numeric form if necessary.
The salary column is stored as a string (varchar). Numeric comparison ensures correct ordering.
4. Store the result as a temporary derived table.
This table contains:
emp_idmax_salary
- Join the derived table back with the original
Salarytable.
This allows us to retrieve the full row containing:
- first name
- last name
- department ID
- salary
- Match rows using both:
emp_id- maximum salary
- Sort the final output by
emp_idascending.
Why it works
The algorithm relies on the guarantee that employee salaries always increase over time. Therefore, the latest salary is always the maximum salary for that employee. Grouping by employee and selecting the maximum salary uniquely identifies the desired row. Joining back to the original table restores the remaining employee details.
Python Solution
Although this is a Database problem and the expected submission is SQL, the following Python implementation demonstrates the same logic programmatically.
from typing import List, Dict
class Solution:
def findLatestSalaries(self, salary_records: List[Dict]) -> List[Dict]:
latest_salary = {}
for record in salary_records:
emp_id = record["emp_id"]
salary = int(record["salary"])
if (
emp_id not in latest_salary
or salary > int(latest_salary[emp_id]["salary"])
):
latest_salary[emp_id] = record
result = list(latest_salary.values())
result.sort(key=lambda record: record["emp_id"])
return result
The implementation uses a dictionary keyed by emp_id.
As we iterate through each record:
- We convert the salary to an integer
- We compare it against the current stored salary
- If the new salary is larger, we replace the stored record
At the end:
- The dictionary contains the latest salary record for every employee
- We convert the dictionary values into a list
- We sort the result by
emp_id
This mirrors the same logic used in the SQL aggregation solution.
Go Solution
package main
import (
"sort"
"strconv"
)
type SalaryRecord struct {
EmpID int
Firstname string
Lastname string
Salary string
DepartmentID string
}
func findLatestSalaries(records []SalaryRecord) []SalaryRecord {
latestSalary := make(map[int]SalaryRecord)
for _, record := range records {
currentSalary, _ := strconv.Atoi(record.Salary)
existing, exists := latestSalary[record.EmpID]
if !exists {
latestSalary[record.EmpID] = record
continue
}
existingSalary, _ := strconv.Atoi(existing.Salary)
if currentSalary > existingSalary {
latestSalary[record.EmpID] = record
}
}
result := make([]SalaryRecord, 0, len(latestSalary))
for _, record := range latestSalary {
result = append(result, record)
}
sort.Slice(result, func(i, j int) bool {
return result[i].EmpID < result[j].EmpID
})
return result
}
The Go implementation follows the same overall strategy as the Python version.
Some Go-specific details include:
- Using a
map[int]SalaryRecordfor employee lookup - Using
strconv.Atoi()to convert salary strings into integers - Using
sort.Slice()to sort the final result - Explicit struct definitions for record storage
Go requires more explicit type handling than Python, but the underlying algorithm remains identical.
SQL Solution
SELECT
s.emp_id,
s.firstname,
s.lastname,
s.salary,
s.department_id
FROM Salary s
JOIN (
SELECT
emp_id,
MAX(CAST(salary AS UNSIGNED)) AS max_salary
FROM Salary
GROUP BY emp_id
) latest
ON s.emp_id = latest.emp_id
AND CAST(s.salary AS UNSIGNED) = latest.max_salary
ORDER BY s.emp_id;
This solution first computes the maximum salary for every employee and then joins the result back to the original table to retrieve the corresponding employee information.
The CAST(... AS UNSIGNED) ensures salaries are compared numerically rather than lexicographically.
Worked Examples
Example 1
Input table:
| emp_id | firstname | lastname | salary | department_id |
|---|---|---|---|---|
| 1 | Todd | Wilson | 110000 | D1006 |
| 1 | Todd | Wilson | 106119 | D1006 |
| 2 | Justin | Simon | 128922 | D1005 |
| 2 | Justin | Simon | 130000 | D1005 |
| 3 | Kelly | Rosario | 42689 | D1002 |
| 4 | Patricia | Powell | 162825 | D1004 |
| 4 | Patricia | Powell | 170000 | D1004 |
| 5 | Sherry | Golden | 44101 | D1002 |
| 6 | Natasha | Swanson | 79632 | D1005 |
| 6 | Natasha | Swanson | 90000 | D1005 |
Step 1, Group by Employee
| emp_id | Salaries |
|---|---|
| 1 | 110000, 106119 |
| 2 | 128922, 130000 |
| 3 | 42689 |
| 4 | 162825, 170000 |
| 5 | 44101 |
| 6 | 79632, 90000 |
Step 2, Find Maximum Salary
| emp_id | max_salary |
|---|---|
| 1 | 110000 |
| 2 | 130000 |
| 3 | 42689 |
| 4 | 170000 |
| 5 | 44101 |
| 6 | 90000 |
Step 3, Join Back to Original Table
Matching rows:
| emp_id | firstname | lastname | salary | department_id |
|---|---|---|---|---|
| 1 | Todd | Wilson | 110000 | D1006 |
| 2 | Justin | Simon | 130000 | D1005 |
| 3 | Kelly | Rosario | 42689 | D1002 |
| 4 | Patricia | Powell | 170000 | D1004 |
| 5 | Sherry | Golden | 44101 | D1002 |
| 6 | Natasha | Swanson | 90000 | D1005 |
Step 4, Sort by Employee ID
Final output:
| emp_id | firstname | lastname | salary | department_id |
|---|---|---|---|---|
| 1 | Todd | Wilson | 110000 | D1006 |
| 2 | Justin | Simon | 130000 | D1005 |
| 3 | Kelly | Rosario | 42689 | D1002 |
| 4 | Patricia | Powell | 170000 | D1004 |
| 5 | Sherry | Golden | 44101 | D1002 |
| 6 | Natasha | Swanson | 90000 | D1005 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each row is processed once during aggregation |
| Space | O(n) | The grouped result stores one entry per employee |
The SQL aggregation scans the table once to compute maximum salaries. The join operation is also efficient because databases optimize joins using indexes and internal execution plans. The auxiliary storage contains at most one row per employee.
Test Cases
def find_latest_salaries(records):
latest_salary = {}
for record in records:
emp_id = record["emp_id"]
salary = int(record["salary"])
if (
emp_id not in latest_salary
or salary > int(latest_salary[emp_id]["salary"])
):
latest_salary[emp_id] = record
result = list(latest_salary.values())
result.sort(key=lambda x: x["emp_id"])
return result
# Example case from problem statement
records = [
{"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "110000", "department_id": "D1006"},
{"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "106119", "department_id": "D1006"},
]
result = find_latest_salaries(records)
assert result[0]["salary"] == "110000" # chooses maximum salary
# Single employee with one record
records = [
{"emp_id": 3, "firstname": "Kelly", "lastname": "Rosario", "salary": "42689", "department_id": "D1002"},
]
result = find_latest_salaries(records)
assert len(result) == 1 # single row remains unchanged
assert result[0]["salary"] == "42689"
# Multiple employees
records = [
{"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
{"emp_id": 2, "firstname": "C", "lastname": "D", "salary": "70000", "department_id": "D2"},
]
result = find_latest_salaries(records)
assert len(result) == 2 # both employees included
# Salary comparison must be numeric
records = [
{"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "100000", "department_id": "D1"},
{"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "99999", "department_id": "D1"},
]
result = find_latest_salaries(records)
assert result[0]["salary"] == "100000" # numeric comparison works correctly
# Larger salary appears later
records = [
{"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "40000", "department_id": "D1"},
{"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "90000", "department_id": "D1"},
]
result = find_latest_salaries(records)
assert result[0]["salary"] == "90000" # latest salary selected
# Output ordering by emp_id
records = [
{"emp_id": 5, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
{"emp_id": 1, "firstname": "C", "lastname": "D", "salary": "60000", "department_id": "D2"},
]
result = find_latest_salaries(records)
assert result[0]["emp_id"] == 1 # sorted ascending
assert result[1]["emp_id"] == 5
Test Summary
| Test | Why |
|---|---|
| Example input | Validates standard functionality |
| Single record employee | Ensures lone entries are handled correctly |
| Multiple employees | Confirms grouping logic |
| Numeric salary comparison | Prevents lexicographical comparison bugs |
| Increasing salary sequence | Ensures maximum salary is selected |
| Sorted output | Verifies ordering requirement |
Edge Cases
Employees With Only One Salary Record
Some employees appear exactly once in the table. A naive implementation might incorrectly assume every employee has multiple rows and accidentally exclude single-record employees.
This implementation handles the case naturally because the first encountered record becomes the current maximum salary for that employee.
Salary Stored as String
The salary column is a varchar, which creates a subtle but important issue. String comparison behaves differently from numeric comparison.
For example:
"90000">"79632"works correctly- but
"100000"compared lexicographically can produce incorrect results
The implementation avoids this issue by explicitly converting salary values to integers before comparison.
Records Not Sorted by Time
The rows are not guaranteed to appear in chronological order. A naive solution that assumes the last row is the newest would fail.
This implementation does not rely on input ordering. Instead, it always computes the maximum salary numerically, guaranteeing correctness regardless of row order.
Multiple Employees Sharing the Same Salary
Different employees may have identical salary amounts. A flawed grouping strategy could accidentally merge employees incorrectly.
The implementation groups strictly by emp_id, ensuring every employee is processed independently even if salary values overlap.