Difficulty: 🟢 Easy
Topics: Database

Solution

LeetCode 2668 - Find Latest Salaries

Problem Understanding

This problem gives us a database table named Salary that stores employee salary records. Each employee may appear multiple times because older salary records are still kept in the table. The key observation is that salaries are guaranteed to increase every year, which means the largest salary value for an employee must represent their most recent salary.

The table contains the following columns:

Column	Meaning
`emp_id`	Unique employee identifier
`firstname`	Employee first name
`lastname`	Employee last name
`salary`	Employee salary
`department_id`	Department identifier

The primary key is (emp_id, salary), meaning the same employee cannot have duplicate salary values.

The goal is to return exactly one row per employee, specifically the row containing that employee's latest salary. Since salaries only increase over time, the latest salary is simply the maximum salary associated with each emp_id.

The output must include:

emp_id
firstname
lastname
salary
department_id

The result must also be sorted by emp_id in ascending order.

The important detail is that the salary column is stored as a varchar, not an integer. In SQL, this can sometimes create issues because string comparison is lexicographical rather than numeric. For example:

"90000" is greater than "79632"
but "100000" may compare incorrectly against smaller-length strings if not converted properly

Therefore, a robust solution should treat salaries numerically when finding the maximum value.

Several edge cases are important to consider:

Employees with only one salary record
Employees with multiple salary records
Salaries stored as strings
Large salary values
Employees belonging to different departments

The problem guarantees that salaries strictly increase over time, so the maximum salary uniquely identifies the latest record.

Approaches

Brute Force Approach

A brute-force solution would examine every employee independently and scan the entire table to find their highest salary.

The algorithm would work like this:

Collect all distinct employee IDs.
For each employee ID, scan every row in the table.
Keep track of the maximum salary encountered for that employee.
Output the row containing that maximum salary.

This approach is correct because every employee's latest salary is the largest salary value associated with that employee. By checking all rows for every employee, we guarantee that we eventually find the maximum.

However, this solution is inefficient because the table is repeatedly scanned. If there are n rows and k employees, the complexity becomes O(n * k). In the worst case, where almost every row belongs to a different employee, this approaches O(n²).

Optimal Approach

The key insight is that SQL databases are already optimized for grouping and aggregation operations.

Instead of repeatedly scanning the table, we can:

Group rows by emp_id
Compute the maximum salary for each employee
Join the result back to the original table to retrieve the remaining columns

This approach only processes the data once during aggregation, making it significantly more efficient and much cleaner.

Because the latest salary is always the maximum salary, the aggregation directly identifies the correct row for each employee.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(1)	Repeatedly scans the table for every employee
Optimal	O(n)	O(n)	Uses grouping and aggregation to find maximum salaries efficiently

Algorithm Walkthrough

Optimal SQL Algorithm

First, group all rows by emp_id.

Grouping allows us to process each employee independently. Every employee may have multiple salary records, but we only want the latest one. 2. For each employee group, compute the maximum salary.

Since salaries increase over time, the highest salary must correspond to the newest record. 3. Convert salary values to numeric form if necessary.

The salary column is stored as a string (varchar). Numeric comparison ensures correct ordering. 4. Store the result as a temporary derived table.

This table contains:

emp_id
max_salary

Join the derived table back with the original Salary table.

This allows us to retrieve the full row containing:

first name
last name
department ID
salary

Match rows using both:

emp_id
maximum salary

Sort the final output by emp_id ascending.

Why it works

The algorithm relies on the guarantee that employee salaries always increase over time. Therefore, the latest salary is always the maximum salary for that employee. Grouping by employee and selecting the maximum salary uniquely identifies the desired row. Joining back to the original table restores the remaining employee details.

Python Solution

Although this is a Database problem and the expected submission is SQL, the following Python implementation demonstrates the same logic programmatically.

from typing import List, Dict

class Solution:
    def findLatestSalaries(self, salary_records: List[Dict]) -> List[Dict]:
        latest_salary = {}

        for record in salary_records:
            emp_id = record["emp_id"]
            salary = int(record["salary"])

            if (
                emp_id not in latest_salary
                or salary > int(latest_salary[emp_id]["salary"])
            ):
                latest_salary[emp_id] = record

        result = list(latest_salary.values())
        result.sort(key=lambda record: record["emp_id"])

        return result

The implementation uses a dictionary keyed by emp_id.

As we iterate through each record:

We convert the salary to an integer
We compare it against the current stored salary
If the new salary is larger, we replace the stored record

At the end:

The dictionary contains the latest salary record for every employee
We convert the dictionary values into a list
We sort the result by emp_id

This mirrors the same logic used in the SQL aggregation solution.

Go Solution

package main

import (
	"sort"
	"strconv"
)

type SalaryRecord struct {
	EmpID        int
	Firstname    string
	Lastname     string
	Salary       string
	DepartmentID string
}

func findLatestSalaries(records []SalaryRecord) []SalaryRecord {
	latestSalary := make(map[int]SalaryRecord)

	for _, record := range records {
		currentSalary, _ := strconv.Atoi(record.Salary)

		existing, exists := latestSalary[record.EmpID]

		if !exists {
			latestSalary[record.EmpID] = record
			continue
		}

		existingSalary, _ := strconv.Atoi(existing.Salary)

		if currentSalary > existingSalary {
			latestSalary[record.EmpID] = record
		}
	}

	result := make([]SalaryRecord, 0, len(latestSalary))

	for _, record := range latestSalary {
		result = append(result, record)
	}

	sort.Slice(result, func(i, j int) bool {
		return result[i].EmpID < result[j].EmpID
	})

	return result
}

The Go implementation follows the same overall strategy as the Python version.

Some Go-specific details include:

Using a map[int]SalaryRecord for employee lookup
Using strconv.Atoi() to convert salary strings into integers
Using sort.Slice() to sort the final result
Explicit struct definitions for record storage

Go requires more explicit type handling than Python, but the underlying algorithm remains identical.

SQL Solution

SELECT 
    s.emp_id,
    s.firstname,
    s.lastname,
    s.salary,
    s.department_id
FROM Salary s
JOIN (
    SELECT 
        emp_id,
        MAX(CAST(salary AS UNSIGNED)) AS max_salary
    FROM Salary
    GROUP BY emp_id
) latest
ON s.emp_id = latest.emp_id
AND CAST(s.salary AS UNSIGNED) = latest.max_salary
ORDER BY s.emp_id;

This solution first computes the maximum salary for every employee and then joins the result back to the original table to retrieve the corresponding employee information.

The CAST(... AS UNSIGNED) ensures salaries are compared numerically rather than lexicographically.

Worked Examples

Example 1

Input table:

emp_id	firstname	lastname	salary	department_id
1	Todd	Wilson	110000	D1006
1	Todd	Wilson	106119	D1006
2	Justin	Simon	128922	D1005
2	Justin	Simon	130000	D1005
3	Kelly	Rosario	42689	D1002
4	Patricia	Powell	162825	D1004
4	Patricia	Powell	170000	D1004
5	Sherry	Golden	44101	D1002
6	Natasha	Swanson	79632	D1005
6	Natasha	Swanson	90000	D1005

Step 1, Group by Employee

emp_id	Salaries
1	110000, 106119
2	128922, 130000
3	42689
4	162825, 170000
5	44101
6	79632, 90000

Step 2, Find Maximum Salary

emp_id	max_salary
1	110000
2	130000
3	42689
4	170000
5	44101
6	90000

Step 3, Join Back to Original Table

Matching rows:

emp_id	firstname	lastname	salary	department_id
1	Todd	Wilson	110000	D1006
2	Justin	Simon	130000	D1005
3	Kelly	Rosario	42689	D1002
4	Patricia	Powell	170000	D1004
5	Sherry	Golden	44101	D1002
6	Natasha	Swanson	90000	D1005

Step 4, Sort by Employee ID

Final output:

emp_id	firstname	lastname	salary	department_id
1	Todd	Wilson	110000	D1006
2	Justin	Simon	130000	D1005
3	Kelly	Rosario	42689	D1002
4	Patricia	Powell	170000	D1004
5	Sherry	Golden	44101	D1002
6	Natasha	Swanson	90000	D1005

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each row is processed once during aggregation
Space	O(n)	The grouped result stores one entry per employee

The SQL aggregation scans the table once to compute maximum salaries. The join operation is also efficient because databases optimize joins using indexes and internal execution plans. The auxiliary storage contains at most one row per employee.

Test Cases

def find_latest_salaries(records):
    latest_salary = {}

    for record in records:
        emp_id = record["emp_id"]
        salary = int(record["salary"])

        if (
            emp_id not in latest_salary
            or salary > int(latest_salary[emp_id]["salary"])
        ):
            latest_salary[emp_id] = record

    result = list(latest_salary.values())
    result.sort(key=lambda x: x["emp_id"])

    return result

# Example case from problem statement
records = [
    {"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "110000", "department_id": "D1006"},
    {"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "106119", "department_id": "D1006"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "110000"  # chooses maximum salary

# Single employee with one record
records = [
    {"emp_id": 3, "firstname": "Kelly", "lastname": "Rosario", "salary": "42689", "department_id": "D1002"},
]

result = find_latest_salaries(records)

assert len(result) == 1  # single row remains unchanged
assert result[0]["salary"] == "42689"

# Multiple employees
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
    {"emp_id": 2, "firstname": "C", "lastname": "D", "salary": "70000", "department_id": "D2"},
]

result = find_latest_salaries(records)

assert len(result) == 2  # both employees included

# Salary comparison must be numeric
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "100000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "99999", "department_id": "D1"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "100000"  # numeric comparison works correctly

# Larger salary appears later
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "40000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "90000", "department_id": "D1"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "90000"  # latest salary selected

# Output ordering by emp_id
records = [
    {"emp_id": 5, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "C", "lastname": "D", "salary": "60000", "department_id": "D2"},
]

result = find_latest_salaries(records)

assert result[0]["emp_id"] == 1  # sorted ascending
assert result[1]["emp_id"] == 5

Test Summary

Test	Why
Example input	Validates standard functionality
Single record employee	Ensures lone entries are handled correctly
Multiple employees	Confirms grouping logic
Numeric salary comparison	Prevents lexicographical comparison bugs
Increasing salary sequence	Ensures maximum salary is selected
Sorted output	Verifies ordering requirement

Edge Cases

Employees With Only One Salary Record

Some employees appear exactly once in the table. A naive implementation might incorrectly assume every employee has multiple rows and accidentally exclude single-record employees.

This implementation handles the case naturally because the first encountered record becomes the current maximum salary for that employee.

Salary Stored as String

The salary column is a varchar, which creates a subtle but important issue. String comparison behaves differently from numeric comparison.

For example:

"90000" > "79632" works correctly
but "100000" compared lexicographically can produce incorrect results

The implementation avoids this issue by explicitly converting salary values to integers before comparison.

Records Not Sorted by Time

The rows are not guaranteed to appear in chronological order. A naive solution that assumes the last row is the newest would fail.

This implementation does not rely on input ordering. Instead, it always computes the maximum salary numerically, guaranteeing correctness regardless of row order.

Different employees may have identical salary amounts. A flawed grouping strategy could accidentally merge employees incorrectly.

The implementation groups strictly by emp_id, ensuring every employee is processed independently even if salary values overlap.

LeetCode 2668 - Find Latest Salaries

Solution

LeetCode 2668 - Find Latest Salaries

Problem Understanding

Approaches

Brute Force Approach

Optimal Approach

Approach Comparison

Algorithm Walkthrough

Optimal SQL Algorithm

Why it works

Python Solution

Go Solution

SQL Solution

Worked Examples

Example 1

Step 1, Group by Employee

Step 2, Find Maximum Salary

Step 3, Join Back to Original Table

Step 4, Sort by Employee ID

Complexity Analysis

Test Cases

Test Summary

Edge Cases

Employees With Only One Salary Record

Salary Stored as String

Records Not Sorted by Time

Multiple Employees Sharing the Same Salary