LeetCode 2668 - Find Latest Salaries

This problem gives us a database table named Salary that stores employee salary records. Each employee may appear multiple times because older salary records are still kept in the table.

LeetCode Problem 2668

Difficulty: 🟢 Easy
Topics: Database

Solution

LeetCode 2668 - Find Latest Salaries

Problem Understanding

This problem gives us a database table named Salary that stores employee salary records. Each employee may appear multiple times because older salary records are still kept in the table. The key observation is that salaries are guaranteed to increase every year, which means the largest salary value for an employee must represent their most recent salary.

The table contains the following columns:

Column Meaning
emp_id Unique employee identifier
firstname Employee first name
lastname Employee last name
salary Employee salary
department_id Department identifier

The primary key is (emp_id, salary), meaning the same employee cannot have duplicate salary values.

The goal is to return exactly one row per employee, specifically the row containing that employee's latest salary. Since salaries only increase over time, the latest salary is simply the maximum salary associated with each emp_id.

The output must include:

  • emp_id
  • firstname
  • lastname
  • salary
  • department_id

The result must also be sorted by emp_id in ascending order.

The important detail is that the salary column is stored as a varchar, not an integer. In SQL, this can sometimes create issues because string comparison is lexicographical rather than numeric. For example:

  • "90000" is greater than "79632"
  • but "100000" may compare incorrectly against smaller-length strings if not converted properly

Therefore, a robust solution should treat salaries numerically when finding the maximum value.

Several edge cases are important to consider:

  • Employees with only one salary record
  • Employees with multiple salary records
  • Salaries stored as strings
  • Large salary values
  • Employees belonging to different departments

The problem guarantees that salaries strictly increase over time, so the maximum salary uniquely identifies the latest record.

Approaches

Brute Force Approach

A brute-force solution would examine every employee independently and scan the entire table to find their highest salary.

The algorithm would work like this:

  1. Collect all distinct employee IDs.
  2. For each employee ID, scan every row in the table.
  3. Keep track of the maximum salary encountered for that employee.
  4. Output the row containing that maximum salary.

This approach is correct because every employee's latest salary is the largest salary value associated with that employee. By checking all rows for every employee, we guarantee that we eventually find the maximum.

However, this solution is inefficient because the table is repeatedly scanned. If there are n rows and k employees, the complexity becomes O(n * k). In the worst case, where almost every row belongs to a different employee, this approaches O(n²).

Optimal Approach

The key insight is that SQL databases are already optimized for grouping and aggregation operations.

Instead of repeatedly scanning the table, we can:

  1. Group rows by emp_id
  2. Compute the maximum salary for each employee
  3. Join the result back to the original table to retrieve the remaining columns

This approach only processes the data once during aggregation, making it significantly more efficient and much cleaner.

Because the latest salary is always the maximum salary, the aggregation directly identifies the correct row for each employee.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Repeatedly scans the table for every employee
Optimal O(n) O(n) Uses grouping and aggregation to find maximum salaries efficiently

Algorithm Walkthrough

Optimal SQL Algorithm

  1. First, group all rows by emp_id.

Grouping allows us to process each employee independently. Every employee may have multiple salary records, but we only want the latest one. 2. For each employee group, compute the maximum salary.

Since salaries increase over time, the highest salary must correspond to the newest record. 3. Convert salary values to numeric form if necessary.

The salary column is stored as a string (varchar). Numeric comparison ensures correct ordering. 4. Store the result as a temporary derived table.

This table contains:

  • emp_id
  • max_salary
  1. Join the derived table back with the original Salary table.

This allows us to retrieve the full row containing:

  • first name
  • last name
  • department ID
  • salary
  1. Match rows using both:
  • emp_id
  • maximum salary
  1. Sort the final output by emp_id ascending.

Why it works

The algorithm relies on the guarantee that employee salaries always increase over time. Therefore, the latest salary is always the maximum salary for that employee. Grouping by employee and selecting the maximum salary uniquely identifies the desired row. Joining back to the original table restores the remaining employee details.

Python Solution

Although this is a Database problem and the expected submission is SQL, the following Python implementation demonstrates the same logic programmatically.

from typing import List, Dict

class Solution:
    def findLatestSalaries(self, salary_records: List[Dict]) -> List[Dict]:
        latest_salary = {}

        for record in salary_records:
            emp_id = record["emp_id"]
            salary = int(record["salary"])

            if (
                emp_id not in latest_salary
                or salary > int(latest_salary[emp_id]["salary"])
            ):
                latest_salary[emp_id] = record

        result = list(latest_salary.values())
        result.sort(key=lambda record: record["emp_id"])

        return result

The implementation uses a dictionary keyed by emp_id.

As we iterate through each record:

  • We convert the salary to an integer
  • We compare it against the current stored salary
  • If the new salary is larger, we replace the stored record

At the end:

  • The dictionary contains the latest salary record for every employee
  • We convert the dictionary values into a list
  • We sort the result by emp_id

This mirrors the same logic used in the SQL aggregation solution.

Go Solution

package main

import (
	"sort"
	"strconv"
)

type SalaryRecord struct {
	EmpID        int
	Firstname    string
	Lastname     string
	Salary       string
	DepartmentID string
}

func findLatestSalaries(records []SalaryRecord) []SalaryRecord {
	latestSalary := make(map[int]SalaryRecord)

	for _, record := range records {
		currentSalary, _ := strconv.Atoi(record.Salary)

		existing, exists := latestSalary[record.EmpID]

		if !exists {
			latestSalary[record.EmpID] = record
			continue
		}

		existingSalary, _ := strconv.Atoi(existing.Salary)

		if currentSalary > existingSalary {
			latestSalary[record.EmpID] = record
		}
	}

	result := make([]SalaryRecord, 0, len(latestSalary))

	for _, record := range latestSalary {
		result = append(result, record)
	}

	sort.Slice(result, func(i, j int) bool {
		return result[i].EmpID < result[j].EmpID
	})

	return result
}

The Go implementation follows the same overall strategy as the Python version.

Some Go-specific details include:

  • Using a map[int]SalaryRecord for employee lookup
  • Using strconv.Atoi() to convert salary strings into integers
  • Using sort.Slice() to sort the final result
  • Explicit struct definitions for record storage

Go requires more explicit type handling than Python, but the underlying algorithm remains identical.

SQL Solution

SELECT 
    s.emp_id,
    s.firstname,
    s.lastname,
    s.salary,
    s.department_id
FROM Salary s
JOIN (
    SELECT 
        emp_id,
        MAX(CAST(salary AS UNSIGNED)) AS max_salary
    FROM Salary
    GROUP BY emp_id
) latest
ON s.emp_id = latest.emp_id
AND CAST(s.salary AS UNSIGNED) = latest.max_salary
ORDER BY s.emp_id;

This solution first computes the maximum salary for every employee and then joins the result back to the original table to retrieve the corresponding employee information.

The CAST(... AS UNSIGNED) ensures salaries are compared numerically rather than lexicographically.

Worked Examples

Example 1

Input table:

emp_id firstname lastname salary department_id
1 Todd Wilson 110000 D1006
1 Todd Wilson 106119 D1006
2 Justin Simon 128922 D1005
2 Justin Simon 130000 D1005
3 Kelly Rosario 42689 D1002
4 Patricia Powell 162825 D1004
4 Patricia Powell 170000 D1004
5 Sherry Golden 44101 D1002
6 Natasha Swanson 79632 D1005
6 Natasha Swanson 90000 D1005

Step 1, Group by Employee

emp_id Salaries
1 110000, 106119
2 128922, 130000
3 42689
4 162825, 170000
5 44101
6 79632, 90000

Step 2, Find Maximum Salary

emp_id max_salary
1 110000
2 130000
3 42689
4 170000
5 44101
6 90000

Step 3, Join Back to Original Table

Matching rows:

emp_id firstname lastname salary department_id
1 Todd Wilson 110000 D1006
2 Justin Simon 130000 D1005
3 Kelly Rosario 42689 D1002
4 Patricia Powell 170000 D1004
5 Sherry Golden 44101 D1002
6 Natasha Swanson 90000 D1005

Step 4, Sort by Employee ID

Final output:

emp_id firstname lastname salary department_id
1 Todd Wilson 110000 D1006
2 Justin Simon 130000 D1005
3 Kelly Rosario 42689 D1002
4 Patricia Powell 170000 D1004
5 Sherry Golden 44101 D1002
6 Natasha Swanson 90000 D1005

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each row is processed once during aggregation
Space O(n) The grouped result stores one entry per employee

The SQL aggregation scans the table once to compute maximum salaries. The join operation is also efficient because databases optimize joins using indexes and internal execution plans. The auxiliary storage contains at most one row per employee.

Test Cases

def find_latest_salaries(records):
    latest_salary = {}

    for record in records:
        emp_id = record["emp_id"]
        salary = int(record["salary"])

        if (
            emp_id not in latest_salary
            or salary > int(latest_salary[emp_id]["salary"])
        ):
            latest_salary[emp_id] = record

    result = list(latest_salary.values())
    result.sort(key=lambda x: x["emp_id"])

    return result

# Example case from problem statement
records = [
    {"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "110000", "department_id": "D1006"},
    {"emp_id": 1, "firstname": "Todd", "lastname": "Wilson", "salary": "106119", "department_id": "D1006"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "110000"  # chooses maximum salary

# Single employee with one record
records = [
    {"emp_id": 3, "firstname": "Kelly", "lastname": "Rosario", "salary": "42689", "department_id": "D1002"},
]

result = find_latest_salaries(records)

assert len(result) == 1  # single row remains unchanged
assert result[0]["salary"] == "42689"

# Multiple employees
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
    {"emp_id": 2, "firstname": "C", "lastname": "D", "salary": "70000", "department_id": "D2"},
]

result = find_latest_salaries(records)

assert len(result) == 2  # both employees included

# Salary comparison must be numeric
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "100000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "99999", "department_id": "D1"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "100000"  # numeric comparison works correctly

# Larger salary appears later
records = [
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "40000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "A", "lastname": "B", "salary": "90000", "department_id": "D1"},
]

result = find_latest_salaries(records)

assert result[0]["salary"] == "90000"  # latest salary selected

# Output ordering by emp_id
records = [
    {"emp_id": 5, "firstname": "A", "lastname": "B", "salary": "50000", "department_id": "D1"},
    {"emp_id": 1, "firstname": "C", "lastname": "D", "salary": "60000", "department_id": "D2"},
]

result = find_latest_salaries(records)

assert result[0]["emp_id"] == 1  # sorted ascending
assert result[1]["emp_id"] == 5

Test Summary

Test Why
Example input Validates standard functionality
Single record employee Ensures lone entries are handled correctly
Multiple employees Confirms grouping logic
Numeric salary comparison Prevents lexicographical comparison bugs
Increasing salary sequence Ensures maximum salary is selected
Sorted output Verifies ordering requirement

Edge Cases

Employees With Only One Salary Record

Some employees appear exactly once in the table. A naive implementation might incorrectly assume every employee has multiple rows and accidentally exclude single-record employees.

This implementation handles the case naturally because the first encountered record becomes the current maximum salary for that employee.

Salary Stored as String

The salary column is a varchar, which creates a subtle but important issue. String comparison behaves differently from numeric comparison.

For example:

  • "90000" > "79632" works correctly
  • but "100000" compared lexicographically can produce incorrect results

The implementation avoids this issue by explicitly converting salary values to integers before comparison.

Records Not Sorted by Time

The rows are not guaranteed to appear in chronological order. A naive solution that assumes the last row is the newest would fail.

This implementation does not rely on input ordering. Instead, it always computes the maximum salary numerically, guaranteeing correctness regardless of row order.

Multiple Employees Sharing the Same Salary

Different employees may have identical salary amounts. A flawed grouping strategy could accidentally merge employees incorrectly.

The implementation groups strictly by emp_id, ensuring every employee is processed independently even if salary values overlap.