LeetCode 1875 - Group Employees of the Same Salary

This is a SQL database problem where we need to group employees into teams based on salary. The important detail is that a team is defined entirely by salary, meaning every employee in a team must have exactly the same salary, and all employees with the same salary must belong…

LeetCode Problem 1875

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

This is a SQL database problem where we need to group employees into teams based on salary. The important detail is that a team is defined entirely by salary, meaning every employee in a team must have exactly the same salary, and all employees with the same salary must belong to the same team.

However, not every salary forms a valid team. A salary group is only considered a team if at least two employees share that salary. If a salary appears exactly once, that employee is excluded from the output entirely.

After identifying valid teams, we must assign a team_id. This ID is not arbitrary. It depends on the ranking of salaries among valid teams only. The team with the smallest salary gets team_id = 1, the second smallest valid salary gets team_id = 2, and so on. Employees with unique salaries do not participate in the ranking process.

The input consists of a single table, Employees, containing:

  • employee_id, a unique identifier for each employee
  • name, the employee's name
  • salary, the employee's salary

The output should include only employees who belong to a valid team, along with:

  • employee_id
  • name
  • salary
  • team_id

The result must be sorted by team_id in ascending order, and then by employee_id in ascending order.

The main challenge is correctly implementing salary grouping and ranking. A naive implementation may incorrectly rank all salaries, including unique salaries, which would produce incorrect team_id values. Another common mistake is assigning different teams to employees who share the same salary, violating the requirement that all employees with the same salary belong to the same team.

An important guarantee in the problem is that employee_id is unique, which means we never have duplicate employee records. Salaries can repeat arbitrarily, and multiple groups may exist.

Approaches

Brute Force Approach

A brute-force approach would examine every employee and compare their salary against every other employee to determine whether a matching salary exists.

For each employee, we could scan the entire table to count how many employees share the same salary. If the count is at least two, the employee belongs to a valid team. After identifying all valid salaries, we would sort them and assign rankings manually. Finally, we would scan the employee list again to assign the corresponding team_id.

This approach works because it explicitly checks all relationships between employees and salaries. However, it performs repeated scans of the table, making it inefficient. If there are n employees, checking salary matches for each employee takes O(n²) time.

Optimal Approach

The key insight is that this problem is fundamentally a grouping problem.

Instead of repeatedly scanning the table, we can group employees by salary once using SQL aggregation. By applying GROUP BY salary, we can count how many employees belong to each salary group.

Only salary groups with COUNT(*) >= 2 qualify as teams. Once valid salary groups are identified, we can rank them using a window function. Since the problem specifies that team IDs are determined by salary order, DENSE_RANK() over salaries in ascending order gives us exactly what we need.

Finally, we join the ranked salary groups back to the original Employees table to retrieve employee information.

This approach is efficient because grouping and ranking are performed once, avoiding repeated scans.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Repeatedly scans employees to count matching salaries
Optimal O(n log n) O(n) Uses grouping and ranking with SQL aggregation

Algorithm Walkthrough

  1. Group employees by salary and count how many employees share each salary. This tells us which salary groups are valid teams.
  2. Filter out salary groups with only one employee. Since teams require at least two members, these salaries should not participate in team creation or ranking.
  3. Rank the remaining salaries in ascending order using DENSE_RANK(). We use DENSE_RANK() because team IDs must be consecutive integers without gaps.
  4. Join the ranked salary groups back with the Employees table. This allows us to retrieve employee details for all team members.
  5. Sort the final result by team_id and then employee_id, as required by the problem statement.

Why it works

The algorithm works because every salary is processed exactly once through grouping. The GROUP BY salary operation guarantees that all employees with the same salary remain together. Filtering with COUNT(*) >= 2 ensures only valid teams are included. Finally, DENSE_RANK() assigns consecutive team IDs based only on valid salary groups, which exactly matches the problem definition.

Python Solution

Although this is a database problem and LeetCode expects SQL, the following Python implementation demonstrates the equivalent logic programmatically.

from typing import List, Dict

class Solution:
    def groupEmployees(self, employees: List[Dict]) -> List[Dict]:
        salary_count = {}

        # Count occurrences of each salary
        for employee in employees:
            salary = employee["salary"]
            salary_count[salary] = salary_count.get(salary, 0) + 1

        # Keep only valid team salaries
        valid_salaries = sorted(
            salary
            for salary, count in salary_count.items()
            if count >= 2
        )

        # Assign team IDs
        salary_to_team = {
            salary: idx + 1
            for idx, salary in enumerate(valid_salaries)
        }

        # Build result
        result = []

        for employee in employees:
            salary = employee["salary"]

            if salary in salary_to_team:
                result.append({
                    "employee_id": employee["employee_id"],
                    "name": employee["name"],
                    "salary": salary,
                    "team_id": salary_to_team[salary]
                })

        # Required ordering
        result.sort(key=lambda x: (x["team_id"], x["employee_id"]))

        return result

The implementation starts by counting how many times each salary appears. This mirrors SQL's GROUP BY salary with COUNT(*).

Next, salaries that appear at least twice are extracted and sorted. Sorting is necessary because team_id depends on salary ranking from lowest to highest.

A mapping from salary to team ID is then created. This allows constant-time lookup when assigning teams to employees.

Finally, the algorithm iterates through employees and includes only those whose salary belongs to a valid team. The result is sorted according to the problem requirements.

SQL Solution

Since LeetCode expects a SQL query, this is the actual submittable answer:

WITH team_salaries AS (
    SELECT
        salary,
        DENSE_RANK() OVER (ORDER BY salary) AS team_id
    FROM Employees
    GROUP BY salary
    HAVING COUNT(*) >= 2
)

SELECT
    e.employee_id,
    e.name,
    e.salary,
    t.team_id
FROM Employees e
JOIN team_salaries t
    ON e.salary = t.salary
ORDER BY t.team_id, e.employee_id;

This solution first computes valid salary groups and ranks them. The final join ensures every employee with a team salary receives the correct team_id.

Go Solution

The following Go implementation demonstrates equivalent programmatic logic.

package main

import (
	"sort"
)

type Employee struct {
	EmployeeID int
	Name       string
	Salary     int
}

type Result struct {
	EmployeeID int
	Name       string
	Salary     int
	TeamID     int
}

func groupEmployees(employees []Employee) []Result {
	salaryCount := make(map[int]int)

	// Count salary frequency
	for _, employee := range employees {
		salaryCount[employee.Salary]++
	}

	// Collect valid salaries
	var validSalaries []int
	for salary, count := range salaryCount {
		if count >= 2 {
			validSalaries = append(validSalaries, salary)
		}
	}

	sort.Ints(validSalaries)

	// Assign team IDs
	salaryToTeam := make(map[int]int)
	for idx, salary := range validSalaries {
		salaryToTeam[salary] = idx + 1
	}

	// Build result
	var result []Result

	for _, employee := range employees {
		teamID, exists := salaryToTeam[employee.Salary]
		if exists {
			result = append(result, Result{
				EmployeeID: employee.EmployeeID,
				Name:       employee.Name,
				Salary:     employee.Salary,
				TeamID:     teamID,
			})
		}
	}

	// Required ordering
	sort.Slice(result, func(i, j int) bool {
		if result[i].TeamID == result[j].TeamID {
			return result[i].EmployeeID < result[j].EmployeeID
		}
		return result[i].TeamID < result[j].TeamID
	})

	return result
}

The Go implementation closely mirrors the Python solution. A map[int]int stores salary frequencies, and another map stores the salary-to-team relationship.

Because Go maps are unordered, salaries must be explicitly sorted before assigning team IDs. The final sorting step uses sort.Slice to satisfy the required output ordering.

Worked Examples

Example 1

Input:

employee_id name salary
2 Meir 3000
3 Michael 3000
7 Addilyn 7400
8 Juan 6100
9 Kannon 7400

First, count salary occurrences:

Salary Count
3000 2
6100 1
7400 2

Next, remove unique salaries:

Valid Salary
3000
7400

Assign team IDs based on salary ranking:

Salary Team ID
3000 1
7400 2

Now build the output:

employee_id name salary team_id
2 Meir 3000 1
3 Michael 3000 1
7 Addilyn 7400 2
9 Kannon 7400 2

Employee 8 is excluded because salary 6100 appears only once.

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Counting salaries is O(n), sorting valid salaries dominates
Space O(n) Hash maps and result storage require additional memory

The dominant cost comes from sorting valid salaries. In the worst case, every salary appears at least twice, so sorting takes O(n log n). All hash map operations are constant time on average.

For the SQL solution, database optimizers typically implement grouping and sorting efficiently using indexes or internal sorting mechanisms.

Test Cases

def group_employees(employees):
    salary_count = {}

    for employee in employees:
        salary = employee["salary"]
        salary_count[salary] = salary_count.get(salary, 0) + 1

    valid_salaries = sorted(
        salary
        for salary, count in salary_count.items()
        if count >= 2
    )

    salary_to_team = {
        salary: idx + 1
        for idx, salary in enumerate(valid_salaries)
    }

    result = []

    for employee in employees:
        if employee["salary"] in salary_to_team:
            result.append({
                "employee_id": employee["employee_id"],
                "name": employee["name"],
                "salary": employee["salary"],
                "team_id": salary_to_team[employee["salary"]]
            })

    result.sort(key=lambda x: (x["team_id"], x["employee_id"]))
    return result

# Provided example
assert group_employees([
    {"employee_id": 2, "name": "Meir", "salary": 3000},
    {"employee_id": 3, "name": "Michael", "salary": 3000},
    {"employee_id": 7, "name": "Addilyn", "salary": 7400},
    {"employee_id": 8, "name": "Juan", "salary": 6100},
    {"employee_id": 9, "name": "Kannon", "salary": 7400},
]) == [
    {"employee_id": 2, "name": "Meir", "salary": 3000, "team_id": 1},
    {"employee_id": 3, "name": "Michael", "salary": 3000, "team_id": 1},
    {"employee_id": 7, "name": "Addilyn", "salary": 7400, "team_id": 2},
    {"employee_id": 9, "name": "Kannon", "salary": 7400, "team_id": 2},
]  # Example case

assert group_employees([
    {"employee_id": 1, "name": "A", "salary": 5000},
]) == []  # Single employee, no team

assert group_employees([
    {"employee_id": 1, "name": "A", "salary": 1000},
    {"employee_id": 2, "name": "B", "salary": 2000},
    {"employee_id": 3, "name": "C", "salary": 3000},
]) == []  # All unique salaries

assert group_employees([
    {"employee_id": 1, "name": "A", "salary": 5000},
    {"employee_id": 2, "name": "B", "salary": 5000},
    {"employee_id": 3, "name": "C", "salary": 5000},
]) == [
    {"employee_id": 1, "name": "A", "salary": 5000, "team_id": 1},
    {"employee_id": 2, "name": "B", "salary": 5000, "team_id": 1},
    {"employee_id": 3, "name": "C", "salary": 5000, "team_id": 1},
]  # All employees same salary

assert group_employees([
    {"employee_id": 1, "name": "A", "salary": 4000},
    {"employee_id": 2, "name": "B", "salary": 4000},
    {"employee_id": 3, "name": "C", "salary": 2000},
    {"employee_id": 4, "name": "D", "salary": 2000},
]) == [
    {"employee_id": 3, "name": "C", "salary": 2000, "team_id": 1},
    {"employee_id": 4, "name": "D", "salary": 2000, "team_id": 1},
    {"employee_id": 1, "name": "A", "salary": 4000, "team_id": 2},
    {"employee_id": 2, "name": "B", "salary": 4000, "team_id": 2},
]  # Ranking by salary
Test Why
Provided example Verifies normal grouping and ranking
Single employee Ensures no team is created
All unique salaries Validates exclusion logic
All same salary Ensures one valid team is formed
Multiple duplicate salaries Verifies salary ranking and ordering

Edge Cases

One important edge case occurs when every employee has a unique salary. A careless implementation might still assign team IDs to each employee. However, the problem explicitly states that unique salaries do not form teams. The implementation handles this by filtering salary groups using count >= 2.

Another edge case appears when multiple employees share exactly one salary. Since there is only one valid salary group, every employee should receive team_id = 1. The algorithm correctly assigns rankings after sorting valid salaries, ensuring the smallest valid salary always starts at one.

A subtle edge case involves salaries that are unique but numerically between valid team salaries. For example, salaries [3000, 5000, 7000] where only 3000 and 7000 repeat. An incorrect implementation might rank 7000 as team 3 because 5000 exists. The correct behavior ignores unique salaries entirely, which is why ranking occurs only after filtering valid salary groups.