LeetCode 3057 - Employees Project Allocation

This problem asks us to identify employees whose assigned project workload is greater than the average workload of employees within their own team. We are given two database tables: The Project table stores information about project assignments.

LeetCode Problem 3057

Difficulty: 🔴 Hard
Topics: Database

Solution

Problem Understanding

This problem asks us to identify employees whose assigned project workload is greater than the average workload of employees within their own team.

We are given two database tables:

The Project table stores information about project assignments. Each row tells us:

  • which project an employee is working on,
  • which employee is assigned,
  • and the workload value associated with that assignment.

The Employees table stores employee metadata, specifically:

  • the employee's name,
  • and the team the employee belongs to.

The goal is to compare every employee's workload against the average workload of employees in the same team.

To do that correctly, we must first associate each project row with its employee's team. Since the team information exists only in the Employees table, we need a join between the two tables using employee_id.

After grouping employees by team, we compute the average workload for each team. Then we keep only those project assignments where:

employee workload > average workload of that employee's team

The output must include:

  • employee_id
  • project_id
  • employee_name
  • project_workload

The results must be sorted by:

  1. employee_id
  2. project_id

both in ascending order.

An important detail is that the comparison is strictly greater than the team average. If an employee's workload equals the average exactly, they must not be included.

The constraints are not explicitly listed, but this is a database problem intended for SQL optimization. That means we should assume the dataset may be large enough that repeated recalculation of averages would be inefficient. Efficient aggregation and joins are therefore important.

Several edge cases are important:

  • Teams containing only one employee. In that case, the employee's workload equals the team average, so they should not appear.
  • Multiple employees sharing the same team average.
  • Employees whose workload is exactly equal to the team average.
  • Teams with many employees and multiple projects.
  • Ordering correctness when multiple rows satisfy the condition.

Approaches

Brute Force Approach

A brute force solution would process every project assignment individually.

For each employee-project row, we could:

  1. Find the employee's team.
  2. Scan all employees in that same team.
  3. Compute the average workload for that team.
  4. Compare the current workload against the computed average.
  5. Include the row if it exceeds the average.

This approach is correct because every row independently computes the exact team average before making the comparison.

However, it is inefficient because the same team averages are recalculated repeatedly. If a team has many employees, the algorithm repeatedly scans identical data for every row belonging to that team.

This repeated aggregation creates unnecessary work and does not scale well.

Optimal Approach

The key observation is that the average workload for a team is identical for every employee in that team.

Instead of recomputing it repeatedly, we should compute each team's average workload once, store the result, and reuse it.

In SQL, this is naturally handled using either:

  • a Common Table Expression (CTE),
  • a subquery,
  • or a window function.

The optimal solution works in two phases:

  1. Compute average workload grouped by team.
  2. Join that result back to the original rows and filter workloads greater than the team average.

This avoids repeated scans and makes the query efficient and scalable.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Recomputes team averages repeatedly
Optimal O(n) O(t) Computes each team average once, where t is number of teams

Algorithm Walkthrough

  1. Join the Project and Employees tables using employee_id.

This step associates every project assignment with the employee's team and name. Without this join, we cannot determine which team average should be used. 2. Group rows by team and compute the average workload.

For each team, calculate:

$$\text{average workload} = \frac{\text{sum of workloads}}{\text{number of employees}}$$

This produces one average value per team. 3. Store these averages in a temporary result set.

A Common Table Expression named something like team_avg is ideal because it keeps the query readable and avoids recomputation. 4. Join the original project rows with the computed team averages.

Each employee-project row now has access to:

  • employee name,
  • team,
  • workload,
  • and the average workload of the team.
  1. Filter rows where:

$$\text{workload} > \text{team average}$$

Only employees exceeding their team's average remain. 6. Select the required columns.

The output should include:

  • employee_id
  • project_id
  • employee_name
  • project_workload
  1. Sort the result.

Order by:

  • employee_id ASC
  • project_id ASC

Why it works

The correctness comes from the fact that every employee belongs to exactly one team, and the team average is computed exactly once using all workloads from that team.

After joining the average back to each employee row, every comparison uses the correct team-wide average. Since we filter only rows whose workload is strictly greater than that value, the final result precisely matches the problem definition.

Python Solution

Although LeetCode Database problems are solved in SQL, the following Python implementation demonstrates the same logic programmatically.

from collections import defaultdict
from typing import List, Dict

class Solution:
    def employeesProjectAllocation(
        self,
        project: List[Dict],
        employees: List[Dict]
    ) -> List[Dict]:

        employee_info = {}

        for employee in employees:
            employee_info[employee["employee_id"]] = {
                "name": employee["name"],
                "team": employee["team"]
            }

        team_total = defaultdict(int)
        team_count = defaultdict(int)

        for row in project:
            employee_id = row["employee_id"]
            workload = row["workload"]

            team = employee_info[employee_id]["team"]

            team_total[team] += workload
            team_count[team] += 1

        team_average = {}

        for team in team_total:
            team_average[team] = team_total[team] / team_count[team]

        result = []

        for row in project:
            employee_id = row["employee_id"]
            project_id = row["project_id"]
            workload = row["workload"]

            employee_name = employee_info[employee_id]["name"]
            team = employee_info[employee_id]["team"]

            if workload > team_average[team]:
                result.append({
                    "employee_id": employee_id,
                    "project_id": project_id,
                    "employee_name": employee_name,
                    "project_workload": workload
                })

        result.sort(key=lambda x: (x["employee_id"], x["project_id"]))

        return result

After building a mapping from employee_id to employee metadata, the implementation computes total workload and employee counts for every team.

Using these totals and counts, it derives the average workload for each team.

The algorithm then performs a second pass through all project assignments. For every row, it checks whether the workload exceeds the precomputed team average. If it does, the row is added to the final result.

Finally, the result list is sorted according to the required ordering.

Go Solution

package main

import (
	"sort"
)

type ProjectRow struct {
	ProjectID int
	EmployeeID int
	Workload int
}

type EmployeeRow struct {
	EmployeeID int
	Name string
	Team string
}

type ResultRow struct {
	EmployeeID int
	ProjectID int
	EmployeeName string
	ProjectWorkload int
}

func employeesProjectAllocation(
	project []ProjectRow,
	employees []EmployeeRow,
) []ResultRow {

	type EmployeeInfo struct {
		Name string
		Team string
	}

	employeeInfo := make(map[int]EmployeeInfo)

	for _, employee := range employees {
		employeeInfo[employee.EmployeeID] = EmployeeInfo{
			Name: employee.Name,
			Team: employee.Team,
		}
	}

	teamTotal := make(map[string]int)
	teamCount := make(map[string]int)

	for _, row := range project {
		info := employeeInfo[row.EmployeeID]

		teamTotal[info.Team] += row.Workload
		teamCount[info.Team]++
	}

	teamAverage := make(map[string]float64)

	for team := range teamTotal {
		teamAverage[team] = float64(teamTotal[team]) / float64(teamCount[team])
	}

	result := []ResultRow{}

	for _, row := range project {
		info := employeeInfo[row.EmployeeID]

		if float64(row.Workload) > teamAverage[info.Team] {
			result = append(result, ResultRow{
				EmployeeID: row.EmployeeID,
				ProjectID: row.ProjectID,
				EmployeeName: info.Name,
				ProjectWorkload: row.Workload,
			})
		}
	}

	sort.Slice(result, func(i, j int) bool {
		if result[i].EmployeeID == result[j].EmployeeID {
			return result[i].ProjectID < result[j].ProjectID
		}
		return result[i].EmployeeID < result[j].EmployeeID
	})

	return result
}

The Go implementation follows the same logic as the Python version.

One important Go-specific detail is the explicit conversion to float64 during average calculation and comparison. Without conversion, integer division would truncate decimals and produce incorrect averages.

Go also requires explicit struct definitions for table rows and result rows, whereas Python can conveniently use dictionaries.

Worked Examples

Example 1

Input:

Project:
(1, 1, 45)
(1, 2, 90)
(2, 3, 12)
(2, 4, 68)

Employees:
(1, Khaled, A)
(2, Ali, B)
(3, John, B)
(4, Doe, A)

Step 1: Build Employee Mapping

employee_id name team
1 Khaled A
2 Ali B
3 John B
4 Doe A

Step 2: Compute Team Totals

Team Total Workload Count
A 45 + 68 = 113 2
B 90 + 12 = 102 2

Step 3: Compute Team Averages

Team Average
A 56.5
B 51.0

Step 4: Compare Each Employee

Employee Team Workload Team Avg Included
Khaled A 45 56.5 No
Ali B 90 51.0 Yes
John B 12 51.0 No
Doe A 68 56.5 Yes

Final Output

employee_id project_id employee_name project_workload
2 1 Ali 90
4 2 Doe 68

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each row is processed a constant number of times
Space O(t + n) Stores team averages and output rows

The algorithm performs several linear scans through the input data:

  • one pass to build employee mappings,
  • one pass to compute team totals,
  • one pass to filter results.

Each operation inside the loops is constant time due to hash map lookups.

The auxiliary space comes from:

  • employee metadata storage,
  • team aggregation maps,
  • and the final output list.

Test Cases

from collections import defaultdict

class Solution:
    def employeesProjectAllocation(self, project, employees):
        employee_info = {}

        for employee in employees:
            employee_info[employee["employee_id"]] = {
                "name": employee["name"],
                "team": employee["team"]
            }

        team_total = defaultdict(int)
        team_count = defaultdict(int)

        for row in project:
            team = employee_info[row["employee_id"]]["team"]
            team_total[team] += row["workload"]
            team_count[team] += 1

        team_average = {}

        for team in team_total:
            team_average[team] = team_total[team] / team_count[team]

        result = []

        for row in project:
            employee_id = row["employee_id"]
            team = employee_info[employee_id]["team"]

            if row["workload"] > team_average[team]:
                result.append({
                    "employee_id": employee_id,
                    "project_id": row["project_id"],
                    "employee_name": employee_info[employee_id]["name"],
                    "project_workload": row["workload"]
                })

        result.sort(key=lambda x: (x["employee_id"], x["project_id"]))

        return result

solution = Solution()

# Example test case
project = [
    {"project_id": 1, "employee_id": 1, "workload": 45},
    {"project_id": 1, "employee_id": 2, "workload": 90},
    {"project_id": 2, "employee_id": 3, "workload": 12},
    {"project_id": 2, "employee_id": 4, "workload": 68},
]

employees = [
    {"employee_id": 1, "name": "Khaled", "team": "A"},
    {"employee_id": 2, "name": "Ali", "team": "B"},
    {"employee_id": 3, "name": "John", "team": "B"},
    {"employee_id": 4, "name": "Doe", "team": "A"},
]

assert solution.employeesProjectAllocation(project, employees) == [
    {
        "employee_id": 2,
        "project_id": 1,
        "employee_name": "Ali",
        "project_workload": 90
    },
    {
        "employee_id": 4,
        "project_id": 2,
        "employee_name": "Doe",
        "project_workload": 68
    }
]  # Provided example

# Single employee team
project = [
    {"project_id": 1, "employee_id": 1, "workload": 50},
]

employees = [
    {"employee_id": 1, "name": "Alice", "team": "X"},
]

assert solution.employeesProjectAllocation(project, employees) == []  # Average equals workload

# Equal workloads in same team
project = [
    {"project_id": 1, "employee_id": 1, "workload": 40},
    {"project_id": 2, "employee_id": 2, "workload": 40},
]

employees = [
    {"employee_id": 1, "name": "A", "team": "T"},
    {"employee_id": 2, "name": "B", "team": "T"},
]

assert solution.employeesProjectAllocation(project, employees) == []  # No workload exceeds average

# One employee exceeds average
project = [
    {"project_id": 1, "employee_id": 1, "workload": 10},
    {"project_id": 2, "employee_id": 2, "workload": 90},
]

employees = [
    {"employee_id": 1, "name": "Low", "team": "A"},
    {"employee_id": 2, "name": "High", "team": "A"},
]

assert solution.employeesProjectAllocation(project, employees) == [
    {
        "employee_id": 2,
        "project_id": 2,
        "employee_name": "High",
        "project_workload": 90
    }
]  # High workload employee included

# Multiple teams
project = [
    {"project_id": 1, "employee_id": 1, "workload": 20},
    {"project_id": 2, "employee_id": 2, "workload": 80},
    {"project_id": 3, "employee_id": 3, "workload": 50},
]

employees = [
    {"employee_id": 1, "name": "A", "team": "X"},
    {"employee_id": 2, "name": "B", "team": "X"},
    {"employee_id": 3, "name": "C", "team": "Y"},
]

assert solution.employeesProjectAllocation(project, employees) == [
    {
        "employee_id": 2,
        "project_id": 2,
        "employee_name": "B",
        "project_workload": 80
    }
]  # Only team X has above-average employee
Test Why
Provided example Validates standard functionality
Single employee team Ensures equality with average is excluded
Equal workloads Verifies strict greater-than comparison
One high workload employee Tests proper average filtering
Multiple teams Confirms independent team averages

Edge Cases

Team With Only One Employee

If a team contains exactly one employee, that employee's workload is also the team average.

A common bug is accidentally using >= instead of >, which would incorrectly include the employee. The implementation avoids this by using a strict comparison.

Multiple Employees Sharing the Same Workload

When all employees in a team have identical workloads, the average equals every employee's workload.

In this case, no employee should appear in the output. The algorithm handles this naturally because none of the workloads exceed the average.

Decimal Team Averages

Team averages may be fractional values such as 56.5.

An implementation using integer division could incorrectly truncate the average and produce wrong comparisons. Both the Python and Go implementations correctly use floating-point division to preserve decimal precision.