LeetCode 3057 - Employees Project Allocation
This problem asks us to identify employees whose assigned project workload is greater than the average workload of employees within their own team. We are given two database tables: The Project table stores information about project assignments.
Difficulty: 🔴 Hard
Topics: Database
Solution
Problem Understanding
This problem asks us to identify employees whose assigned project workload is greater than the average workload of employees within their own team.
We are given two database tables:
The Project table stores information about project assignments. Each row tells us:
- which project an employee is working on,
- which employee is assigned,
- and the workload value associated with that assignment.
The Employees table stores employee metadata, specifically:
- the employee's name,
- and the team the employee belongs to.
The goal is to compare every employee's workload against the average workload of employees in the same team.
To do that correctly, we must first associate each project row with its employee's team. Since the team information exists only in the Employees table, we need a join between the two tables using employee_id.
After grouping employees by team, we compute the average workload for each team. Then we keep only those project assignments where:
employee workload > average workload of that employee's team
The output must include:
employee_idproject_idemployee_nameproject_workload
The results must be sorted by:
employee_idproject_id
both in ascending order.
An important detail is that the comparison is strictly greater than the team average. If an employee's workload equals the average exactly, they must not be included.
The constraints are not explicitly listed, but this is a database problem intended for SQL optimization. That means we should assume the dataset may be large enough that repeated recalculation of averages would be inefficient. Efficient aggregation and joins are therefore important.
Several edge cases are important:
- Teams containing only one employee. In that case, the employee's workload equals the team average, so they should not appear.
- Multiple employees sharing the same team average.
- Employees whose workload is exactly equal to the team average.
- Teams with many employees and multiple projects.
- Ordering correctness when multiple rows satisfy the condition.
Approaches
Brute Force Approach
A brute force solution would process every project assignment individually.
For each employee-project row, we could:
- Find the employee's team.
- Scan all employees in that same team.
- Compute the average workload for that team.
- Compare the current workload against the computed average.
- Include the row if it exceeds the average.
This approach is correct because every row independently computes the exact team average before making the comparison.
However, it is inefficient because the same team averages are recalculated repeatedly. If a team has many employees, the algorithm repeatedly scans identical data for every row belonging to that team.
This repeated aggregation creates unnecessary work and does not scale well.
Optimal Approach
The key observation is that the average workload for a team is identical for every employee in that team.
Instead of recomputing it repeatedly, we should compute each team's average workload once, store the result, and reuse it.
In SQL, this is naturally handled using either:
- a Common Table Expression (CTE),
- a subquery,
- or a window function.
The optimal solution works in two phases:
- Compute average workload grouped by team.
- Join that result back to the original rows and filter workloads greater than the team average.
This avoids repeated scans and makes the query efficient and scalable.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Recomputes team averages repeatedly |
| Optimal | O(n) | O(t) | Computes each team average once, where t is number of teams |
Algorithm Walkthrough
- Join the
ProjectandEmployeestables usingemployee_id.
This step associates every project assignment with the employee's team and name. Without this join, we cannot determine which team average should be used.
2. Group rows by team and compute the average workload.
For each team, calculate:
$$\text{average workload} = \frac{\text{sum of workloads}}{\text{number of employees}}$$
This produces one average value per team. 3. Store these averages in a temporary result set.
A Common Table Expression named something like team_avg is ideal because it keeps the query readable and avoids recomputation.
4. Join the original project rows with the computed team averages.
Each employee-project row now has access to:
- employee name,
- team,
- workload,
- and the average workload of the team.
- Filter rows where:
$$\text{workload} > \text{team average}$$
Only employees exceeding their team's average remain. 6. Select the required columns.
The output should include:
employee_idproject_idemployee_nameproject_workload
- Sort the result.
Order by:
employee_id ASCproject_id ASC
Why it works
The correctness comes from the fact that every employee belongs to exactly one team, and the team average is computed exactly once using all workloads from that team.
After joining the average back to each employee row, every comparison uses the correct team-wide average. Since we filter only rows whose workload is strictly greater than that value, the final result precisely matches the problem definition.
Python Solution
Although LeetCode Database problems are solved in SQL, the following Python implementation demonstrates the same logic programmatically.
from collections import defaultdict
from typing import List, Dict
class Solution:
def employeesProjectAllocation(
self,
project: List[Dict],
employees: List[Dict]
) -> List[Dict]:
employee_info = {}
for employee in employees:
employee_info[employee["employee_id"]] = {
"name": employee["name"],
"team": employee["team"]
}
team_total = defaultdict(int)
team_count = defaultdict(int)
for row in project:
employee_id = row["employee_id"]
workload = row["workload"]
team = employee_info[employee_id]["team"]
team_total[team] += workload
team_count[team] += 1
team_average = {}
for team in team_total:
team_average[team] = team_total[team] / team_count[team]
result = []
for row in project:
employee_id = row["employee_id"]
project_id = row["project_id"]
workload = row["workload"]
employee_name = employee_info[employee_id]["name"]
team = employee_info[employee_id]["team"]
if workload > team_average[team]:
result.append({
"employee_id": employee_id,
"project_id": project_id,
"employee_name": employee_name,
"project_workload": workload
})
result.sort(key=lambda x: (x["employee_id"], x["project_id"]))
return result
After building a mapping from employee_id to employee metadata, the implementation computes total workload and employee counts for every team.
Using these totals and counts, it derives the average workload for each team.
The algorithm then performs a second pass through all project assignments. For every row, it checks whether the workload exceeds the precomputed team average. If it does, the row is added to the final result.
Finally, the result list is sorted according to the required ordering.
Go Solution
package main
import (
"sort"
)
type ProjectRow struct {
ProjectID int
EmployeeID int
Workload int
}
type EmployeeRow struct {
EmployeeID int
Name string
Team string
}
type ResultRow struct {
EmployeeID int
ProjectID int
EmployeeName string
ProjectWorkload int
}
func employeesProjectAllocation(
project []ProjectRow,
employees []EmployeeRow,
) []ResultRow {
type EmployeeInfo struct {
Name string
Team string
}
employeeInfo := make(map[int]EmployeeInfo)
for _, employee := range employees {
employeeInfo[employee.EmployeeID] = EmployeeInfo{
Name: employee.Name,
Team: employee.Team,
}
}
teamTotal := make(map[string]int)
teamCount := make(map[string]int)
for _, row := range project {
info := employeeInfo[row.EmployeeID]
teamTotal[info.Team] += row.Workload
teamCount[info.Team]++
}
teamAverage := make(map[string]float64)
for team := range teamTotal {
teamAverage[team] = float64(teamTotal[team]) / float64(teamCount[team])
}
result := []ResultRow{}
for _, row := range project {
info := employeeInfo[row.EmployeeID]
if float64(row.Workload) > teamAverage[info.Team] {
result = append(result, ResultRow{
EmployeeID: row.EmployeeID,
ProjectID: row.ProjectID,
EmployeeName: info.Name,
ProjectWorkload: row.Workload,
})
}
}
sort.Slice(result, func(i, j int) bool {
if result[i].EmployeeID == result[j].EmployeeID {
return result[i].ProjectID < result[j].ProjectID
}
return result[i].EmployeeID < result[j].EmployeeID
})
return result
}
The Go implementation follows the same logic as the Python version.
One important Go-specific detail is the explicit conversion to float64 during average calculation and comparison. Without conversion, integer division would truncate decimals and produce incorrect averages.
Go also requires explicit struct definitions for table rows and result rows, whereas Python can conveniently use dictionaries.
Worked Examples
Example 1
Input:
Project:
(1, 1, 45)
(1, 2, 90)
(2, 3, 12)
(2, 4, 68)
Employees:
(1, Khaled, A)
(2, Ali, B)
(3, John, B)
(4, Doe, A)
Step 1: Build Employee Mapping
| employee_id | name | team |
|---|---|---|
| 1 | Khaled | A |
| 2 | Ali | B |
| 3 | John | B |
| 4 | Doe | A |
Step 2: Compute Team Totals
| Team | Total Workload | Count |
|---|---|---|
| A | 45 + 68 = 113 | 2 |
| B | 90 + 12 = 102 | 2 |
Step 3: Compute Team Averages
| Team | Average |
|---|---|
| A | 56.5 |
| B | 51.0 |
Step 4: Compare Each Employee
| Employee | Team | Workload | Team Avg | Included |
|---|---|---|---|---|
| Khaled | A | 45 | 56.5 | No |
| Ali | B | 90 | 51.0 | Yes |
| John | B | 12 | 51.0 | No |
| Doe | A | 68 | 56.5 | Yes |
Final Output
| employee_id | project_id | employee_name | project_workload |
|---|---|---|---|
| 2 | 1 | Ali | 90 |
| 4 | 2 | Doe | 68 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each row is processed a constant number of times |
| Space | O(t + n) | Stores team averages and output rows |
The algorithm performs several linear scans through the input data:
- one pass to build employee mappings,
- one pass to compute team totals,
- one pass to filter results.
Each operation inside the loops is constant time due to hash map lookups.
The auxiliary space comes from:
- employee metadata storage,
- team aggregation maps,
- and the final output list.
Test Cases
from collections import defaultdict
class Solution:
def employeesProjectAllocation(self, project, employees):
employee_info = {}
for employee in employees:
employee_info[employee["employee_id"]] = {
"name": employee["name"],
"team": employee["team"]
}
team_total = defaultdict(int)
team_count = defaultdict(int)
for row in project:
team = employee_info[row["employee_id"]]["team"]
team_total[team] += row["workload"]
team_count[team] += 1
team_average = {}
for team in team_total:
team_average[team] = team_total[team] / team_count[team]
result = []
for row in project:
employee_id = row["employee_id"]
team = employee_info[employee_id]["team"]
if row["workload"] > team_average[team]:
result.append({
"employee_id": employee_id,
"project_id": row["project_id"],
"employee_name": employee_info[employee_id]["name"],
"project_workload": row["workload"]
})
result.sort(key=lambda x: (x["employee_id"], x["project_id"]))
return result
solution = Solution()
# Example test case
project = [
{"project_id": 1, "employee_id": 1, "workload": 45},
{"project_id": 1, "employee_id": 2, "workload": 90},
{"project_id": 2, "employee_id": 3, "workload": 12},
{"project_id": 2, "employee_id": 4, "workload": 68},
]
employees = [
{"employee_id": 1, "name": "Khaled", "team": "A"},
{"employee_id": 2, "name": "Ali", "team": "B"},
{"employee_id": 3, "name": "John", "team": "B"},
{"employee_id": 4, "name": "Doe", "team": "A"},
]
assert solution.employeesProjectAllocation(project, employees) == [
{
"employee_id": 2,
"project_id": 1,
"employee_name": "Ali",
"project_workload": 90
},
{
"employee_id": 4,
"project_id": 2,
"employee_name": "Doe",
"project_workload": 68
}
] # Provided example
# Single employee team
project = [
{"project_id": 1, "employee_id": 1, "workload": 50},
]
employees = [
{"employee_id": 1, "name": "Alice", "team": "X"},
]
assert solution.employeesProjectAllocation(project, employees) == [] # Average equals workload
# Equal workloads in same team
project = [
{"project_id": 1, "employee_id": 1, "workload": 40},
{"project_id": 2, "employee_id": 2, "workload": 40},
]
employees = [
{"employee_id": 1, "name": "A", "team": "T"},
{"employee_id": 2, "name": "B", "team": "T"},
]
assert solution.employeesProjectAllocation(project, employees) == [] # No workload exceeds average
# One employee exceeds average
project = [
{"project_id": 1, "employee_id": 1, "workload": 10},
{"project_id": 2, "employee_id": 2, "workload": 90},
]
employees = [
{"employee_id": 1, "name": "Low", "team": "A"},
{"employee_id": 2, "name": "High", "team": "A"},
]
assert solution.employeesProjectAllocation(project, employees) == [
{
"employee_id": 2,
"project_id": 2,
"employee_name": "High",
"project_workload": 90
}
] # High workload employee included
# Multiple teams
project = [
{"project_id": 1, "employee_id": 1, "workload": 20},
{"project_id": 2, "employee_id": 2, "workload": 80},
{"project_id": 3, "employee_id": 3, "workload": 50},
]
employees = [
{"employee_id": 1, "name": "A", "team": "X"},
{"employee_id": 2, "name": "B", "team": "X"},
{"employee_id": 3, "name": "C", "team": "Y"},
]
assert solution.employeesProjectAllocation(project, employees) == [
{
"employee_id": 2,
"project_id": 2,
"employee_name": "B",
"project_workload": 80
}
] # Only team X has above-average employee
| Test | Why |
|---|---|
| Provided example | Validates standard functionality |
| Single employee team | Ensures equality with average is excluded |
| Equal workloads | Verifies strict greater-than comparison |
| One high workload employee | Tests proper average filtering |
| Multiple teams | Confirms independent team averages |
Edge Cases
Team With Only One Employee
If a team contains exactly one employee, that employee's workload is also the team average.
A common bug is accidentally using >= instead of >, which would incorrectly include the employee. The implementation avoids this by using a strict comparison.
Multiple Employees Sharing the Same Workload
When all employees in a team have identical workloads, the average equals every employee's workload.
In this case, no employee should appear in the output. The algorithm handles this naturally because none of the workloads exceed the average.
Decimal Team Averages
Team averages may be fractional values such as 56.5.
An implementation using integer division could incorrectly truncate the average and produce wrong comparisons. Both the Python and Go implementations correctly use floating-point division to preserve decimal precision.