LeetCode 1076 - Project Employees II


Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem asks us to identify the project or projects that have the largest number of employees assigned to them.

We are given two database tables. The Project table stores relationships between projects and employees. Each row means that a particular employee is working on a particular project. The pair (project_id, employee_id) is unique, which guarantees that the same employee cannot appear twice for the same project.

The Employee table contains additional employee information such as name and years of experience. However, for this specific problem, the Employee table is actually irrelevant because we only need to count how many employees belong to each project. We do not need employee names or experience data.

The task is to return every project_id whose employee count is equal to the maximum employee count among all projects.

In other words, the solution requires two major operations:

  1. Count the number of employees assigned to every project.
  2. Find the maximum count and return all projects matching that value.

The result can be returned in any order, so there is no sorting requirement.

Performance is not the main concern for this Easy SQL problem. The important detail is that multiple projects can tie for the highest employee count, so an implementation that returns only one project would be incorrect.

Some important edge cases include situations where:

  • Multiple projects have the same maximum employee count.
  • There is only one project in the table.
  • Every project has exactly one employee.
  • The table contains many projects with varying counts.

The problem guarantees valid relational data and unique (project_id, employee_id) pairs, so we never need to worry about duplicate employee assignments within the same project.

Approaches

Brute Force Approach

A brute force solution would first retrieve all distinct projects. Then, for each project, we would repeatedly scan the Project table to count how many employees belong to that project.

After computing all counts, we would determine the maximum value and collect all matching projects.

This approach is correct because every project is counted independently. However, it is inefficient: with P projects and N rows in the Project table, rescanning the table once per project costs O(P × N) work.
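As a rough illustration of the repeated scanning, here is a minimal Python sketch that treats the Project table as a list of (project_id, employee_id) tuples. The function name `largest_projects_brute_force` and the in-memory representation are assumptions made for illustration only; they are not part of the LeetCode problem.

```python
def largest_projects_brute_force(rows):
    """Return the project ids with the most employees, rescanning rows per project."""
    # Distinct project ids in first-seen order.
    project_ids = list(dict.fromkeys(project_id for project_id, _ in rows))

    # For every project, scan the whole table again: P scans of N rows each.
    counts = {}
    for pid in project_ids:
        counts[pid] = sum(1 for project_id, _ in rows if project_id == pid)

    if not counts:
        return []
    best = max(counts.values())
    return [pid for pid, count in counts.items() if count == best]


rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 4)]
print(largest_projects_brute_force(rows))  # [1]
```

The inner generator expression is the repeated scan that the aggregation-based approach removes.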

Optimal Approach

The key observation is that SQL aggregation already provides an efficient mechanism for counting grouped records.

We can use:

GROUP BY project_id

to compute employee counts for every project in a single pass.

Once we have the counts, we only need to determine which count is the largest. Then we return every project whose count matches that maximum.

This can be done cleanly using a subquery or common table expression.

The optimal solution avoids repeated scans and leverages database aggregation efficiently.
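To make the single-pass idea concrete outside of SQL, the sketch below mimics `GROUP BY project_id` with `collections.Counter`. It is only an analogy for what the database does internally, and the function name `largest_projects_single_pass` is hypothetical.

```python
from collections import Counter


def largest_projects_single_pass(rows):
    """Count rows per project in one pass, then keep the projects at the maximum."""
    # Equivalent in spirit to: SELECT project_id, COUNT(*) ... GROUP BY project_id
    counts = Counter(project_id for project_id, _ in rows)
    if not counts:
        return []
    best = max(counts.values())
    # Sorting is only for a stable display; the problem accepts any order.
    return sorted(pid for pid, count in counts.items() if count == best)


rows = [(1, 1), (1, 2), (2, 3), (3, 4), (3, 5)]
print(largest_projects_single_pass(rows))  # [1, 3]
```

Each row is touched once while counting, and the final filter runs over at most P distinct projects.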

| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(P × N) | O(P) | Repeatedly scans the table for every project |
| Optimal | O(N) | O(P) | Uses SQL aggregation with GROUP BY |

Algorithm Walkthrough

  1. Group all rows in the Project table by project_id.

     This step gathers all employee assignments belonging to the same project together.

  2. Count the number of employees in each group.

     Using COUNT(employee_id) gives the number of employees assigned to each project.

  3. Find the maximum employee count across all projects.

     This tells us the largest project size.

  4. Return all projects whose employee count equals the maximum value.

     Multiple projects may share the same highest count, so we must return every matching project.

Why it works

The algorithm works because every row in the Project table represents exactly one employee assignment. Grouping by project_id partitions the table into separate projects, and counting rows inside each group produces the exact number of employees assigned to that project.

Selecting projects whose counts equal the global maximum guarantees that every returned project has the largest employee count, and no smaller project is included.

MySQL Solution

# Write your MySQL query statement below

WITH project_counts AS (
    SELECT
        project_id,
        COUNT(employee_id) AS employee_count
    FROM Project
    GROUP BY project_id
)

SELECT project_id
FROM project_counts
WHERE employee_count = (
    SELECT MAX(employee_count)
    FROM project_counts
);

The solution begins by creating a common table expression named project_counts. This intermediate result stores each project_id together with its employee count.

The GROUP BY project_id clause ensures that all rows belonging to the same project are aggregated together. The COUNT(employee_id) operation computes how many employees are assigned to that project.

After computing these counts, the outer query selects only the projects whose employee_count equals the maximum value found in the entire grouped dataset.

Using a common table expression improves readability because it avoids repeating the aggregation logic multiple times.
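For contrast, the sketch below shows one way the same result could be written without a CTE, which matters on MySQL versions before 8.0, where `WITH` is not available. The grouped count then has to appear twice. The query is run through Python's built-in `sqlite3` module purely as a convenient local stand-in; SQLite is not MySQL, and the helper name `run_query_without_cte` and the derived-table alias `per_project` are illustrative assumptions.

```python
import sqlite3

# Same logic as the CTE version, but the aggregation is written out twice.
QUERY_WITHOUT_CTE = """
SELECT project_id
FROM Project
GROUP BY project_id
HAVING COUNT(employee_id) = (
    SELECT MAX(employee_count)
    FROM (
        SELECT COUNT(employee_id) AS employee_count
        FROM Project
        GROUP BY project_id
    ) AS per_project
)
"""


def run_query_without_cte(rows):
    """Load rows into an in-memory SQLite table and run the non-CTE query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Project ("
        "project_id INTEGER, employee_id INTEGER, "
        "PRIMARY KEY (project_id, employee_id))"
    )
    conn.executemany("INSERT INTO Project VALUES (?, ?)", rows)
    result = sorted(row[0] for row in conn.execute(QUERY_WITHOUT_CTE))
    conn.close()
    return result


print(run_query_without_cte([(1, 1), (1, 2), (1, 3), (2, 1), (2, 4)]))  # [1]
```

The duplicated `COUNT ... GROUP BY` is exactly the repetition that the `project_counts` CTE removes.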

Go Solution

Since this is a database problem, LeetCode expects a SQL query instead of executable Go code. Therefore, there is no actual Go implementation required.

Worked Examples

Example 1

Input:

Project table:

| project_id | employee_id |
|---|---|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 4 |

Step 1, Group by project

| project_id | Employees |
|---|---|
| 1 | 1, 2, 3 |
| 2 | 1, 4 |

Step 2, Count employees

| project_id | employee_count |
|---|---|
| 1 | 3 |
| 2 | 2 |

Step 3, Find maximum count

The maximum employee count is:

3

Step 4, Return matching projects

| project_id |
|---|
| 1 |

Project 1 is returned because it has the highest employee count.
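The four steps of this example can also be traced in plain Python. This is a sketch that mirrors the intermediate tables above rather than anything the database literally executes; the function name `trace_steps` is hypothetical, and it assumes the input has at least one row.

```python
from collections import defaultdict


def trace_steps(rows):
    """Return the intermediate result of each walkthrough step."""
    # Step 1: group employee ids by project.
    groups = defaultdict(list)
    for project_id, employee_id in rows:
        groups[project_id].append(employee_id)

    # Step 2: count the employees in each group.
    counts = {pid: len(employees) for pid, employees in groups.items()}

    # Step 3: find the maximum count (assumes at least one project).
    best = max(counts.values())

    # Step 4: keep every project whose count equals the maximum.
    result = [pid for pid, count in counts.items() if count == best]
    return dict(groups), counts, best, result


groups, counts, best, result = trace_steps([(1, 1), (1, 2), (1, 3), (2, 1), (2, 4)])
print(groups)  # {1: [1, 2, 3], 2: [1, 4]}
print(counts)  # {1: 3, 2: 2}
print(best)    # 3
print(result)  # [1]
```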

Complexity Analysis

| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N) | Each row is processed once during aggregation |
| Space | O(P) | Stores grouped counts for each project |

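Assuming the database uses hash-based aggregation, it performs a single grouping pass over the Project table. Each row contributes to exactly one grouped count, so the runtime is linear in the number of rows. A sort-based grouping strategy would instead cost O(N log N).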

The additional memory usage depends on the number of distinct projects because the grouped aggregation maintains one entry per project.

Test Cases

# Example 1
project = [
    [1, 1],
    [1, 2],
    [1, 3],
    [2, 1],
    [2, 4]
]
expected = [1]  # Project 1 has the most employees

# Single project
project = [
    [1, 1],
    [1, 2]
]
expected = [1]  # Only one project exists

# Tie between projects
project = [
    [1, 1],
    [1, 2],
    [2, 3],
    [2, 4]
]
expected = [1, 2]  # Both projects have 2 employees

# Every project has one employee
project = [
    [1, 1],
    [2, 2],
    [3, 3]
]
expected = [1, 2, 3]  # All tie with count 1

# Larger project dominates
project = [
    [1, 1],
    [1, 2],
    [1, 3],
    [1, 4],
    [2, 5]
]
expected = [1]  # Project 1 clearly largest

# Multiple projects with varying counts
project = [
    [1, 1],
    [1, 2],
    [2, 3],
    [3, 4],
    [3, 5]
]
expected = [1, 3]  # Projects 1 and 3 tie

| Test | Why |
|---|---|
| Example 1 | Validates the basic scenario |
| Single project | Ensures one project is handled correctly |
| Tie between projects | Verifies multiple answers are returned |
| Every project has one employee | Tests full tie case |
| Larger project dominates | Tests clear maximum detection |
| Multiple projects with varying counts | Verifies mixed distributions |
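One way to exercise these cases locally is to load each input into an in-memory SQLite database and run the query from the solution against it. This harness is a sketch under the assumption that SQLite accepts the same CTE syntax as MySQL for this particular query; it is not how LeetCode runs submissions, and the names `CASES` and `run_query` are made up for illustration. Results are sorted only to make comparison deterministic, since the problem accepts any order.

```python
import sqlite3

QUERY = """
WITH project_counts AS (
    SELECT project_id, COUNT(employee_id) AS employee_count
    FROM Project
    GROUP BY project_id
)
SELECT project_id
FROM project_counts
WHERE employee_count = (SELECT MAX(employee_count) FROM project_counts)
"""

# (rows, expected) pairs copied from the test cases above.
CASES = {
    "Example 1": ([[1, 1], [1, 2], [1, 3], [2, 1], [2, 4]], [1]),
    "Single project": ([[1, 1], [1, 2]], [1]),
    "Tie between projects": ([[1, 1], [1, 2], [2, 3], [2, 4]], [1, 2]),
    "Every project has one employee": ([[1, 1], [2, 2], [3, 3]], [1, 2, 3]),
    "Larger project dominates": ([[1, 1], [1, 2], [1, 3], [1, 4], [2, 5]], [1]),
    "Multiple projects with varying counts": (
        [[1, 1], [1, 2], [2, 3], [3, 4], [3, 5]],
        [1, 3],
    ),
}


def run_query(rows):
    """Create a fresh Project table, insert rows, and return sorted project ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Project ("
        "project_id INTEGER, employee_id INTEGER, "
        "PRIMARY KEY (project_id, employee_id))"
    )
    conn.executemany("INSERT INTO Project VALUES (?, ?)", rows)
    result = sorted(row[0] for row in conn.execute(QUERY))
    conn.close()
    return result


for name, (rows, expected) in CASES.items():
    actual = run_query(rows)
    assert actual == expected, f"{name}: expected {expected}, got {actual}"
print("all cases pass")
```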

Edge Cases

One important edge case occurs when multiple projects share the same maximum employee count. A common mistake is to use LIMIT 1 or otherwise return only one project. The implementation avoids this issue by selecting all projects whose count equals the maximum value.
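The `LIMIT 1` pitfall is easy to reproduce. The sketch below runs a sort-and-limit query on a two-way tie using an in-memory SQLite database as a stand-in for MySQL; the helper name `top_project_with_limit` is hypothetical. Which of the tied projects comes back is not guaranteed, so the example only reports how many rows were returned.

```python
import sqlite3

# Tempting but wrong: sorting by count and keeping one row drops tied projects.
LIMIT_ONE_QUERY = """
SELECT project_id
FROM Project
GROUP BY project_id
ORDER BY COUNT(employee_id) DESC
LIMIT 1
"""


def top_project_with_limit(rows):
    """Run the LIMIT 1 query and return whatever project ids it yields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Project ("
        "project_id INTEGER, employee_id INTEGER, "
        "PRIMARY KEY (project_id, employee_id))"
    )
    conn.executemany("INSERT INTO Project VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(LIMIT_ONE_QUERY)]
    conn.close()
    return result


# Projects 1 and 2 both have two employees, yet only one id is returned.
tied_rows = [(1, 1), (1, 2), (2, 3), (2, 4)]
print(len(top_project_with_limit(tied_rows)))  # 1
```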

Another important case is when there is only a single project in the table. In this scenario, that project is automatically the largest project. The grouping and maximum comparison logic still works correctly because the maximum count is simply the only count present.

A third edge case happens when every project has exactly one employee. In this situation, every project ties for the maximum count. The query correctly returns all projects because every grouped count equals the global maximum.

A final subtle case involves duplicate employee assignments. The problem guarantees that (project_id, employee_id) is unique, so we never need additional deduplication logic such as COUNT(DISTINCT employee_id). The implementation safely relies on the table constraints.
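To see why deduplication is unnecessary, the sketch below declares the uniqueness guarantee as a composite primary key in an in-memory SQLite table (a local stand-in for MySQL) and compares three counting expressions. The helper name `count_variants` is an assumption for illustration.

```python
import sqlite3


def count_variants(rows):
    """Return (project_id, COUNT(*), COUNT(col), COUNT(DISTINCT col)) per project."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Project ("
        "project_id INTEGER, employee_id INTEGER, "
        "PRIMARY KEY (project_id, employee_id))"
    )
    conn.executemany("INSERT INTO Project VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT project_id, COUNT(*), COUNT(employee_id), "
        "COUNT(DISTINCT employee_id) "
        "FROM Project GROUP BY project_id ORDER BY project_id"
    ).fetchall()
    conn.close()
    return result


rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 4)]
print(count_variants(rows))  # [(1, 3, 3, 3), (2, 2, 2, 2)]

# A duplicate (project_id, employee_id) pair is rejected by the key itself.
try:
    count_variants(rows + [(1, 1)])
except sqlite3.IntegrityError:
    print("duplicate assignment rejected")
```

With the key in place, all three counts agree for every project, so the plain `COUNT(employee_id)` in the solution is sufficient.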