LeetCode 1303 - Find the Team Size
This problem provides a database table named Employee, where every row represents a single employee and the team they belong to.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
This problem provides a database table named Employee, where every row represents a single employee and the team they belong to. The table contains two columns:
| Column | Meaning |
|---|---|
employee_id |
Unique identifier for an employee |
team_id |
Identifier of the team the employee belongs to |
The task is to compute the size of each employee's team. In other words, for every employee, we need to determine how many employees share the same team_id.
The output should contain:
| Column | Meaning |
|---|---|
employee_id |
The employee's ID |
team_size |
Number of employees in that employee's team |
The result may be returned in any order, which means we do not need to sort the output unless explicitly required.
The important observation is that the same team_id can appear multiple times, once for each employee in that team. Therefore, the problem essentially asks us to count the frequency of every team_id and then associate that count with each employee.
Because employee_id is guaranteed to be unique, we never need to worry about duplicate employee rows. The problem also guarantees that every employee belongs to exactly one team.
An important edge case is a team with only one employee. In that situation, the team size should be 1. Another important case is when all employees belong to the same team, which means every employee receives the same large count. A naive implementation that repeatedly scans the table for each employee could become inefficient for large datasets.
Since this is a database problem, the intended solution is typically written in SQL using aggregation and joins.
Approaches
Brute Force Approach
The brute force solution checks every employee individually and counts how many employees share the same team_id.
For each row in the table:
- Read the employee's
team_id - Scan the entire table again
- Count how many rows have the same
team_id - Output that count as the employee's team size
This approach is correct because it directly computes the exact number of employees belonging to the same team for every employee.
However, it is inefficient because for every employee we perform another full scan of the table. If there are n employees, then each of the n rows requires an O(n) scan, resulting in O(n²) time complexity.
Optimal Approach
The key insight is that many employees share the same team_id, so repeatedly counting the same team is wasteful.
Instead, we can compute the size of each team once using GROUP BY. After obtaining a mapping from team_id to team size, we join that result back to the original table so every employee receives the corresponding count.
This works efficiently because:
GROUP BY team_idcomputes team sizes once- The join associates each employee with their team's count
- No repeated scanning is necessary
This reduces the work to essentially linear processing of the table.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Repeatedly scans the table for every employee |
| Optimal | O(n) | O(n) | Uses aggregation with GROUP BY and a join |
Algorithm Walkthrough
Optimal SQL Algorithm
- First, group all employees by
team_id.
This step computes how many employees belong to each team. We use SQL's COUNT(*) aggregation function because we want the number of rows per team.
2. Store the grouped result as a temporary derived table.
The derived table contains:
| team_id | team_size |
|---|---|
| 8 | 3 |
| 7 | 1 |
| 9 | 2 |
This prevents recalculating the same counts multiple times.
3. Join the derived table back to the original Employee table.
The join condition is:
Employee.team_id = grouped.team_id
This allows every employee row to retrieve the corresponding team size. 4. Select the required columns.
The final output only needs:
employee_idteam_size
Why it works
The algorithm works because every employee belongs to exactly one team_id, and the grouped table contains exactly one row per team with the correct employee count. Joining on team_id guarantees that every employee receives the correct team size associated with their team.
Python Solution
Although this is a database problem and LeetCode expects SQL, the following Python implementation demonstrates the same logic programmatically.
from typing import List, Dict
from collections import Counter
class Solution:
def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
team_counts: Dict[int, int] = Counter()
# Count employees in each team
for employee_id, team_id in employees:
team_counts[team_id] += 1
# Build result
result = []
for employee_id, team_id in employees:
result.append([employee_id, team_counts[team_id]])
return result
The implementation follows the same logic as the optimal SQL approach.
The first loop counts how many employees belong to each team_id. A Counter is used because it provides a convenient frequency map structure where keys are team IDs and values are team sizes.
After building the frequency map, the second loop iterates through the employees again and appends the corresponding team size for each employee.
The final result contains pairs of:
[employee_id, team_size]
This mirrors the expected SQL output structure.
Go Solution
package main
func findTeamSize(employees [][]int) [][]int {
teamCounts := make(map[int]int)
// Count employees in each team
for _, employee := range employees {
teamID := employee[1]
teamCounts[teamID]++
}
// Build result
result := [][]int{}
for _, employee := range employees {
employeeID := employee[0]
teamID := employee[1]
result = append(result, []int{
employeeID,
teamCounts[teamID],
})
}
return result
}
The Go implementation closely mirrors the Python version. A map[int]int is used to store team frequencies. Go slices are used to construct the result dynamically.
Unlike Python's Counter, Go requires explicit map initialization and manual incrementing. Other than syntax differences, the algorithm remains identical.
Because all values are small integer counts, integer overflow is not a concern in normal problem constraints.
SQL Solution
SELECT
e.employee_id,
t.team_size
FROM Employee e
JOIN (
SELECT
team_id,
COUNT(*) AS team_size
FROM Employee
GROUP BY team_id
) t
ON e.team_id = t.team_id;
This solution first computes the size of every team using GROUP BY. The derived table is then joined back to the original Employee table to associate each employee with their corresponding team size.
Worked Examples
Example 1
Input table:
| employee_id | team_id |
|---|---|
| 1 | 8 |
| 2 | 8 |
| 3 | 8 |
| 4 | 7 |
| 5 | 9 |
| 6 | 9 |
Step 1: Group by team_id
We count employees in each team.
| team_id | count |
|---|---|
| 8 | 3 |
| 7 | 1 |
| 9 | 2 |
This creates the mapping:
{
8: 3,
7: 1,
9: 2
}
Step 2: Assign counts back to employees
| employee_id | team_id | team_size |
|---|---|---|
| 1 | 8 | 3 |
| 2 | 8 | 3 |
| 3 | 8 | 3 |
| 4 | 7 | 1 |
| 5 | 9 | 2 |
| 6 | 9 | 2 |
Final output:
| employee_id | team_size |
|---|---|
| 1 | 3 |
| 2 | 3 |
| 3 | 3 |
| 4 | 1 |
| 5 | 2 |
| 6 | 2 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | One pass to count teams, one pass to build results |
| Space | O(n) | Hash map stores team frequencies |
The algorithm processes each employee a constant number of times. The frequency map contains at most one entry per unique team, which in the worst case could equal the number of employees.
The SQL version is similarly efficient because aggregation and joins are optimized database operations.
Test Cases
from collections import Counter
from typing import List, Dict
class Solution:
def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
team_counts: Dict[int, int] = Counter()
for employee_id, team_id in employees:
team_counts[team_id] += 1
result = []
for employee_id, team_id in employees:
result.append([employee_id, team_counts[team_id]])
return result
solution = Solution()
# Provided example
assert solution.findTeamSize([
[1, 8],
[2, 8],
[3, 8],
[4, 7],
[5, 9],
[6, 9]
]) == [
[1, 3],
[2, 3],
[3, 3],
[4, 1],
[5, 2],
[6, 2]
] # multiple teams of different sizes
# Single employee
assert solution.findTeamSize([
[1, 1]
]) == [
[1, 1]
] # smallest possible input
# All employees in same team
assert solution.findTeamSize([
[1, 5],
[2, 5],
[3, 5],
[4, 5]
]) == [
[1, 4],
[2, 4],
[3, 4],
[4, 4]
] # everyone shares same team
# Every employee in unique team
assert solution.findTeamSize([
[1, 1],
[2, 2],
[3, 3]
]) == [
[1, 1],
[2, 1],
[3, 1]
] # all team sizes equal 1
# Mixed team sizes
assert solution.findTeamSize([
[10, 100],
[20, 200],
[30, 100],
[40, 300],
[50, 300],
[60, 300]
]) == [
[10, 2],
[20, 1],
[30, 2],
[40, 3],
[50, 3],
[60, 3]
] # verifies multiple distinct groups
| Test | Why |
|---|---|
| Example input | Verifies standard multi-team scenario |
| Single employee | Validates minimum input size |
| All same team | Ensures large shared counts work correctly |
| All unique teams | Ensures singleton teams return size 1 |
| Mixed team sizes | Verifies multiple independent group counts |
Edge Cases
Team with Only One Employee
A common bug is incorrectly assuming every team contains multiple employees. If a team appears only once in the table, its count must still be computed correctly as 1.
The implementation handles this naturally because the counting phase increments frequencies exactly once per occurrence. A team appearing once simply gets count 1.
All Employees Belong to the Same Team
Another edge case occurs when every employee shares the same team_id. A naive implementation might repeatedly recompute the same count for every employee, leading to unnecessary work.
The optimized approach avoids this problem because the team count is computed only once and reused for all employees.
Every Employee Has a Unique Team
In this case, every team size equals 1. This can expose bugs where developers mistakenly aggregate incorrectly or assume duplicate teams always exist.
The frequency map correctly creates one entry per unique team and assigns a count of 1 to each employee.
Large Number of Employees
For large datasets, the brute force approach becomes inefficient because it repeatedly scans the entire table.
The optimized solution scales efficiently because it only performs linear passes over the data and uses constant-time hash map lookups.