LeetCode 1303 - Find the Team Size

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem provides a database table named Employee, where every row represents a single employee and the team they belong to. The table contains two columns:

Column	Meaning
`employee_id`	Unique identifier for an employee
`team_id`	Identifier of the team the employee belongs to

The task is to compute the size of each employee's team. In other words, for every employee, we need to determine how many employees share the same team_id.

The output should contain:

Column	Meaning
`employee_id`	The employee's ID
`team_size`	Number of employees in that employee's team

The result may be returned in any order, which means we do not need to sort the output unless explicitly required.

The important observation is that the same team_id can appear multiple times, once for each employee in that team. Therefore, the problem essentially asks us to count the frequency of every team_id and then associate that count with each employee.

Because employee_id is guaranteed to be unique, we never need to worry about duplicate employee rows. The problem also guarantees that every employee belongs to exactly one team.

An important edge case is a team with only one employee. In that situation, the team size should be 1. Another important case is when all employees belong to the same team, which means every employee receives the same large count. A naive implementation that repeatedly scans the table for each employee could become inefficient for large datasets.

Since this is a database problem, the intended solution is typically written in SQL using aggregation and joins.

Approaches

Brute Force Approach

The brute force solution checks every employee individually and counts how many employees share the same team_id.

For each row in the table:

Read the employee's team_id
Scan the entire table again
Count how many rows have the same team_id
Output that count as the employee's team size

This approach is correct because it directly computes the exact number of employees belonging to the same team for every employee.

However, it is inefficient because for every employee we perform another full scan of the table. If there are n employees, then each of the n rows requires an O(n) scan, resulting in O(n²) time complexity.

Optimal Approach

The key insight is that many employees share the same team_id, so repeatedly counting the same team is wasteful.

Instead, we can compute the size of each team once using GROUP BY. After obtaining a mapping from team_id to team size, we join that result back to the original table so every employee receives the corresponding count.

This works efficiently because:

GROUP BY team_id computes team sizes once
The join associates each employee with their team's count
No repeated scanning is necessary

This reduces the work to essentially linear processing of the table.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(1)	Repeatedly scans the table for every employee
Optimal	O(n)	O(n)	Uses aggregation with `GROUP BY` and a join

Algorithm Walkthrough

Optimal SQL Algorithm

First, group all employees by team_id.

This step computes how many employees belong to each team. We use SQL's COUNT(*) aggregation function because we want the number of rows per team. 2. Store the grouped result as a temporary derived table.

The derived table contains:

team_id	team_size
8	3
7	1
9	2

This prevents recalculating the same counts multiple times. 3. Join the derived table back to the original Employee table.

The join condition is:

Employee.team_id = grouped.team_id

This allows every employee row to retrieve the corresponding team size. 4. Select the required columns.

The final output only needs:

employee_id
team_size

Why it works

The algorithm works because every employee belongs to exactly one team_id, and the grouped table contains exactly one row per team with the correct employee count. Joining on team_id guarantees that every employee receives the correct team size associated with their team.

Python Solution

Although this is a database problem and LeetCode expects SQL, the following Python implementation demonstrates the same logic programmatically.

from typing import List, Dict
from collections import Counter

class Solution:
    def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
        team_counts: Dict[int, int] = Counter()

        # Count employees in each team
        for employee_id, team_id in employees:
            team_counts[team_id] += 1

        # Build result
        result = []

        for employee_id, team_id in employees:
            result.append([employee_id, team_counts[team_id]])

        return result

The implementation follows the same logic as the optimal SQL approach.

The first loop counts how many employees belong to each team_id. A Counter is used because it provides a convenient frequency map structure where keys are team IDs and values are team sizes.

After building the frequency map, the second loop iterates through the employees again and appends the corresponding team size for each employee.

The final result contains pairs of:

[employee_id, team_size]

This mirrors the expected SQL output structure.

Go Solution

package main

func findTeamSize(employees [][]int) [][]int {
	teamCounts := make(map[int]int)

	// Count employees in each team
	for _, employee := range employees {
		teamID := employee[1]
		teamCounts[teamID]++
	}

	// Build result
	result := [][]int{}

	for _, employee := range employees {
		employeeID := employee[0]
		teamID := employee[1]

		result = append(result, []int{
			employeeID,
			teamCounts[teamID],
		})
	}

	return result
}

The Go implementation closely mirrors the Python version. A map[int]int is used to store team frequencies. Go slices are used to construct the result dynamically.

Unlike Python's Counter, Go requires explicit map initialization and manual incrementing. Other than syntax differences, the algorithm remains identical.

Because all values are small integer counts, integer overflow is not a concern in normal problem constraints.

SQL Solution

SELECT
    e.employee_id,
    t.team_size
FROM Employee e
JOIN (
    SELECT
        team_id,
        COUNT(*) AS team_size
    FROM Employee
    GROUP BY team_id
) t
ON e.team_id = t.team_id;

This solution first computes the size of every team using GROUP BY. The derived table is then joined back to the original Employee table to associate each employee with their corresponding team size.

Worked Examples

Example 1

Input table:

employee_id	team_id
1	8
2	8
3	8
4	7
5	9
6	9

Step 1: Group by `team_id`

We count employees in each team.

team_id	count
8	3
7	1
9	2

This creates the mapping:

{
    8: 3,
    7: 1,
    9: 2
}

Step 2: Assign counts back to employees

employee_id	team_id	team_size
1	8	3
2	8	3
3	8	3
4	7	1
5	9	2
6	9	2

Final output:

employee_id	team_size
1	3
2	3
3	3
4	1
5	2
6	2

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	One pass to count teams, one pass to build results
Space	O(n)	Hash map stores team frequencies

The algorithm processes each employee a constant number of times. The frequency map contains at most one entry per unique team, which in the worst case could equal the number of employees.

The SQL version is similarly efficient because aggregation and joins are optimized database operations.

Test Cases

from collections import Counter
from typing import List, Dict

class Solution:
    def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
        team_counts: Dict[int, int] = Counter()

        for employee_id, team_id in employees:
            team_counts[team_id] += 1

        result = []

        for employee_id, team_id in employees:
            result.append([employee_id, team_counts[team_id]])

        return result

solution = Solution()

# Provided example
assert solution.findTeamSize([
    [1, 8],
    [2, 8],
    [3, 8],
    [4, 7],
    [5, 9],
    [6, 9]
]) == [
    [1, 3],
    [2, 3],
    [3, 3],
    [4, 1],
    [5, 2],
    [6, 2]
]  # multiple teams of different sizes

# Single employee
assert solution.findTeamSize([
    [1, 1]
]) == [
    [1, 1]
]  # smallest possible input

# All employees in same team
assert solution.findTeamSize([
    [1, 5],
    [2, 5],
    [3, 5],
    [4, 5]
]) == [
    [1, 4],
    [2, 4],
    [3, 4],
    [4, 4]
]  # everyone shares same team

# Every employee in unique team
assert solution.findTeamSize([
    [1, 1],
    [2, 2],
    [3, 3]
]) == [
    [1, 1],
    [2, 1],
    [3, 1]
]  # all team sizes equal 1

# Mixed team sizes
assert solution.findTeamSize([
    [10, 100],
    [20, 200],
    [30, 100],
    [40, 300],
    [50, 300],
    [60, 300]
]) == [
    [10, 2],
    [20, 1],
    [30, 2],
    [40, 3],
    [50, 3],
    [60, 3]
]  # verifies multiple distinct groups

Test	Why
Example input	Verifies standard multi-team scenario
Single employee	Validates minimum input size
All same team	Ensures large shared counts work correctly
All unique teams	Ensures singleton teams return size 1
Mixed team sizes	Verifies multiple independent group counts

Edge Cases

Team with Only One Employee

A common bug is incorrectly assuming every team contains multiple employees. If a team appears only once in the table, its count must still be computed correctly as 1.

The implementation handles this naturally because the counting phase increments frequencies exactly once per occurrence. A team appearing once simply gets count 1.

All Employees Belong to the Same Team

Another edge case occurs when every employee shares the same team_id. A naive implementation might repeatedly recompute the same count for every employee, leading to unnecessary work.

The optimized approach avoids this problem because the team count is computed only once and reused for all employees.

Every Employee Has a Unique Team

In this case, every team size equals 1. This can expose bugs where developers mistakenly aggregate incorrectly or assume duplicate teams always exist.

The frequency map correctly creates one entry per unique team and assigns a count of 1 to each employee.

Large Number of Employees

For large datasets, the brute force approach becomes inefficient because it repeatedly scans the entire table.

The optimized solution scales efficiently because it only performs linear passes over the data and uses constant-time hash map lookups.