LeetCode 1303 - Find the Team Size

This problem provides a database table named Employee, where every row represents a single employee and the team they belong to.

LeetCode Problem 1303

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

This problem provides a database table named Employee, where every row represents a single employee and the team they belong to. The table contains two columns:

Column Meaning
employee_id Unique identifier for an employee
team_id Identifier of the team the employee belongs to

The task is to compute the size of each employee's team. In other words, for every employee, we need to determine how many employees share the same team_id.

The output should contain:

Column Meaning
employee_id The employee's ID
team_size Number of employees in that employee's team

The result may be returned in any order, which means we do not need to sort the output unless explicitly required.

The important observation is that the same team_id can appear multiple times, once for each employee in that team. Therefore, the problem essentially asks us to count the frequency of every team_id and then associate that count with each employee.

Because employee_id is guaranteed to be unique, we never need to worry about duplicate employee rows. The problem also guarantees that every employee belongs to exactly one team.

An important edge case is a team with only one employee. In that situation, the team size should be 1. Another important case is when all employees belong to the same team, which means every employee receives the same large count. A naive implementation that repeatedly scans the table for each employee could become inefficient for large datasets.

Since this is a database problem, the intended solution is typically written in SQL using aggregation and joins.

Approaches

Brute Force Approach

The brute force solution checks every employee individually and counts how many employees share the same team_id.

For each row in the table:

  1. Read the employee's team_id
  2. Scan the entire table again
  3. Count how many rows have the same team_id
  4. Output that count as the employee's team size

This approach is correct because it directly computes the exact number of employees belonging to the same team for every employee.

However, it is inefficient because for every employee we perform another full scan of the table. If there are n employees, then each of the n rows requires an O(n) scan, resulting in O(n²) time complexity.

Optimal Approach

The key insight is that many employees share the same team_id, so repeatedly counting the same team is wasteful.

Instead, we can compute the size of each team once using GROUP BY. After obtaining a mapping from team_id to team size, we join that result back to the original table so every employee receives the corresponding count.

This works efficiently because:

  1. GROUP BY team_id computes team sizes once
  2. The join associates each employee with their team's count
  3. No repeated scanning is necessary

This reduces the work to essentially linear processing of the table.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Repeatedly scans the table for every employee
Optimal O(n) O(n) Uses aggregation with GROUP BY and a join

Algorithm Walkthrough

Optimal SQL Algorithm

  1. First, group all employees by team_id.

This step computes how many employees belong to each team. We use SQL's COUNT(*) aggregation function because we want the number of rows per team. 2. Store the grouped result as a temporary derived table.

The derived table contains:

team_id team_size
8 3
7 1
9 2

This prevents recalculating the same counts multiple times. 3. Join the derived table back to the original Employee table.

The join condition is:

Employee.team_id = grouped.team_id

This allows every employee row to retrieve the corresponding team size. 4. Select the required columns.

The final output only needs:

  • employee_id
  • team_size

Why it works

The algorithm works because every employee belongs to exactly one team_id, and the grouped table contains exactly one row per team with the correct employee count. Joining on team_id guarantees that every employee receives the correct team size associated with their team.

Python Solution

Although this is a database problem and LeetCode expects SQL, the following Python implementation demonstrates the same logic programmatically.

from typing import List, Dict
from collections import Counter

class Solution:
    def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
        team_counts: Dict[int, int] = Counter()

        # Count employees in each team
        for employee_id, team_id in employees:
            team_counts[team_id] += 1

        # Build result
        result = []

        for employee_id, team_id in employees:
            result.append([employee_id, team_counts[team_id]])

        return result

The implementation follows the same logic as the optimal SQL approach.

The first loop counts how many employees belong to each team_id. A Counter is used because it provides a convenient frequency map structure where keys are team IDs and values are team sizes.

After building the frequency map, the second loop iterates through the employees again and appends the corresponding team size for each employee.

The final result contains pairs of:

[employee_id, team_size]

This mirrors the expected SQL output structure.

Go Solution

package main

func findTeamSize(employees [][]int) [][]int {
	teamCounts := make(map[int]int)

	// Count employees in each team
	for _, employee := range employees {
		teamID := employee[1]
		teamCounts[teamID]++
	}

	// Build result
	result := [][]int{}

	for _, employee := range employees {
		employeeID := employee[0]
		teamID := employee[1]

		result = append(result, []int{
			employeeID,
			teamCounts[teamID],
		})
	}

	return result
}

The Go implementation closely mirrors the Python version. A map[int]int is used to store team frequencies. Go slices are used to construct the result dynamically.

Unlike Python's Counter, Go requires explicit map initialization and manual incrementing. Other than syntax differences, the algorithm remains identical.

Because all values are small integer counts, integer overflow is not a concern in normal problem constraints.

SQL Solution

SELECT
    e.employee_id,
    t.team_size
FROM Employee e
JOIN (
    SELECT
        team_id,
        COUNT(*) AS team_size
    FROM Employee
    GROUP BY team_id
) t
ON e.team_id = t.team_id;

This solution first computes the size of every team using GROUP BY. The derived table is then joined back to the original Employee table to associate each employee with their corresponding team size.

Worked Examples

Example 1

Input table:

employee_id team_id
1 8
2 8
3 8
4 7
5 9
6 9

Step 1: Group by team_id

We count employees in each team.

team_id count
8 3
7 1
9 2

This creates the mapping:

{
    8: 3,
    7: 1,
    9: 2
}

Step 2: Assign counts back to employees

employee_id team_id team_size
1 8 3
2 8 3
3 8 3
4 7 1
5 9 2
6 9 2

Final output:

employee_id team_size
1 3
2 3
3 3
4 1
5 2
6 2

Complexity Analysis

Measure Complexity Explanation
Time O(n) One pass to count teams, one pass to build results
Space O(n) Hash map stores team frequencies

The algorithm processes each employee a constant number of times. The frequency map contains at most one entry per unique team, which in the worst case could equal the number of employees.

The SQL version is similarly efficient because aggregation and joins are optimized database operations.

Test Cases

from collections import Counter
from typing import List, Dict

class Solution:
    def findTeamSize(self, employees: List[List[int]]) -> List[List[int]]:
        team_counts: Dict[int, int] = Counter()

        for employee_id, team_id in employees:
            team_counts[team_id] += 1

        result = []

        for employee_id, team_id in employees:
            result.append([employee_id, team_counts[team_id]])

        return result

solution = Solution()

# Provided example
assert solution.findTeamSize([
    [1, 8],
    [2, 8],
    [3, 8],
    [4, 7],
    [5, 9],
    [6, 9]
]) == [
    [1, 3],
    [2, 3],
    [3, 3],
    [4, 1],
    [5, 2],
    [6, 2]
]  # multiple teams of different sizes

# Single employee
assert solution.findTeamSize([
    [1, 1]
]) == [
    [1, 1]
]  # smallest possible input

# All employees in same team
assert solution.findTeamSize([
    [1, 5],
    [2, 5],
    [3, 5],
    [4, 5]
]) == [
    [1, 4],
    [2, 4],
    [3, 4],
    [4, 4]
]  # everyone shares same team

# Every employee in unique team
assert solution.findTeamSize([
    [1, 1],
    [2, 2],
    [3, 3]
]) == [
    [1, 1],
    [2, 1],
    [3, 1]
]  # all team sizes equal 1

# Mixed team sizes
assert solution.findTeamSize([
    [10, 100],
    [20, 200],
    [30, 100],
    [40, 300],
    [50, 300],
    [60, 300]
]) == [
    [10, 2],
    [20, 1],
    [30, 2],
    [40, 3],
    [50, 3],
    [60, 3]
]  # verifies multiple distinct groups
Test Why
Example input Verifies standard multi-team scenario
Single employee Validates minimum input size
All same team Ensures large shared counts work correctly
All unique teams Ensures singleton teams return size 1
Mixed team sizes Verifies multiple independent group counts

Edge Cases

Team with Only One Employee

A common bug is incorrectly assuming every team contains multiple employees. If a team appears only once in the table, its count must still be computed correctly as 1.

The implementation handles this naturally because the counting phase increments frequencies exactly once per occurrence. A team appearing once simply gets count 1.

All Employees Belong to the Same Team

Another edge case occurs when every employee shares the same team_id. A naive implementation might repeatedly recompute the same count for every employee, leading to unnecessary work.

The optimized approach avoids this problem because the team count is computed only once and reused for all employees.

Every Employee Has a Unique Team

In this case, every team size equals 1. This can expose bugs where developers mistakenly aggregate incorrectly or assume duplicate teams always exist.

The frequency map correctly creates one entry per unique team and assigns a count of 1 to each employee.

Large Number of Employees

For large datasets, the brute force approach becomes inefficient because it repeatedly scans the entire table.

The optimized solution scales efficiently because it only performs linear passes over the data and uses constant-time hash map lookups.