LeetCode 1731 - The Number of Employees Which Report to Each Employee

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The Employees table stores information about employees inside a company. Each row represents one employee and contains four columns:

Column	Meaning
`employee_id`	Unique identifier for the employee
`name`	Employee name
`reports_to`	The manager this employee reports to
`age`	Employee age

If reports_to is NULL, that employee does not report to anyone.

The problem defines a manager as any employee who has at least one direct report. A direct report means an employee whose reports_to value equals that manager's employee_id.

We need to produce a result table containing:

Column	Meaning
`employee_id`	Manager ID
`name`	Manager name
`reports_count`	Number of employees directly reporting to the manager
`average_age`	Average age of direct reports, rounded to nearest integer

The final result must be sorted by employee_id.

The important detail is that we only consider direct reports, not indirect reports. For example, if employee A manages B, and B manages C, then C does not count toward A's report count.

The input size is relatively small for a database problem, and the main challenge is understanding how to aggregate employee information correctly using SQL grouping operations.

Several edge cases are important:

Employees with NULL in reports_to are not automatically managers. They only become managers if another employee references them in reports_to.
Some employees may have exactly one direct report. The aggregation logic must still work correctly.
Some employees may never appear as managers because nobody reports to them. They should not appear in the output.
Average age must be rounded to the nearest integer, not truncated.
The relationship is only one level deep, so recursive logic is unnecessary.

Approaches

Brute Force Approach

A brute force strategy would examine every employee and then scan the entire table again to find employees reporting to them.

For each employee:

Iterate through all rows.
Count how many employees have reports_to = current_employee_id.
Sum their ages.
Compute the rounded average.
Include the employee in the result if at least one report exists.

This approach works because it explicitly checks every possible manager-report relationship. However, it is inefficient because for every employee we scan the entire table again.

If there are n employees, this leads to O(n^2) time complexity.

Optimal Approach

The key observation is that employees reporting to the same manager can be grouped together directly using SQL aggregation.

Instead of repeatedly scanning the table, we can:

Group employees by reports_to.
Count employees in each group.
Compute the average age in each group.
Join the grouped results back with the Employees table to retrieve manager names.

This is efficient because SQL databases are optimized for grouping and aggregation operations.

The grouping operation processes the table once, making the solution much more scalable.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(1)	Repeatedly scans the table for every employee
Optimal	O(n)	O(n)	Uses SQL aggregation and grouping

Algorithm Walkthrough

Start with the Employees table.

Each employee row may reference a manager through the reports_to column. 2. Group employees by reports_to.

This creates one group per manager ID. Every group contains all employees directly reporting to that manager. 3. For each group:

Count the number of employees.
Compute the average age.
Round the average age to the nearest integer.

These values become reports_count and average_age. 4. Join the grouped result with the Employees table.

The grouped data only contains manager IDs. We join using:

Employees.employee_id = grouped.reports_to

This allows us to retrieve the manager's name. 5. Select the required columns.

We output:

employee_id
name
reports_count
average_age

Sort the result by employee_id.

The problem explicitly requires ascending order.

Why it works

The grouping operation guarantees that all employees sharing the same reports_to value are collected together. Since reports_to identifies the direct manager, every aggregation computed on that group correctly represents that manager's direct reports. Joining back with the employee table retrieves the corresponding manager information without changing the aggregated results.

Python Solution

Even though this is a database problem, LeetCode still expects the SQL query as the solution.

# Write your MySQL query statement below

SELECT
    m.employee_id,
    m.name,
    COUNT(e.employee_id) AS reports_count,
    ROUND(AVG(e.age)) AS average_age
FROM Employees e
JOIN Employees m
    ON e.reports_to = m.employee_id
GROUP BY m.employee_id, m.name
ORDER BY m.employee_id;

The query uses a self join because both employees and managers are stored in the same table.

The alias e represents regular employees. The alias m represents managers.

The join condition:

e.reports_to = m.employee_id

matches each employee with their manager.

After the join, employees are grouped by manager identity. The aggregation functions then compute:

COUNT(e.employee_id) for direct report count
AVG(e.age) for average report age

The ROUND function ensures the average age matches the problem requirement.

Finally, the output is ordered by manager ID.

Go Solution

Since this is a database problem, the Go submission is also simply the SQL query.

// Write your MySQL query statement below

SELECT
    m.employee_id,
    m.name,
    COUNT(e.employee_id) AS reports_count,
    ROUND(AVG(e.age)) AS average_age
FROM Employees e
JOIN Employees m
    ON e.reports_to = m.employee_id
GROUP BY m.employee_id, m.name
ORDER BY m.employee_id;

There are no Go-specific implementation concerns because the solution is entirely SQL based. The same query works regardless of the surrounding language wrapper used by LeetCode.

Worked Examples

Example 1

Input table:

employee_id	name	reports_to	age
9	Hercy	NULL	43
6	Alice	9	41
4	Bob	9	36
2	Winston	NULL	37

Step 1: Perform Self Join

We match employees to managers using:

e.reports_to = m.employee_id

Result:

Employee	Employee Age	Manager ID	Manager Name
Alice	41	9	Hercy
Bob	36	9	Hercy

Step 2: Group by Manager

Manager	Report Ages
Hercy	[41, 36]

Step 3: Aggregate

Manager	reports_count	average_age
Hercy	2	ROUND((41 + 36) / 2) = 39

Final Output

employee_id	name	reports_count	average_age
9	Hercy	2	39

Example 2

Input table:

employee_id	name	reports_to	age
1	Michael	NULL	45
2	Alice	1	38
3	Bob	1	42
4	Charlie	2	34
5	David	2	40
6	Eve	3	37
7	Frank	NULL	50
8	Grace	NULL	48

Step 1: Self Join

Employee	Age	Manager
Alice	38	Michael
Bob	42	Michael
Charlie	34	Alice
David	40	Alice
Eve	37	Bob

Step 2: Group by Manager

Manager	Report Ages
Michael	[38, 42]
Alice	[34, 40]
Bob	[37]

Step 3: Compute Aggregates

Manager	reports_count	average_age
Michael	2	ROUND(40.0) = 40
Alice	2	ROUND(37.0) = 37
Bob	1	ROUND(37.0) = 37

Final Output

employee_id	name	reports_count	average_age
1	Michael	2	40
2	Alice	2	37
3	Bob	1	37

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each employee row is processed during grouping and aggregation
Space	O(n)	Grouping may store aggregation information for managers

The database engine scans the table and builds aggregation groups based on reports_to. Since each row contributes to exactly one group, the total work scales linearly with the number of employees.

Test Cases

# Example 1
employees = [
    [9, "Hercy", None, 43],
    [6, "Alice", 9, 41],
    [4, "Bob", 9, 36],
    [2, "Winston", None, 37]
]
expected = [
    [9, "Hercy", 2, 39]
]
assert expected == [[9, "Hercy", 2, 39]]  # Basic manager aggregation

# Example 2
employees = [
    [1, "Michael", None, 45],
    [2, "Alice", 1, 38],
    [3, "Bob", 1, 42],
    [4, "Charlie", 2, 34],
    [5, "David", 2, 40],
    [6, "Eve", 3, 37],
    [7, "Frank", None, 50],
    [8, "Grace", None, 48]
]
expected = [
    [1, "Michael", 2, 40],
    [2, "Alice", 2, 37],
    [3, "Bob", 1, 37]
]
assert len(expected) == 3  # Multiple managers

# Single employee, no manager
employees = [
    [1, "Alice", None, 30]
]
expected = []
assert expected == []  # No managers exist

# One manager with one report
employees = [
    [1, "Manager", None, 50],
    [2, "Worker", 1, 25]
]
expected = [
    [1, "Manager", 1, 25]
]
assert expected[0][2] == 1  # Single direct report

# Average rounding upward
employees = [
    [1, "Boss", None, 55],
    [2, "A", 1, 41],
    [3, "B", 1, 36]
]
expected = [
    [1, "Boss", 2, 39]
]
assert expected[0][3] == 39  # 38.5 rounds to 39

# Employee reporting chain
employees = [
    [1, "CEO", None, 60],
    [2, "Manager", 1, 40],
    [3, "Engineer", 2, 30]
]
expected = [
    [1, "CEO", 1, 40],
    [2, "Manager", 1, 30]
]
assert len(expected) == 2  # Only direct reports count

Test	Why
Example 1	Verifies basic grouping and rounding
Example 2	Verifies multiple managers and aggregation
Single employee	Ensures empty output when no managers exist
One direct report	Verifies count and average for single employee
Rounding case	Confirms proper nearest integer rounding
Reporting chain	Ensures only direct reports are counted

Edge Cases

One important edge case occurs when no employees report to anyone. In this situation, every reports_to value is NULL, so no groups are formed during aggregation. A buggy implementation might accidentally include all employees as managers. The join condition prevents this because employees without reports never participate in the joined result.

Another important edge case is when a manager has exactly one direct report. Aggregation logic sometimes assumes multiple rows exist, but SQL aggregate functions work correctly even with a single row. The COUNT becomes 1, and the average age is simply that employee's age.

A subtle edge case involves hierarchical management structures. For example, a CEO may indirectly manage many employees through intermediate managers. The problem only wants direct reports. Using reports_to = manager_id ensures only immediate relationships are counted, avoiding accidental recursive aggregation.

A final edge case is average rounding behavior. SQL databases may return decimal averages such as 38.5. The problem requires rounding to the nearest integer, not truncation. Using ROUND(AVG(age)) guarantees the correct output format.