LeetCode 582 - Kill Process

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Tree, Depth-First Search, Breadth-First Search

Solution

Problem Understanding

This problem models processes in an operating system as a tree structure. Every process has exactly one parent, except for the root process, which has no parent and is identified by ppid[i] = 0.

The input is given through two parallel arrays:

pid[i] represents the ID of a process
ppid[i] represents the parent process ID of that same process

Together, these arrays describe a rooted tree where each node is a process.

When a process is killed, all of its descendants must also be killed. That means if we kill a node in the tree, every node in its subtree must also be included in the result.

The task is to return all process IDs that will be terminated when the given kill process is killed. The output order does not matter.

For example, suppose we have:

pid  = [1,3,10,5]
ppid = [3,0,5,3]

This represents the following hierarchy:

3
├── 1
└── 5
    └── 10

If we kill process 5, then process 10 must also be killed because it is a child of 5.

The constraints are important:

n can be as large as 5 * 10^4
Process IDs are unique
Every process has exactly one parent
The input always forms a valid rooted tree
The kill process always exists

These constraints tell us several things:

We need an algorithm close to linear time
Repeated scanning of the arrays will likely be too slow
Tree traversal techniques such as DFS or BFS are appropriate
We should preprocess the input into a structure that supports fast child lookup

Several edge cases are important:

Killing the root process, which should kill every process
Killing a leaf node, which should return only that node
A tree shaped like a long chain, which can stress recursive DFS depth
A process with many direct children
The smallest possible input, where only one process exists

The problem guarantees the input is valid, so we do not need to handle cycles or disconnected components.

Approaches

Brute Force Approach

A naive solution would repeatedly scan the entire ppid array to find children of the process being killed.

For example:

Start with the kill process
Scan all processes to find its children
Add those children to the result
For each discovered child, scan the arrays again to find its children
Continue until no more descendants are found

This works because every descendant is eventually discovered through repeated parent-child matching.

However, this approach is inefficient because every time we process a node, we potentially scan the entire array again. In the worst case, this becomes quadratic.

If there are n processes and the tree is skewed, we may perform O(n) scans for O(n) nodes, leading to O(n^2) time complexity.

With up to 50,000 processes, this can become too slow.

Optimal Approach

The key observation is that the input naturally describes a tree, but the arrays are not organized for efficient traversal.

The important optimization is to first build an adjacency list:

parent -> list of children

Once we have this structure, we can efficiently traverse the subtree rooted at kill.

This transforms the problem into a standard tree traversal problem.

We can use either:

Depth-First Search (DFS)
Breadth-First Search (BFS)

Both work equally well because the output order does not matter.

The algorithm becomes:

Build a hash map from parent process to child processes
Start traversal from kill
Visit every reachable descendant
Add visited processes to the answer

Because each node and edge is processed only once, the complexity becomes linear.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(n)	Repeatedly scans entire arrays to find children
Optimal	O(n)	O(n)	Builds adjacency list and traverses subtree once

Algorithm Walkthrough

Create a hash map called children_map where each key is a parent process ID and the value is a list of its child process IDs.

This preprocessing step converts the raw arrays into an efficient tree representation. Without this structure, finding children requires repeatedly scanning the entire input. 2. Iterate through the pid and ppid arrays simultaneously.

For every process:

parent = ppid[i]
child = pid[i]

Append the child to the parent's list in children_map.

After this step, we can instantly retrieve all children of any process. 3. Initialize a traversal structure.

We can use:

A stack for DFS
A queue for BFS

In this solution, we use a stack because it is simple and efficient. 4. Push the kill process onto the stack.

This represents the starting point of the subtree traversal. 5. While the stack is not empty:

Pop one process from the stack
Add it to the result list
Retrieve all of its children from children_map
Push those children onto the stack

This guarantees we eventually visit every descendant of the killed process. 6. Return the result list after traversal finishes.

At this point, every process in the subtree rooted at kill has been collected.

Why it works

The algorithm works because the process hierarchy forms a tree. Every descendant of kill is reachable through repeated parent-to-child traversal starting from kill.

The adjacency list guarantees efficient access to children, and the DFS traversal guarantees that every node in the subtree is visited exactly once.

Since killing a process recursively kills all descendants, the subtree rooted at kill is exactly the set of processes we must return.

Python Solution

from collections import defaultdict
from typing import List

class Solution:
    def killProcess(self, pid: List[int], ppid: List[int], kill: int) -> List[int]:
        children_map = defaultdict(list)

        # Build parent -> children mapping
        for child, parent in zip(pid, ppid):
            children_map[parent].append(child)

        result = []
        stack = [kill]

        # DFS traversal
        while stack:
            current = stack.pop()
            result.append(current)

            for child in children_map[current]:
                stack.append(child)

        return result

The implementation begins by constructing an adjacency list using defaultdict(list). This allows us to efficiently store all children for each parent process.

The loop:

for child, parent in zip(pid, ppid):

pairs corresponding elements from the input arrays and builds the tree representation.

After preprocessing, the algorithm performs iterative DFS using a stack.

The stack starts with the kill process because that is the root of the subtree we need to remove.

Inside the traversal loop:

current = stack.pop()

retrieves the next process to process.

We immediately add it to the result because any visited node represents a process that will be killed.

Then we retrieve all children of the current process and push them onto the stack for future processing.

The traversal continues until no reachable descendants remain.

Finally, the result list is returned.

Using iterative DFS instead of recursive DFS avoids recursion depth issues for very deep trees.

Go Solution

func killProcess(pid []int, ppid []int, kill int) []int {
	childrenMap := make(map[int][]int)

	// Build parent -> children mapping
	for i := 0; i < len(pid); i++ {
		parent := ppid[i]
		child := pid[i]

		childrenMap[parent] = append(childrenMap[parent], child)
	}

	result := []int{}
	stack := []int{kill}

	// DFS traversal
	for len(stack) > 0 {
		// Pop from stack
		n := len(stack) - 1
		current := stack[n]
		stack = stack[:n]

		result = append(result, current)

		for _, child := range childrenMap[current] {
			stack = append(stack, child)
		}
	}

	return result
}

The Go implementation follows the same algorithmic structure as the Python version.

The adjacency list is implemented using:

map[int][]int

which maps each parent process ID to a slice of child process IDs.

The DFS stack is implemented using a slice. Popping from the end of the slice gives efficient stack behavior.

One difference from Python is that Go does not have a built-in defaultdict, so appending to a nonexistent map key relies on Go's zero-value behavior for slices.

Another difference is explicit slice manipulation during stack popping:

stack = stack[:n]

This removes the last element after retrieving it.

There are no integer overflow concerns because process IDs are bounded by 5 * 10^4.

Worked Examples

Example 1

pid = [1,3,10,5]
ppid = [3,0,5,3]
kill = 5

Step 1: Build the adjacency list

Parent	Children
0	[3]
3	[1, 5]
5	[10]

Step 2: Initialize traversal

Variable	Value
stack	[5]
result	[]

Step 3: DFS traversal

Action	Stack	Result
Pop 5	[]	[5]
Push child 10	[10]	[5]
Pop 10	[]	[5, 10]

Traversal ends because the stack is empty.

Final answer:

[5, 10]

Example 2

pid = [1]
ppid = [0]
kill = 1

Step 1: Build adjacency list

Parent	Children
0	[1]

Step 2: Initialize traversal

Variable	Value
stack	[1]
result	[]

Step 3: DFS traversal

Action	Stack	Result
Pop 1	[]	[1]

Traversal ends.

Final answer:

[1]

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each process is added to the adjacency list once and visited once
Space	O(n)	The adjacency list, stack, and result list may all store up to n processes

The preprocessing step iterates through the input arrays once, which costs O(n).

The DFS traversal also visits each process at most once, contributing another O(n).

Because these operations are sequential rather than nested, the total complexity remains linear.

The adjacency list stores all parent-child relationships, requiring O(n) space. The traversal stack and result list can also grow to size n in the worst case.

Test Cases

from typing import List
from collections import defaultdict

class Solution:
    def killProcess(self, pid: List[int], ppid: List[int], kill: int) -> List[int]:
        children_map = defaultdict(list)

        for child, parent in zip(pid, ppid):
            children_map[parent].append(child)

        result = []
        stack = [kill]

        while stack:
            current = stack.pop()
            result.append(current)

            for child in children_map[current]:
                stack.append(child)

        return result

sol = Solution()

# Example 1
assert sorted(sol.killProcess([1,3,10,5], [3,0,5,3], 5)) == [5,10]

# Example 2
assert sorted(sol.killProcess([1], [0], 1)) == [1]

# Kill root process, entire tree removed
assert sorted(sol.killProcess([1,2,3,4], [0,1,1,2], 1)) == [1,2,3,4]

# Kill leaf node, only itself removed
assert sorted(sol.killProcess([1,2,3,4], [0,1,1,2], 4)) == [4]

# Long chain structure
assert sorted(sol.killProcess([1,2,3,4,5], [0,1,2,3,4], 3)) == [3,4,5]

# Wide tree with many children
assert sorted(sol.killProcess([1,2,3,4,5], [0,1,1,1,1], 1)) == [1,2,3,4,5]

# Internal subtree removal
assert sorted(sol.killProcess([1,2,3,4,5,6], [0,1,1,2,2,3], 2)) == [2,4,5]

# Single child subtree
assert sorted(sol.killProcess([1,2], [0,1], 2)) == [2]

# Unordered pid input
assert sorted(sol.killProcess([5,1,3,10], [3,3,0,5], 5)) == [5,10]

Test	Why
Example 1	Validates standard subtree traversal
Example 2	Validates smallest possible input
Kill root process	Ensures entire tree is returned
Kill leaf node	Ensures no extra nodes are included
Long chain structure	Tests deep traversal behavior
Wide tree	Tests handling many direct children
Internal subtree removal	Verifies selective subtree killing
Single child subtree	Tests minimal non-root subtree
Unordered pid input	Ensures algorithm does not rely on ordering

Edge Cases

One important edge case occurs when the killed process is the root of the tree. In this situation, every process in the system should be terminated because all nodes are descendants of the root. A buggy implementation might accidentally omit deeper descendants or stop traversal early. The DFS traversal correctly visits the entire tree because it continues until the stack becomes empty.

Another important edge case is killing a leaf node. A leaf has no children, so the answer should contain only the node itself. Some implementations incorrectly assume every node has children and may attempt invalid lookups. Using defaultdict(list) in Python and Go's zero-value slices ensures missing child lists are handled safely as empty collections.

A deep chain-shaped tree is another critical case. Consider a structure like:

1 -> 2 -> 3 -> 4 -> 5

If recursive DFS were used, extremely deep trees could risk stack overflow in some languages. The iterative DFS implementation avoids recursion entirely by using an explicit stack data structure, making it safe for the maximum constraint size.

Another subtle edge case is unordered input. The pid and ppid arrays are not guaranteed to appear in hierarchical order. A naive implementation that assumes parents appear before children could fail. Building the adjacency list independently of ordering guarantees correctness regardless of input arrangement.