LeetCode 2003 - Smallest Missing Genetic Value in Each Subtree
This problem asks us to compute, for every node in a rooted tree, the smallest positive integer that does not appear in the subtree rooted at that node. We are given two arrays: parents describes the tree structure, and nums contains distinct genetic values assigned to each node.
Difficulty: 🔴 Hard
Topics: Array, Dynamic Programming, Tree, Depth-First Search, Union-Find
Solution
Problem Understanding
This problem asks us to compute, for every node in a rooted tree, the smallest positive integer that does not appear in the subtree rooted at that node. We are given two arrays: parents describes the tree structure, and nums contains distinct genetic values assigned to each node. Node 0 is the root, and all nodes are numbered from 0 to n - 1.
The input parents tells us, for each node, who its parent is. The input nums assigns each node a unique integer in the range [1, 10^5]. The output is an array of length n where ans[i] is the smallest missing number in the subtree rooted at node i.
The key constraints to note are the size of the tree (n can be up to 10^5) and the value range of nums (1 to 10^5). These constraints make brute-force solutions that scan every subtree too slow. Additionally, the uniqueness of nums[i] simplifies checking for missing values because we do not need to handle duplicates.
Important edge cases include trees where 1 is not present at all, trees where the maximum number is at the root, and linear chains where the missing value propagates through ancestors.
Approaches
A naive approach would be to, for each node, traverse its subtree and collect all genetic values into a set. Then, for each node, we would scan from 1 upwards to find the first missing number. While this approach is correct, it requires traversing all descendants for every node, leading to a worst-case complexity of O(n^2) for large trees, which is infeasible given n <= 10^5.
The optimal approach leverages the observation that the missing values propagate from the leaf where nums[i] = 1 exists. Only the path from the node containing 1 to the root will have missing values greater than 1. Therefore, we can first detect if 1 exists; if it does not, every node has 1 as the smallest missing value. Otherwise, we can incrementally mark the numbers present in the subtree as we move from the node containing 1 up to the root, keeping track of the smallest missing number dynamically. Using a set or boolean array for presence allows this propagation efficiently, reducing redundant subtree traversals.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^2) | O(n) | Collect all values in subtree for every node, scan for smallest missing |
| Optimal | O(n + maxVal) | O(n + maxVal) | Track presence along the path from node containing 1 up to root, dynamically update missing values |
Algorithm Walkthrough
-
Build an adjacency list
treefrom theparentsarray to represent the tree structure. This allows quick access to children for DFS traversal. -
Initialize the answer array
answith all values as1because1is the minimum possible missing value. -
If
1does not exist innums, returnansimmediately since1is missing for all subtrees. -
If
1exists, find its indexstart. This is the node where the propagation of missing values begins. -
Initialize a boolean array
seento mark which genetic values have been observed. Initially, all values are unmarked. -
Start from node
startand traverse up the tree to the root: -
For the current node, mark its
nums[node]as seen. -
Recursively mark all numbers in its children subtrees (excluding already processed ancestors) as seen.
-
Incrementally update the smallest missing number
mexuntil you find a number that has not been seen. -
Set
ans[node] = mex. -
Continue this process until reaching the root.
-
Return
ansarray.
Why it works: By only processing the path from the node containing 1 to the root and marking seen numbers in all subtrees of that path, we guarantee that all relevant numbers are considered exactly once. Nodes outside this path always have 1 as the missing value, which satisfies the correctness.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def smallestMissingValueSubtree(self, parents: List[int], nums: List[int]) -> List[int]:
n = len(parents)
tree = defaultdict(list)
for i, p in enumerate(parents):
if p != -1:
tree[p].append(i)
ans = [1] * n
num_to_node = {num: i for i, num in enumerate(nums)}
if 1 not in num_to_node:
return ans
seen = [False] * (10**5 + 2)
start = num_to_node[1]
mex = 1
def dfs(node: int):
if seen[nums[node]]:
return
seen[nums[node]] = True
for child in tree[node]:
dfs(child)
node = start
while node != -1:
dfs(node)
while seen[mex]:
mex += 1
ans[node] = mex
node = parents[node]
return ans
The code first constructs the tree, maps values to nodes, and checks if 1 exists. If 1 is absent, the answer is trivial. Otherwise, we use a DFS to mark seen values in subtrees along the path from the node containing 1 to the root. The mex variable tracks the smallest missing value. For each ancestor, after processing the subtree, we update mex and record it in ans.
Go Solution
package main
func smallestMissingValueSubtree(parents []int, nums []int) []int {
n := len(parents)
tree := make([][]int, n)
for i := 0; i < n; i++ {
if parents[i] != -1 {
tree[parents[i]] = append(tree[parents[i]], i)
}
}
ans := make([]int, n)
for i := range ans {
ans[i] = 1
}
numToNode := make(map[int]int)
for i, num := range nums {
numToNode[num] = i
}
if _, ok := numToNode[1]; !ok {
return ans
}
seen := make([]bool, 100002)
mex := 1
start := numToNode[1]
var dfs func(int)
dfs = func(node int) {
if seen[nums[node]] {
return
}
seen[nums[node]] = true
for _, child := range tree[node] {
dfs(child)
}
}
node := start
for node != -1 {
dfs(node)
for seen[mex] {
mex++
}
ans[node] = mex
node = parents[node]
}
return ans
}
The Go solution mirrors the Python approach. Slice initialization, map handling, and recursion are adapted to Go syntax. Slices are used for the adjacency list and seen array. The same logic for DFS and mex updating applies.
Worked Examples
Example 1: parents = [-1,0,0,2], nums = [1,2,3,4]
| Step | Node | DFS Marks Seen | Mex | ans |
|---|---|---|---|---|
| Start at 1 | 0 | [1,2,3,4] | 5 | ans[0]=5 |
| Node 1 | 1 | [1,2,3,4] | 1 | ans[1]=1 |
| Node 2 | 2 | [1,2,3,4] | 1 | ans[2]=1 |
| Node 3 | 3 | [1,2,3,4] | 1 | ans[3]=1 |
Example 2: parents = [-1,0,1,0,3,3], nums = [5,4,6,2,1,3]
Similar DFS propagation updates mex incrementally along the path from node containing 1 to root, producing [7,1,1,4,2,1].
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + maxVal) | Each node is visited once along the path from node 1 to root, and each number up to 10^5 is processed once in seen. |
| Space | O(n + maxVal) | Tree adjacency list, answer array, and seen array. |
The complexity is dominated by the seen array of size 10^5 and traversal along the path from node 1 to root.
Test Cases
# Basic examples
assert Solution().smallestMissingValueSubtree([-1,0,0,2], [1,2,3,4]) == [5,1,1,1]
assert Solution().smallestMissingValueSubtree([-1,0,1,0,3,3], [5,4,6,2,1,3]) == [7,1,1,4,2,1]
assert Solution().smallestMissingValueSubtree([-1,2,3,