LeetCode 3331 - Find Subtree Sizes After Changes
We are given a rooted tree with n nodes numbered from 0 to n - 1. The tree structure is represented by the parent array, where parent[i] is the parent of node i. Since node 0 is the root, its parent is -1.
Difficulty: š” Medium
Topics: Array, Hash Table, String, Tree, Depth-First Search
Solution
Problem Understanding
We are given a rooted tree with n nodes numbered from 0 to n - 1. The tree structure is represented by the parent array, where parent[i] is the parent of node i. Since node 0 is the root, its parent is -1.
Each node also has a character assigned to it through the string s, where s[i] is the character for node i.
The problem performs a simultaneous restructuring operation on the tree. For every node x from 1 to n - 1, we search upward through its ancestors and find the closest ancestor y such that:
yis an ancestor ofxs[y] == s[x]
If such an ancestor exists, node x disconnects from its current parent and reconnects directly under y.
The important detail is that all changes happen simultaneously. This means every node decides its new parent using the original tree structure, not the partially modified tree.
After all parent changes are completed, we must compute the subtree size for every node in the final tree. The subtree size of a node includes the node itself and all descendants below it.
The constraints are important:
ncan be as large as10^5- The input is guaranteed to form a valid tree
- Characters are lowercase English letters
These constraints immediately rule out expensive algorithms such as repeatedly traversing ancestors for every node in a naive way.
Several edge cases are especially important:
- Nodes may not have any matching ancestor
- Multiple nodes may simultaneously change parents
- The root node never changes parent
- Chains of repeated characters can create deep ancestor relationships
- The final tree structure may differ significantly from the original tree
Because the operation is simultaneous, we must be careful not to update the tree incrementally while determining new parents.
Approaches
Brute Force Approach
A straightforward approach is to process every node independently.
For each node x, we repeatedly move upward through its ancestors using the parent array until we either:
- find the closest ancestor with the same character, or
- reach the root
Once we determine the new parent for every node, we construct the final tree and run a DFS to compute subtree sizes.
This approach is correct because it directly follows the problem definition. However, its performance is too slow in the worst case.
Consider a degenerate tree shaped like a linked list. For every node, we may need to traverse all ancestors. This leads to:
O(n)work per nodeO(n^2)total complexity
With n = 10^5, this is infeasible.
Optimal Approach
The key observation is that while traversing the tree with DFS, we already know the current ancestor path.
Instead of searching upward repeatedly for every node, we maintain the most recent ancestor for each character during DFS traversal.
We use:
- an adjacency list for the tree
- a stack for each character
When visiting a node:
- the top of the stack for its character represents the closest ancestor with the same character
- we record that node as the new parent
- we push the current node onto the stack
- after processing children, we pop it back out
This transforms repeated ancestor searches into constant-time lookups.
After determining all final parents, we build the final tree and compute subtree sizes with another DFS.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | For each node, scan all ancestors upward |
| Optimal | O(n) | O(n) | DFS with character stacks enables constant-time ancestor lookup |
Algorithm Walkthrough
- Build the original tree using an adjacency list.
Since the input provides only parent references, we first convert it into a child-based adjacency list. This allows efficient DFS traversal from the root downward.
2. Create an array new_parent.
Initially, every node keeps its original parent. During DFS, we may update this value if we discover a closer matching ancestor. 3. Maintain stacks for characters.
We create a dictionary where each character maps to a stack of ancestor nodes currently active in the DFS path.
For example:
char_stack['a'] = [0, 3, 8]
means nodes 0, 3, and 8 are ancestors with character 'a', and node 8 is the closest one.
4. Perform DFS on the original tree.
For each node:
- Check whether its character stack already contains an ancestor.
- If yes, the top element is the closest matching ancestor.
- Set that node as the new parent.
- Push the current node onto its character stack.
- Recursively process children.
- Pop the node after finishing recursion.
This works because DFS naturally maintains the current ancestor path. 5. Build the final tree.
After determining all new parents, construct a new adjacency list representing the modified tree. 6. Compute subtree sizes.
Run another DFS on the final tree.
For each node:
- Start with size
1 - Add the subtree sizes of all children
- Store the result
- Return the subtree size array.
Why it works
During DFS, the character stack for a node's character always contains exactly the ancestors on the current root-to-node path with that character. Because DFS explores downward before backtracking, the top of the stack is always the closest matching ancestor.
Since every node determines its new parent using the original DFS traversal before any structural modifications are applied, the simultaneous-update requirement is satisfied.
The second DFS correctly computes subtree sizes because the final adjacency list exactly represents the modified tree.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
n = len(parent)
# Build original tree
original_tree = [[] for _ in range(n)]
for node in range(1, n):
original_tree[parent[node]].append(node)
# Store final parent after modifications
new_parent = parent[:]
# Character stacks
char_stack = defaultdict(list)
def dfs_find(node: int) -> None:
char = s[node]
# Closest ancestor with same character
if char_stack[char]:
new_parent[node] = char_stack[char][-1]
# Add current node to stack
char_stack[char].append(node)
# Process children
for child in original_tree[node]:
dfs_find(child)
# Backtrack
char_stack[char].pop()
# Root cannot change parent
char_stack[s[0]].append(0)
for child in original_tree[0]:
dfs_find(child)
# Build final tree
final_tree = [[] for _ in range(n)]
for node in range(1, n):
final_tree[new_parent[node]].append(node)
# Compute subtree sizes
answer = [0] * n
def dfs_size(node: int) -> int:
size = 1
for child in final_tree[node]:
size += dfs_size(child)
answer[node] = size
return size
dfs_size(0)
return answer
The implementation begins by constructing the original tree as an adjacency list. This allows efficient DFS traversal from the root downward.
The new_parent array stores the final parent relationship after all simultaneous modifications. Initially, every node keeps its original parent.
The char_stack dictionary is the core optimization. Each character maintains a stack of currently active ancestors during DFS traversal. Because DFS always explores along a single root-to-current-node path, the stack top represents the nearest ancestor with that character.
The dfs_find function determines the final parent for every node. Before descending into children, the current node is pushed onto its character stack. After recursion finishes, the node is removed to restore the correct ancestor state during backtracking.
Once all final parents are known, the code builds the final tree structure and runs a second DFS to compute subtree sizes.
The dfs_size function uses standard postorder DFS. Each node contributes 1 for itself plus the sizes of all child subtrees.
Go Solution
package main
func findSubtreeSizes(parent []int, s string) []int {
n := len(parent)
// Build original tree
originalTree := make([][]int, n)
for node := 1; node < n; node++ {
p := parent[node]
originalTree[p] = append(originalTree[p], node)
}
// Store final parent
newParent := make([]int, n)
copy(newParent, parent)
// Character stacks
charStacks := make(map[byte][]int)
var dfsFind func(int)
dfsFind = func(node int) {
ch := s[node]
// Closest ancestor with same character
if len(charStacks[ch]) > 0 {
stack := charStacks[ch]
newParent[node] = stack[len(stack)-1]
}
// Push current node
charStacks[ch] = append(charStacks[ch], node)
// Process children
for _, child := range originalTree[node] {
dfsFind(child)
}
// Pop current node
stack := charStacks[ch]
charStacks[ch] = stack[:len(stack)-1]
}
// Root cannot change parent
charStacks[s[0]] = append(charStacks[s[0]], 0)
for _, child := range originalTree[0] {
dfsFind(child)
}
// Build final tree
finalTree := make([][]int, n)
for node := 1; node < n; node++ {
p := newParent[node]
finalTree[p] = append(finalTree[p], node)
}
// Compute subtree sizes
answer := make([]int, n)
var dfsSize func(int) int
dfsSize = func(node int) int {
size := 1
for _, child := range finalTree[node] {
size += dfsSize(child)
}
answer[node] = size
return size
}
dfsSize(0)
return answer
}
The Go implementation closely mirrors the Python solution. Since Go strings are byte-indexable for ASCII lowercase letters, characters are handled using byte.
The character stacks are implemented as map[byte][]int, where each slice behaves like a stack. Push operations use append, and pop operations use slice truncation.
Unlike Python lists, Go slices require explicit copying when duplicating the parent array, so copy(newParent, parent) is used.
No integer overflow concerns exist because subtree sizes are at most 10^5, which easily fits inside Go's int.
Worked Examples
Example 1
parent = [-1,0,0,1,1,1]
s = "abaabc"
Original Tree
0(a)
āāā 1(b)
ā āāā 3(a)
ā āāā 4(b)
ā āāā 5(c)
āāā 2(a)
DFS Traversal
| Current Node | Character | Stack Before | Closest Matching Ancestor | New Parent |
|---|---|---|---|---|
| 1 | b | [] | None | 0 |
| 3 | a | [0] | 0 | 0 |
| 4 | b | [1] | 1 | 1 |
| 5 | c | [] | None | 1 |
| 2 | a | [0] | 0 | 0 |
Final Tree
0
āāā 1
ā āāā 4
ā āāā 5
āāā 2
āāā 3
Subtree Sizes
| Node | Subtree Nodes | Size |
|---|---|---|
| 0 | [0,1,2,3,4,5] | 6 |
| 1 | [1,4,5] | 3 |
| 2 | [2] | 1 |
| 3 | [3] | 1 |
| 4 | [4] | 1 |
| 5 | [5] | 1 |
Final answer:
[6,3,1,1,1,1]
Example 2
parent = [-1,0,4,0,1]
s = "abbba"
Original Tree
0(a)
āāā 1(b)
ā āāā 4(a)
ā āāā 2(b)
āāā 3(b)
DFS Traversal
| Current Node | Character | Stack Before | Closest Matching Ancestor | New Parent |
|---|---|---|---|---|
| 1 | b | [] | None | 0 |
| 4 | a | [0] | 0 | 0 |
| 2 | b | [1] | 1 | 1 |
| 3 | b | [] | None | 0 |
Final Tree
0
āāā 1
ā āāā 2
āāā 3
āāā 4
Subtree Sizes
| Node | Subtree Nodes | Size |
|---|---|---|
| 0 | [0,1,2,3,4] | 5 |
| 1 | [1,2] | 2 |
| 2 | [2] | 1 |
| 3 | [3] | 1 |
| 4 | [4] | 1 |
Final answer:
[5,2,1,1,1]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each node is visited a constant number of times |
| Space | O(n) | Adjacency lists, stacks, recursion, and result arrays |
The algorithm performs two DFS traversals:
- one to determine new parents
- one to compute subtree sizes
Each node is pushed and popped from a character stack exactly once, so stack operations contribute only linear total work.
The adjacency lists together store exactly n - 1 edges, which is also linear space.
Test Cases
from typing import List
from collections import defaultdict
class Solution:
def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
n = len(parent)
original_tree = [[] for _ in range(n)]
for node in range(1, n):
original_tree[parent[node]].append(node)
new_parent = parent[:]
char_stack = defaultdict(list)
def dfs_find(node: int) -> None:
char = s[node]
if char_stack[char]:
new_parent[node] = char_stack[char][-1]
char_stack[char].append(node)
for child in original_tree[node]:
dfs_find(child)
char_stack[char].pop()
char_stack[s[0]].append(0)
for child in original_tree[0]:
dfs_find(child)
final_tree = [[] for _ in range(n)]
for node in range(1, n):
final_tree[new_parent[node]].append(node)
answer = [0] * n
def dfs_size(node: int) -> int:
size = 1
for child in final_tree[node]:
size += dfs_size(child)
answer[node] = size
return size
dfs_size(0)
return answer
sol = Solution()
# Example 1
assert sol.findSubtreeSizes(
[-1,0,0,1,1,1],
"abaabc"
) == [6,3,1,1,1,1] # basic restructuring
# Example 2
assert sol.findSubtreeSizes(
[-1,0,4,0,1],
"abbba"
) == [5,2,1,1,1] # simultaneous updates
# Single node tree
assert sol.findSubtreeSizes(
[-1],
"a"
) == [1] # minimum size input
# No matching ancestors
assert sol.findSubtreeSizes(
[-1,0,1,2],
"abcd"
) == [4,3,2,1] # tree unchanged
# All characters identical
assert sol.findSubtreeSizes(
[-1,0,1,2,3],
"aaaaa"
) == [5,4,3,2,1] # every node keeps closest parent
# Deep chain with restructuring
assert sol.findSubtreeSizes(
[-1,0,1,2,3],
"ababa"
) == [5,2,2,1,1] # alternating ancestor matches
# Star shaped tree
assert sol.findSubtreeSizes(
[-1,0,0,0,0],
"abcda"
) == [5,1,1,1,1] # direct root matching
# Multiple independent subtrees
assert sol.findSubtreeSizes(
[-1,0,0,1,1,2,2],
"abacaba"
) == [7,3,3,1,1,1,1] # multiple character groups
print("All tests passed.")
| Test | Why |
|---|---|
| Example 1 | Verifies standard restructuring |
| Example 2 | Verifies simultaneous parent updates |
| Single node tree | Validates minimum constraints |
| No matching ancestors | Ensures unchanged tree works correctly |
| All characters identical | Tests repeated ancestor matching |
| Deep chain with restructuring | Validates closest ancestor selection |
| Star shaped tree | Tests shallow tree structure |
| Multiple independent subtrees | Verifies correctness across branches |
Edge Cases
Single Node Tree
A tree with only the root node is the smallest valid input. Since the root has no parent and no ancestors, no restructuring occurs.
This case can expose bugs involving empty adjacency lists or incorrect DFS initialization. The implementation handles it naturally because both DFS traversals simply process node 0 once.
No Matching Ancestors
If every node has a unique character compared to all ancestors, then no node changes parent.
A buggy implementation might accidentally assign incorrect parents or fail to preserve the original structure. This solution initializes new_parent as a copy of parent, ensuring nodes remain unchanged unless a valid matching ancestor is found.
Multiple Matching Ancestors
A node may have several ancestors with the same character. The problem specifically requires the closest one.
For example:
0(a)
|
1(a)
|
2(a)
|
3(a)
Node 3 must connect to node 2, not node 0.
This is exactly why the stack structure is important. The top of the stack always represents the nearest active ancestor with that character.
Simultaneous Updates
Parent changes must all happen simultaneously using the original tree.
A common mistake is modifying the tree immediately while processing nodes. That would incorrectly influence later ancestor searches.
This implementation avoids the issue entirely by:
- first computing all new parents using the original tree
- only afterward constructing the final tree
Therefore, every node makes its decision against the unchanged original structure.