LeetCode 3331 - Find Subtree Sizes After Changes

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Tree, Depth-First Search

Solution

Problem Understanding

We are given a rooted tree with n nodes numbered from 0 to n - 1. The tree structure is represented by the parent array, where parent[i] is the parent of node i. Since node 0 is the root, its parent is -1.

Each node also has a character assigned to it through the string s, where s[i] is the character for node i.

The problem performs a simultaneous restructuring operation on the tree. For every node x from 1 to n - 1, we search upward through its ancestors and find the closest ancestor y such that:

y is an ancestor of x
s[y] == s[x]

If such an ancestor exists, node x disconnects from its current parent and reconnects directly under y.

The important detail is that all changes happen simultaneously. This means every node decides its new parent using the original tree structure, not the partially modified tree.

After all parent changes are completed, we must compute the subtree size for every node in the final tree. The subtree size of a node includes the node itself and all descendants below it.

The constraints are important:

n can be as large as 10^5
The input is guaranteed to form a valid tree
Characters are lowercase English letters

These constraints immediately rule out expensive algorithms such as repeatedly traversing ancestors for every node in a naive way.

Several edge cases are especially important:

Nodes may not have any matching ancestor
Multiple nodes may simultaneously change parents
The root node never changes parent
Chains of repeated characters can create deep ancestor relationships
The final tree structure may differ significantly from the original tree

Because the operation is simultaneous, we must be careful not to update the tree incrementally while determining new parents.

Approaches

Brute Force Approach

A straightforward approach is to process every node independently.

For each node x, we repeatedly move upward through its ancestors using the parent array until we either:

find the closest ancestor with the same character, or
reach the root

Once we determine the new parent for every node, we construct the final tree and run a DFS to compute subtree sizes.

This approach is correct because it directly follows the problem definition. However, its performance is too slow in the worst case.

Consider a degenerate tree shaped like a linked list. For every node, we may need to traverse all ancestors. This leads to:

O(n) work per node
O(n^2) total complexity

With n = 10^5, this is infeasible.

Optimal Approach

The key observation is that while traversing the tree with DFS, we already know the current ancestor path.

Instead of searching upward repeatedly for every node, we maintain the most recent ancestor for each character during DFS traversal.

We use:

an adjacency list for the tree
a stack for each character

When visiting a node:

the top of the stack for its character represents the closest ancestor with the same character
we record that node as the new parent
we push the current node onto the stack
after processing children, we pop it back out

This transforms repeated ancestor searches into constant-time lookups.

After determining all final parents, we build the final tree and compute subtree sizes with another DFS.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(n)	For each node, scan all ancestors upward
Optimal	O(n)	O(n)	DFS with character stacks enables constant-time ancestor lookup

Algorithm Walkthrough

Build the original tree using an adjacency list.

Since the input provides only parent references, we first convert it into a child-based adjacency list. This allows efficient DFS traversal from the root downward. 2. Create an array new_parent.

Initially, every node keeps its original parent. During DFS, we may update this value if we discover a closer matching ancestor. 3. Maintain stacks for characters.

We create a dictionary where each character maps to a stack of ancestor nodes currently active in the DFS path.

For example:

char_stack['a'] = [0, 3, 8]

means nodes 0, 3, and 8 are ancestors with character 'a', and node 8 is the closest one. 4. Perform DFS on the original tree.

For each node:

Check whether its character stack already contains an ancestor.
If yes, the top element is the closest matching ancestor.
Set that node as the new parent.
Push the current node onto its character stack.
Recursively process children.
Pop the node after finishing recursion.

This works because DFS naturally maintains the current ancestor path. 5. Build the final tree.

After determining all new parents, construct a new adjacency list representing the modified tree. 6. Compute subtree sizes.

Run another DFS on the final tree.

For each node:

Start with size 1
Add the subtree sizes of all children
Store the result

Return the subtree size array.

Why it works

During DFS, the character stack for a node's character always contains exactly the ancestors on the current root-to-node path with that character. Because DFS explores downward before backtracking, the top of the stack is always the closest matching ancestor.

Since every node determines its new parent using the original DFS traversal before any structural modifications are applied, the simultaneous-update requirement is satisfied.

The second DFS correctly computes subtree sizes because the final adjacency list exactly represents the modified tree.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
        n = len(parent)

        # Build original tree
        original_tree = [[] for _ in range(n)]

        for node in range(1, n):
            original_tree[parent[node]].append(node)

        # Store final parent after modifications
        new_parent = parent[:]

        # Character stacks
        char_stack = defaultdict(list)

        def dfs_find(node: int) -> None:
            char = s[node]

            # Closest ancestor with same character
            if char_stack[char]:
                new_parent[node] = char_stack[char][-1]

            # Add current node to stack
            char_stack[char].append(node)

            # Process children
            for child in original_tree[node]:
                dfs_find(child)

            # Backtrack
            char_stack[char].pop()

        # Root cannot change parent
        char_stack[s[0]].append(0)

        for child in original_tree[0]:
            dfs_find(child)

        # Build final tree
        final_tree = [[] for _ in range(n)]

        for node in range(1, n):
            final_tree[new_parent[node]].append(node)

        # Compute subtree sizes
        answer = [0] * n

        def dfs_size(node: int) -> int:
            size = 1

            for child in final_tree[node]:
                size += dfs_size(child)

            answer[node] = size
            return size

        dfs_size(0)

        return answer

The implementation begins by constructing the original tree as an adjacency list. This allows efficient DFS traversal from the root downward.

The new_parent array stores the final parent relationship after all simultaneous modifications. Initially, every node keeps its original parent.

The char_stack dictionary is the core optimization. Each character maintains a stack of currently active ancestors during DFS traversal. Because DFS always explores along a single root-to-current-node path, the stack top represents the nearest ancestor with that character.

The dfs_find function determines the final parent for every node. Before descending into children, the current node is pushed onto its character stack. After recursion finishes, the node is removed to restore the correct ancestor state during backtracking.

Once all final parents are known, the code builds the final tree structure and runs a second DFS to compute subtree sizes.

The dfs_size function uses standard postorder DFS. Each node contributes 1 for itself plus the sizes of all child subtrees.

Go Solution

package main

func findSubtreeSizes(parent []int, s string) []int {
	n := len(parent)

	// Build original tree
	originalTree := make([][]int, n)

	for node := 1; node < n; node++ {
		p := parent[node]
		originalTree[p] = append(originalTree[p], node)
	}

	// Store final parent
	newParent := make([]int, n)
	copy(newParent, parent)

	// Character stacks
	charStacks := make(map[byte][]int)

	var dfsFind func(int)

	dfsFind = func(node int) {
		ch := s[node]

		// Closest ancestor with same character
		if len(charStacks[ch]) > 0 {
			stack := charStacks[ch]
			newParent[node] = stack[len(stack)-1]
		}

		// Push current node
		charStacks[ch] = append(charStacks[ch], node)

		// Process children
		for _, child := range originalTree[node] {
			dfsFind(child)
		}

		// Pop current node
		stack := charStacks[ch]
		charStacks[ch] = stack[:len(stack)-1]
	}

	// Root cannot change parent
	charStacks[s[0]] = append(charStacks[s[0]], 0)

	for _, child := range originalTree[0] {
		dfsFind(child)
	}

	// Build final tree
	finalTree := make([][]int, n)

	for node := 1; node < n; node++ {
		p := newParent[node]
		finalTree[p] = append(finalTree[p], node)
	}

	// Compute subtree sizes
	answer := make([]int, n)

	var dfsSize func(int) int

	dfsSize = func(node int) int {
		size := 1

		for _, child := range finalTree[node] {
			size += dfsSize(child)
		}

		answer[node] = size
		return size
	}

	dfsSize(0)

	return answer
}

The Go implementation closely mirrors the Python solution. Since Go strings are byte-indexable for ASCII lowercase letters, characters are handled using byte.

The character stacks are implemented as map[byte][]int, where each slice behaves like a stack. Push operations use append, and pop operations use slice truncation.

Unlike Python lists, Go slices require explicit copying when duplicating the parent array, so copy(newParent, parent) is used.

No integer overflow concerns exist because subtree sizes are at most 10^5, which easily fits inside Go's int.

Worked Examples

Example 1

parent = [-1,0,0,1,1,1]
s = "abaabc"

Original Tree

0(a)
├── 1(b)
│   ├── 3(a)
│   ├── 4(b)
│   └── 5(c)
└── 2(a)

DFS Traversal

Current Node	Character	Stack Before	Closest Matching Ancestor	New Parent
1	b	[]	None	0
3	a	[0]	0	0
4	b	[1]	1	1
5	c	[]	None	1
2	a	[0]	0	0

Final Tree

0
├── 1
│   ├── 4
│   └── 5
├── 2
└── 3

Subtree Sizes

Node	Subtree Nodes	Size
0	[0,1,2,3,4,5]	6
1	[1,4,5]	3
2	[2]	1
3	[3]	1
4	[4]	1
5	[5]	1

Final answer:

[6,3,1,1,1,1]

Example 2

parent = [-1,0,4,0,1]
s = "abbba"

Original Tree

0(a)
├── 1(b)
│   └── 4(a)
│       └── 2(b)
└── 3(b)

DFS Traversal

Current Node	Character	Stack Before	Closest Matching Ancestor	New Parent
1	b	[]	None	0
4	a	[0]	0	0
2	b	[1]	1	1
3	b	[]	None	0

Final Tree

0
├── 1
│   └── 2
├── 3
└── 4

Subtree Sizes

Node	Subtree Nodes	Size
0	[0,1,2,3,4]	5
1	[1,2]	2
2	[2]	1
3	[3]	1
4	[4]	1

Final answer:

[5,2,1,1,1]

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each node is visited a constant number of times
Space	O(n)	Adjacency lists, stacks, recursion, and result arrays

The algorithm performs two DFS traversals:

one to determine new parents
one to compute subtree sizes

Each node is pushed and popped from a character stack exactly once, so stack operations contribute only linear total work.

The adjacency lists together store exactly n - 1 edges, which is also linear space.

Test Cases

from typing import List
from collections import defaultdict

class Solution:
    def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
        n = len(parent)

        original_tree = [[] for _ in range(n)]

        for node in range(1, n):
            original_tree[parent[node]].append(node)

        new_parent = parent[:]

        char_stack = defaultdict(list)

        def dfs_find(node: int) -> None:
            char = s[node]

            if char_stack[char]:
                new_parent[node] = char_stack[char][-1]

            char_stack[char].append(node)

            for child in original_tree[node]:
                dfs_find(child)

            char_stack[char].pop()

        char_stack[s[0]].append(0)

        for child in original_tree[0]:
            dfs_find(child)

        final_tree = [[] for _ in range(n)]

        for node in range(1, n):
            final_tree[new_parent[node]].append(node)

        answer = [0] * n

        def dfs_size(node: int) -> int:
            size = 1

            for child in final_tree[node]:
                size += dfs_size(child)

            answer[node] = size
            return size

        dfs_size(0)

        return answer

sol = Solution()

# Example 1
assert sol.findSubtreeSizes(
    [-1,0,0,1,1,1],
    "abaabc"
) == [6,3,1,1,1,1]  # basic restructuring

# Example 2
assert sol.findSubtreeSizes(
    [-1,0,4,0,1],
    "abbba"
) == [5,2,1,1,1]  # simultaneous updates

# Single node tree
assert sol.findSubtreeSizes(
    [-1],
    "a"
) == [1]  # minimum size input

# No matching ancestors
assert sol.findSubtreeSizes(
    [-1,0,1,2],
    "abcd"
) == [4,3,2,1]  # tree unchanged

# All characters identical
assert sol.findSubtreeSizes(
    [-1,0,1,2,3],
    "aaaaa"
) == [5,4,3,2,1]  # every node keeps closest parent

# Deep chain with restructuring
assert sol.findSubtreeSizes(
    [-1,0,1,2,3],
    "ababa"
) == [5,2,2,1,1]  # alternating ancestor matches

# Star shaped tree
assert sol.findSubtreeSizes(
    [-1,0,0,0,0],
    "abcda"
) == [5,1,1,1,1]  # direct root matching

# Multiple independent subtrees
assert sol.findSubtreeSizes(
    [-1,0,0,1,1,2,2],
    "abacaba"
) == [7,3,3,1,1,1,1]  # multiple character groups

print("All tests passed.")

Test	Why
Example 1	Verifies standard restructuring
Example 2	Verifies simultaneous parent updates
Single node tree	Validates minimum constraints
No matching ancestors	Ensures unchanged tree works correctly
All characters identical	Tests repeated ancestor matching
Deep chain with restructuring	Validates closest ancestor selection
Star shaped tree	Tests shallow tree structure
Multiple independent subtrees	Verifies correctness across branches

Edge Cases

Single Node Tree

A tree with only the root node is the smallest valid input. Since the root has no parent and no ancestors, no restructuring occurs.

This case can expose bugs involving empty adjacency lists or incorrect DFS initialization. The implementation handles it naturally because both DFS traversals simply process node 0 once.

No Matching Ancestors

If every node has a unique character compared to all ancestors, then no node changes parent.

A buggy implementation might accidentally assign incorrect parents or fail to preserve the original structure. This solution initializes new_parent as a copy of parent, ensuring nodes remain unchanged unless a valid matching ancestor is found.

Multiple Matching Ancestors

A node may have several ancestors with the same character. The problem specifically requires the closest one.

For example:

0(a)
|
1(a)
|
2(a)
|
3(a)

Node 3 must connect to node 2, not node 0.

This is exactly why the stack structure is important. The top of the stack always represents the nearest active ancestor with that character.

Simultaneous Updates

Parent changes must all happen simultaneously using the original tree.

A common mistake is modifying the tree immediately while processing nodes. That would incorrectly influence later ancestor searches.

This implementation avoids the issue entirely by:

first computing all new parents using the original tree
only afterward constructing the final tree

Therefore, every node makes its decision against the unchanged original structure.