LeetCode 2458 - Height of Binary Tree After Subtree Removal Queries
This problem asks us to answer multiple independent queries on a binary tree. For each query, we temporarily remove an entire subtree rooted at a specific node, then compute the height of the remaining tree.
Difficulty: 🔴 Hard
Topics: Array, Tree, Depth-First Search, Breadth-First Search, Binary Tree
Solution
Problem Understanding
This problem asks us to answer multiple independent queries on a binary tree. For each query, we temporarily remove an entire subtree rooted at a specific node, then compute the height of the remaining tree.
The binary tree contains n nodes, and every node has a unique value from 1 to n. The queries array contains node values. For each query value q, we remove the subtree rooted at node q, meaning that node and all of its descendants disappear from the tree.
After removing the subtree, we must determine the height of the remaining tree. The height is defined as the number of edges in the longest path from the root to any remaining node.
An important detail is that all queries are independent. After finishing one query, the tree returns to its original form before processing the next query. We are not permanently modifying the tree.
The constraints are large:
ncan be as large as100000queries.lengthcan be up to10000
These limits immediately tell us that recomputing the tree height from scratch for every query will be too slow. A naive solution that traverses the tree for each query could easily become O(n * m), which may reach 10^9 operations in the worst case.
Another important observation is that queries never remove the root itself. Therefore, the resulting tree is always non-empty.
There are several edge cases worth considering:
- Removing a leaf node should usually not affect the overall height unless that leaf lies on the unique deepest path.
- Removing an internal node may disconnect a very large subtree, dramatically reducing the height.
- The deepest path may shift to an entirely different branch after removal.
- Highly skewed trees behave differently from balanced trees because almost every node lies on the maximum-depth path.
The core challenge is efficiently determining the height of the remaining tree after excluding each possible subtree.
Approaches
Brute Force Approach
The most direct approach is to process each query independently.
For every query:
- Pretend the subtree rooted at the queried node does not exist.
- Run a DFS or BFS from the root.
- Compute the maximum depth among all remaining nodes.
- Return that depth as the resulting height.
This works because it explicitly reconstructs the answer for every query.
However, the performance is poor. Each query may require traversing almost the entire tree, which costs O(n). Since there can be up to 10^4 queries, the total complexity becomes O(n * m).
With n = 100000 and m = 10000, this approach is far too slow.
Key Insight for the Optimal Solution
Instead of recomputing the tree height after every removal, we can precompute enough information to answer every query in constant time.
The key observation is:
When removing a subtree rooted at node u, the remaining tree height must come from some path that does not enter u's subtree.
So for every node, we want to know:
What is the maximum height achievable without using this node's subtree?
This becomes a rerooting-style tree DP problem.
We compute two important values:
subtree_height[node]
- The height of the subtree rooted at that node.
- Computed with a postorder DFS.
answer[node]
- The height of the remaining tree after removing that node's subtree.
- Computed with a second DFS.
The second DFS propagates information from parent to child. When moving from a parent to one child, we calculate the best possible height that avoids that child's subtree. This can come from:
- Paths above the parent
- Paths through the sibling subtree
By carefully propagating these values, every query can be answered in O(1) time after preprocessing.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * m) | O(n) | Recompute tree height from scratch for every query |
| Optimal | O(n + m) | O(n) | Precompute subtree heights and rerooted answers |
Algorithm Walkthrough
Step 1: Compute subtree heights
We first run a DFS from the root to compute:
subtree_height[node]
This represents the height of the subtree rooted at node.
For a node:
subtree_height[node] =
1 + max(left subtree height, right subtree height)
Since height is measured in edges, we use -1 for null nodes so that leaf nodes correctly get height 0.
This DFS is postorder because we must know the children's heights before computing the parent's height.
Step 2: Define what answer[node] means
For every node:
answer[node]
stores the height of the tree after removing that node's subtree.
This is exactly what the queries ask for.
Step 3: Run a second DFS to propagate external heights
Now we perform another DFS from the root.
For every node, we pass down:
best_without_current_subtree
This value represents the best possible tree height obtainable without entering the current node's subtree.
When processing a child, two possible sources contribute to the new height:
- A path coming from ancestors
- A path going through the sibling subtree
Suppose we are at parent p with depth d.
If we move into the left child, then the best alternative path may come from:
- Anything already available above
p - The right subtree of
p
The candidate height using the sibling subtree becomes:
depth(parent) + 1 + subtree_height[sibling]
We take the maximum of all available alternatives and pass it to the child.
Step 4: Store answers
When visiting node u, the propagated value already represents the height after removing u's subtree.
So:
answer[u] = propagated_value
Step 5: Answer queries
After preprocessing, every query becomes:
return answer[q]
This gives constant-time query resolution.
Why it works
The algorithm works because every root-to-node path either:
- Enters the removed subtree, or
- Avoids it completely
When removing a subtree rooted at node u, every valid remaining path must avoid u.
The second DFS correctly propagates the best possible path length that excludes each subtree by considering all alternative routes:
- Through ancestors
- Through sibling subtrees
Thus, answer[u] always equals the height of the remaining tree after removing u's subtree.
Python Solution
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, val=0, left=None, right=None):
# self.val = val
# self.left = left
# self.right = right
from typing import Optional, List
from collections import defaultdict
class Solution:
def treeQueries(
self,
root: Optional[TreeNode],
queries: List[int]
) -> List[int]:
subtree_height = {}
answer = {}
def compute_heights(node: Optional[TreeNode]) -> int:
if not node:
return -1
left_height = compute_heights(node.left)
right_height = compute_heights(node.right)
subtree_height[node.val] = 1 + max(left_height, right_height)
return subtree_height[node.val]
compute_heights(root)
def dfs(
node: Optional[TreeNode],
depth: int,
best_without_subtree: int
) -> None:
if not node:
return
answer[node.val] = best_without_subtree
left_subtree_height = (
subtree_height[node.left.val]
if node.left else -1
)
right_subtree_height = (
subtree_height[node.right.val]
if node.right else -1
)
if node.left:
candidate = depth + 1 + right_subtree_height
new_best = max(best_without_subtree, candidate)
dfs(node.left, depth + 1, new_best)
if node.right:
candidate = depth + 1 + left_subtree_height
new_best = max(best_without_subtree, candidate)
dfs(node.right, depth + 1, new_best)
dfs(root, 0, 0)
return [answer[q] for q in queries]
The implementation is divided into two DFS traversals.
The first DFS computes the height of every subtree. Since subtree height depends on children's heights, this traversal is naturally postorder. Null nodes return -1, allowing leaf nodes to receive height 0.
The subtree_height dictionary stores the computed height for every node value.
The second DFS performs the rerooting process. The parameter best_without_subtree represents the maximum tree height obtainable without entering the current node's subtree.
At each node, we immediately store this value into answer[node.val], because this is exactly the height remaining after removing that subtree.
When descending into a child, we compute an alternative path using the sibling subtree. For example, when exploring the left child, the longest alternative path may travel through the right subtree instead.
The maximum of:
- previously available heights
- sibling-based heights
is propagated downward.
Finally, queries are answered in constant time by directly looking up precomputed results.
Go Solution
/**
* Definition for a binary tree node.
* type TreeNode struct {
* Val int
* Left *TreeNode
* Right *TreeNode
* }
*/
func treeQueries(root *TreeNode, queries []int) []int {
subtreeHeight := make(map[int]int)
answer := make(map[int]int)
var computeHeights func(node *TreeNode) int
computeHeights = func(node *TreeNode) int {
if node == nil {
return -1
}
leftHeight := computeHeights(node.Left)
rightHeight := computeHeights(node.Right)
if leftHeight > rightHeight {
subtreeHeight[node.Val] = leftHeight + 1
} else {
subtreeHeight[node.Val] = rightHeight + 1
}
return subtreeHeight[node.Val]
}
computeHeights(root)
var dfs func(node *TreeNode, depth int, bestWithoutSubtree int)
dfs = func(node *TreeNode, depth int, bestWithoutSubtree int) {
if node == nil {
return
}
answer[node.Val] = bestWithoutSubtree
leftSubtreeHeight := -1
rightSubtreeHeight := -1
if node.Left != nil {
leftSubtreeHeight = subtreeHeight[node.Left.Val]
}
if node.Right != nil {
rightSubtreeHeight = subtreeHeight[node.Right.Val]
}
if node.Left != nil {
candidate := depth + 1 + rightSubtreeHeight
newBest := bestWithoutSubtree
if candidate > newBest {
newBest = candidate
}
dfs(node.Left, depth+1, newBest)
}
if node.Right != nil {
candidate := depth + 1 + leftSubtreeHeight
newBest := bestWithoutSubtree
if candidate > newBest {
newBest = candidate
}
dfs(node.Right, depth+1, newBest)
}
}
dfs(root, 0, 0)
result := make([]int, len(queries))
for i, q := range queries {
result[i] = answer[q]
}
return result
}
The Go implementation closely mirrors the Python solution.
Maps are used instead of dictionaries for storing subtree heights and answers. Since Go does not support nested function declarations as naturally as Python, function variables are declared first and assigned afterward.
Nil checks are required before accessing child pointers. Integer overflow is not a concern because tree heights are bounded by n, which is at most 100000.
Slices are used for the final query results.
Worked Examples
Example 1
root = [1,3,4,2,null,6,5,null,null,null,null,null,7]
queries = [4]
The tree structure is:
1
/ \
3 4
/ / \
2 6 5
\
7
Step 1: Compute subtree heights
| Node | Subtree Height |
|---|---|
| 2 | 0 |
| 3 | 1 |
| 6 | 0 |
| 7 | 0 |
| 5 | 1 |
| 4 | 2 |
| 1 | 3 |
Step 2: Compute propagated answers
| Node | Height After Removing Subtree |
|---|---|
| 1 | 0 |
| 3 | 3 |
| 2 | 3 |
| 4 | 2 |
| 6 | 2 |
| 5 | 2 |
| 7 | 2 |
Query Processing
For query 4:
answer[4] = 2
Remaining longest path:
1 -> 3 -> 2
Height = 2.
Example 2
root = [5,8,9,2,1,3,7,4,6]
queries = [3,2,4,8]
Tree:
5
/ \
8 9
/ \ / \
2 1 3 7
/ \
4 6
Subtree heights
| Node | Height |
|---|---|
| 4 | 0 |
| 6 | 0 |
| 2 | 1 |
| 1 | 0 |
| 8 | 2 |
| 3 | 0 |
| 7 | 0 |
| 9 | 1 |
| 5 | 3 |
Final answers
| Query | Result |
|---|---|
| 3 | 3 |
| 2 | 2 |
| 4 | 3 |
| 8 | 2 |
For query 2, removing subtree 2 removes both 4 and 6, leaving:
5 -> 8 -> 1
Height becomes 2.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | Two DFS traversals plus constant-time query lookups |
| Space | O(n) | Hash maps and recursion stack |
The preprocessing stage visits every node exactly twice, once for subtree heights and once for rerooting propagation. Each visit performs constant work.
After preprocessing, each query is answered in constant time using direct dictionary or map lookup.
The recursion depth may reach O(n) in a highly skewed tree.
Test Cases
# Definition for testing
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
# Example 1
root1 = TreeNode(1)
root1.left = TreeNode(3)
root1.right = TreeNode(4)
root1.left.left = TreeNode(2)
root1.right.left = TreeNode(6)
root1.right.right = TreeNode(5)
root1.right.right.right = TreeNode(7)
assert Solution().treeQueries(root1, [4]) == [2] # provided example
# Example 2
root2 = TreeNode(5)
root2.left = TreeNode(8)
root2.right = TreeNode(9)
root2.left.left = TreeNode(2)
root2.left.right = TreeNode(1)
root2.right.left = TreeNode(3)
root2.right.right = TreeNode(7)
root2.left.left.left = TreeNode(4)
root2.left.left.right = TreeNode(6)
assert Solution().treeQueries(root2, [3, 2, 4, 8]) == [3, 2, 3, 2] # provided example
# Small balanced tree
root3 = TreeNode(1)
root3.left = TreeNode(2)
root3.right = TreeNode(3)
assert Solution().treeQueries(root3, [2, 3]) == [1, 1] # removing either leaf
# Skewed tree
root4 = TreeNode(1)
root4.right = TreeNode(2)
root4.right.right = TreeNode(3)
root4.right.right.right = TreeNode(4)
assert Solution().treeQueries(root4, [4, 3, 2]) == [2, 1, 0] # deep chain removals
# Removing subtree that contains deepest path
root5 = TreeNode(1)
root5.left = TreeNode(2)
root5.right = TreeNode(3)
root5.left.left = TreeNode(4)
assert Solution().treeQueries(root5, [2]) == [1] # deepest branch removed
# Removing leaf that does not affect answer
root6 = TreeNode(1)
root6.left = TreeNode(2)
root6.right = TreeNode(3)
root6.right.right = TreeNode(4)
assert Solution().treeQueries(root6, [2]) == [2] # other branch still deepest
# Single query on large branch
root7 = TreeNode(1)
root7.left = TreeNode(2)
root7.left.left = TreeNode(3)
root7.right = TreeNode(4)
assert Solution().treeQueries(root7, [3]) == [1] # remove deepest leaf
| Test | Why |
|---|---|
| Example 1 | Validates correctness against official example |
| Example 2 | Tests multiple independent queries |
| Small balanced tree | Confirms leaf removals work correctly |
| Skewed tree | Tests worst-case recursion structure |
| Removing deepest branch | Ensures height reduction is handled |
| Removing non-critical leaf | Verifies unaffected heights remain correct |
| Deep leaf removal | Tests subtree removal at maximum depth |
Edge Cases
Removing a Leaf Node
A leaf node has no children, so removing its subtree removes only that single node. This case can be tricky because the tree height may or may not change depending on whether the leaf lies on the deepest path.
The implementation handles this naturally because rerooting propagation always tracks the best alternative height outside the removed subtree.
Highly Skewed Trees
A completely skewed tree behaves like a linked list. In this structure, removing nodes near the root drastically changes the height.
This case is important because recursion depth becomes large. The algorithm still works correctly because every node stores the best remaining height excluding its subtree.
Removing a Subtree Containing the Deepest Path
Sometimes the deepest path in the tree passes entirely through the removed subtree. In that situation, the new answer must come from another branch.
The rerooting DFS explicitly considers sibling subtrees and ancestor paths, ensuring the algorithm correctly switches to the next-best remaining path.
Trees with Only One Remaining Branch
After removing a subtree, the remaining tree may contain only a single chain from the root.
The algorithm correctly computes the resulting height because all propagated values represent valid root-to-node path lengths that avoid the removed subtree.