LeetCode 2458 - Height of Binary Tree After Subtree Removal Queries

Difficulty: 🔴 Hard
Topics: Array, Tree, Depth-First Search, Breadth-First Search, Binary Tree

Solution

Problem Understanding

This problem asks us to answer multiple independent queries on a binary tree. For each query, we temporarily remove an entire subtree rooted at a specific node, then compute the height of the remaining tree.

The binary tree contains n nodes, and every node has a unique value from 1 to n. The queries array contains node values. For each query value q, we remove the subtree rooted at node q, meaning that node and all of its descendants disappear from the tree.

After removing the subtree, we must determine the height of the remaining tree. The height is defined as the number of edges in the longest path from the root to any remaining node.

An important detail is that all queries are independent. After finishing one query, the tree returns to its original form before processing the next query. We are not permanently modifying the tree.

The constraints are large:

n can be as large as 100000
queries.length can be up to 10000

These limits immediately tell us that recomputing the tree height from scratch for every query will be too slow. A naive solution that traverses the tree for each query could easily become O(n * m), which may reach 10^9 operations in the worst case.

Another important observation is that queries never remove the root itself. Therefore, the resulting tree is always non-empty.

There are several edge cases worth considering:

Removing a leaf node should usually not affect the overall height unless that leaf lies on the unique deepest path.
Removing an internal node may disconnect a very large subtree, dramatically reducing the height.
The deepest path may shift to an entirely different branch after removal.
Highly skewed trees behave differently from balanced trees because almost every node lies on the maximum-depth path.

The core challenge is efficiently determining the height of the remaining tree after excluding each possible subtree.

Approaches

Brute Force Approach

The most direct approach is to process each query independently.

For every query:

Pretend the subtree rooted at the queried node does not exist.
Run a DFS or BFS from the root.
Compute the maximum depth among all remaining nodes.
Return that depth as the resulting height.

This works because it explicitly reconstructs the answer for every query.

However, the performance is poor. Each query may require traversing almost the entire tree, which costs O(n). Since there can be up to 10^4 queries, the total complexity becomes O(n * m).

With n = 100000 and m = 10000, this approach is far too slow.

Key Insight for the Optimal Solution

Instead of recomputing the tree height after every removal, we can precompute enough information to answer every query in constant time.

The key observation is:

When removing a subtree rooted at node u, the remaining tree height must come from some path that does not enter u's subtree.

So for every node, we want to know:

What is the maximum height achievable without using this node's subtree?

This becomes a rerooting-style tree DP problem.

We compute two important values:

subtree_height[node]

The height of the subtree rooted at that node.
Computed with a postorder DFS.

answer[node]

The height of the remaining tree after removing that node's subtree.
Computed with a second DFS.

The second DFS propagates information from parent to child. When moving from a parent to one child, we calculate the best possible height that avoids that child's subtree. This can come from:

Paths above the parent
Paths through the sibling subtree

By carefully propagating these values, every query can be answered in O(1) time after preprocessing.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n * m)	O(n)	Recompute tree height from scratch for every query
Optimal	O(n + m)	O(n)	Precompute subtree heights and rerooted answers

Algorithm Walkthrough

Step 1: Compute subtree heights

We first run a DFS from the root to compute:

subtree_height[node]

This represents the height of the subtree rooted at node.

For a node:

subtree_height[node] =
1 + max(left subtree height, right subtree height)

Since height is measured in edges, we use -1 for null nodes so that leaf nodes correctly get height 0.

This DFS is postorder because we must know the children's heights before computing the parent's height.

Step 2: Define what answer[node] means

For every node:

answer[node]

stores the height of the tree after removing that node's subtree.

This is exactly what the queries ask for.

Step 3: Run a second DFS to propagate external heights

Now we perform another DFS from the root.

For every node, we pass down:

best_without_current_subtree

This value represents the best possible tree height obtainable without entering the current node's subtree.

When processing a child, two possible sources contribute to the new height:

A path coming from ancestors
A path going through the sibling subtree

Suppose we are at parent p with depth d.

If we move into the left child, then the best alternative path may come from:

Anything already available above p
The right subtree of p

The candidate height using the sibling subtree becomes:

depth(parent) + 1 + subtree_height[sibling]

We take the maximum of all available alternatives and pass it to the child.

Step 4: Store answers

When visiting node u, the propagated value already represents the height after removing u's subtree.

So:

answer[u] = propagated_value

Step 5: Answer queries

After preprocessing, every query becomes:

return answer[q]

This gives constant-time query resolution.

Why it works

The algorithm works because every root-to-node path either:

Enters the removed subtree, or
Avoids it completely

When removing a subtree rooted at node u, every valid remaining path must avoid u.

The second DFS correctly propagates the best possible path length that excludes each subtree by considering all alternative routes:

Through ancestors
Through sibling subtrees

Thus, answer[u] always equals the height of the remaining tree after removing u's subtree.

Python Solution

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right

from typing import Optional, List
from collections import defaultdict

class Solution:
    def treeQueries(
        self,
        root: Optional[TreeNode],
        queries: List[int]
    ) -> List[int]:

        subtree_height = {}
        answer = {}

        def compute_heights(node: Optional[TreeNode]) -> int:
            if not node:
                return -1

            left_height = compute_heights(node.left)
            right_height = compute_heights(node.right)

            subtree_height[node.val] = 1 + max(left_height, right_height)

            return subtree_height[node.val]

        compute_heights(root)

        def dfs(
            node: Optional[TreeNode],
            depth: int,
            best_without_subtree: int
        ) -> None:

            if not node:
                return

            answer[node.val] = best_without_subtree

            left_subtree_height = (
                subtree_height[node.left.val]
                if node.left else -1
            )

            right_subtree_height = (
                subtree_height[node.right.val]
                if node.right else -1
            )

            if node.left:
                candidate = depth + 1 + right_subtree_height
                new_best = max(best_without_subtree, candidate)

                dfs(node.left, depth + 1, new_best)

            if node.right:
                candidate = depth + 1 + left_subtree_height
                new_best = max(best_without_subtree, candidate)

                dfs(node.right, depth + 1, new_best)

        dfs(root, 0, 0)

        return [answer[q] for q in queries]

The implementation is divided into two DFS traversals.

The first DFS computes the height of every subtree. Since subtree height depends on children's heights, this traversal is naturally postorder. Null nodes return -1, allowing leaf nodes to receive height 0.

The subtree_height dictionary stores the computed height for every node value.

The second DFS performs the rerooting process. The parameter best_without_subtree represents the maximum tree height obtainable without entering the current node's subtree.

At each node, we immediately store this value into answer[node.val], because this is exactly the height remaining after removing that subtree.

When descending into a child, we compute an alternative path using the sibling subtree. For example, when exploring the left child, the longest alternative path may travel through the right subtree instead.

The maximum of:

previously available heights
sibling-based heights

is propagated downward.

Finally, queries are answered in constant time by directly looking up precomputed results.

Go Solution

/**
 * Definition for a binary tree node.
 * type TreeNode struct {
 *     Val int
 *     Left *TreeNode
 *     Right *TreeNode
 * }
 */
func treeQueries(root *TreeNode, queries []int) []int {
    subtreeHeight := make(map[int]int)
    answer := make(map[int]int)

    var computeHeights func(node *TreeNode) int

    computeHeights = func(node *TreeNode) int {
        if node == nil {
            return -1
        }

        leftHeight := computeHeights(node.Left)
        rightHeight := computeHeights(node.Right)

        if leftHeight > rightHeight {
            subtreeHeight[node.Val] = leftHeight + 1
        } else {
            subtreeHeight[node.Val] = rightHeight + 1
        }

        return subtreeHeight[node.Val]
    }

    computeHeights(root)

    var dfs func(node *TreeNode, depth int, bestWithoutSubtree int)

    dfs = func(node *TreeNode, depth int, bestWithoutSubtree int) {
        if node == nil {
            return
        }

        answer[node.Val] = bestWithoutSubtree

        leftSubtreeHeight := -1
        rightSubtreeHeight := -1

        if node.Left != nil {
            leftSubtreeHeight = subtreeHeight[node.Left.Val]
        }

        if node.Right != nil {
            rightSubtreeHeight = subtreeHeight[node.Right.Val]
        }

        if node.Left != nil {
            candidate := depth + 1 + rightSubtreeHeight

            newBest := bestWithoutSubtree
            if candidate > newBest {
                newBest = candidate
            }

            dfs(node.Left, depth+1, newBest)
        }

        if node.Right != nil {
            candidate := depth + 1 + leftSubtreeHeight

            newBest := bestWithoutSubtree
            if candidate > newBest {
                newBest = candidate
            }

            dfs(node.Right, depth+1, newBest)
        }
    }

    dfs(root, 0, 0)

    result := make([]int, len(queries))

    for i, q := range queries {
        result[i] = answer[q]
    }

    return result
}

The Go implementation closely mirrors the Python solution.

Maps are used instead of dictionaries for storing subtree heights and answers. Since Go does not support nested function declarations as naturally as Python, function variables are declared first and assigned afterward.

Nil checks are required before accessing child pointers. Integer overflow is not a concern because tree heights are bounded by n, which is at most 100000.

Slices are used for the final query results.

Worked Examples

Example 1

root = [1,3,4,2,null,6,5,null,null,null,null,null,7]
queries = [4]

The tree structure is:

        1
      /   \
     3     4
    /     / \
   2     6   5
                \
                 7

Step 1: Compute subtree heights

Node	Subtree Height
2	0
3	1
6	0
7	0
5	1
4	2
1	3

Step 2: Compute propagated answers

Node	Height After Removing Subtree
1	0
3	3
2	3
4	2
6	2
5	2
7	2

Query Processing

For query 4:

answer[4] = 2

Remaining longest path:

1 -> 3 -> 2

Height = 2.

Example 2

root = [5,8,9,2,1,3,7,4,6]
queries = [3,2,4,8]

Tree:

           5
         /   \
        8     9
       / \   / \
      2   1 3   7
     / \
    4   6

Subtree heights

Node	Height
4	0
6	0
2	1
1	0
8	2
3	0
7	0
9	1
5	3

Final answers

Query	Result
3	3
2	2
4	3
8	2

For query 2, removing subtree 2 removes both 4 and 6, leaving:

5 -> 8 -> 1

Height becomes 2.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n + m)	Two DFS traversals plus constant-time query lookups
Space	O(n)	Hash maps and recursion stack

The preprocessing stage visits every node exactly twice, once for subtree heights and once for rerooting propagation. Each visit performs constant work.

After preprocessing, each query is answered in constant time using direct dictionary or map lookup.

The recursion depth may reach O(n) in a highly skewed tree.

Test Cases

# Definition for testing
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Example 1
root1 = TreeNode(1)
root1.left = TreeNode(3)
root1.right = TreeNode(4)
root1.left.left = TreeNode(2)
root1.right.left = TreeNode(6)
root1.right.right = TreeNode(5)
root1.right.right.right = TreeNode(7)

assert Solution().treeQueries(root1, [4]) == [2]  # provided example

# Example 2
root2 = TreeNode(5)
root2.left = TreeNode(8)
root2.right = TreeNode(9)

root2.left.left = TreeNode(2)
root2.left.right = TreeNode(1)

root2.right.left = TreeNode(3)
root2.right.right = TreeNode(7)

root2.left.left.left = TreeNode(4)
root2.left.left.right = TreeNode(6)

assert Solution().treeQueries(root2, [3, 2, 4, 8]) == [3, 2, 3, 2]  # provided example

# Small balanced tree
root3 = TreeNode(1)
root3.left = TreeNode(2)
root3.right = TreeNode(3)

assert Solution().treeQueries(root3, [2, 3]) == [1, 1]  # removing either leaf

# Skewed tree
root4 = TreeNode(1)
root4.right = TreeNode(2)
root4.right.right = TreeNode(3)
root4.right.right.right = TreeNode(4)

assert Solution().treeQueries(root4, [4, 3, 2]) == [2, 1, 0]  # deep chain removals

# Removing subtree that contains deepest path
root5 = TreeNode(1)
root5.left = TreeNode(2)
root5.right = TreeNode(3)
root5.left.left = TreeNode(4)

assert Solution().treeQueries(root5, [2]) == [1]  # deepest branch removed

# Removing leaf that does not affect answer
root6 = TreeNode(1)
root6.left = TreeNode(2)
root6.right = TreeNode(3)
root6.right.right = TreeNode(4)

assert Solution().treeQueries(root6, [2]) == [2]  # other branch still deepest

# Single query on large branch
root7 = TreeNode(1)
root7.left = TreeNode(2)
root7.left.left = TreeNode(3)
root7.right = TreeNode(4)

assert Solution().treeQueries(root7, [3]) == [1]  # remove deepest leaf

Test	Why
Example 1	Validates correctness against official example
Example 2	Tests multiple independent queries
Small balanced tree	Confirms leaf removals work correctly
Skewed tree	Tests worst-case recursion structure
Removing deepest branch	Ensures height reduction is handled
Removing non-critical leaf	Verifies unaffected heights remain correct
Deep leaf removal	Tests subtree removal at maximum depth

Edge Cases

Removing a Leaf Node

A leaf node has no children, so removing its subtree removes only that single node. This case can be tricky because the tree height may or may not change depending on whether the leaf lies on the deepest path.

The implementation handles this naturally because rerooting propagation always tracks the best alternative height outside the removed subtree.

Highly Skewed Trees

A completely skewed tree behaves like a linked list. In this structure, removing nodes near the root drastically changes the height.

This case is important because recursion depth becomes large. The algorithm still works correctly because every node stores the best remaining height excluding its subtree.

Removing a Subtree Containing the Deepest Path

Sometimes the deepest path in the tree passes entirely through the removed subtree. In that situation, the new answer must come from another branch.

The rerooting DFS explicitly considers sibling subtrees and ancestor paths, ensuring the algorithm correctly switches to the next-best remaining path.

Trees with Only One Remaining Branch

After removing a subtree, the remaining tree may contain only a single chain from the root.

The algorithm correctly computes the resulting height because all propagated values represent valid root-to-node path lengths that avoid the removed subtree.