LeetCode 333 - Largest BST Subtree

The problem gives us the root of an arbitrary binary tree and asks us to find the size of the largest subtree that is also a valid Binary Search Tree, abbreviated as BST. A subtree is any node together with all of its descendants. This means we cannot selectively ignore children.

LeetCode Problem 333

Difficulty: 🟡 Medium
Topics: Dynamic Programming, Tree, Depth-First Search, Binary Search Tree, Binary Tree

Solution

Problem Understanding

The problem gives us the root of an arbitrary binary tree and asks us to find the size of the largest subtree that is also a valid Binary Search Tree, abbreviated as BST.

A subtree is any node together with all of its descendants. This means we cannot selectively ignore children. If we choose a node as the root of a subtree, we must include every descendant underneath it.

A Binary Search Tree follows two important rules:

  1. Every value in the left subtree must be strictly smaller than the current node's value.
  2. Every value in the right subtree must be strictly greater than the current node's value.

The tree provided in the input is not guaranteed to be a BST. Some parts of the tree may satisfy BST properties while others do not. Our goal is to identify the largest connected subtree that forms a valid BST and return the number of nodes in that subtree.

The input is a standard binary tree root pointer. The output is a single integer representing the maximum number of nodes among all BST subtrees.

The constraints are important:

  • The tree may contain up to 10^4 nodes.
  • Node values range from -10^4 to 10^4.

A tree with ten thousand nodes is large enough that repeated full traversals become expensive. An O(n^2) solution could potentially time out in the worst case, especially for skewed trees. This strongly suggests that the intended solution should process each node only once, leading to an O(n) approach.

Several edge cases are important:

  • An empty tree should return 0.
  • A single node is always a valid BST of size 1.
  • A tree where no parent-child relationships satisfy BST rules still contains valid single-node BSTs.
  • Duplicate values matter because BST rules require strictly smaller and strictly greater values. Equal values invalidate the BST condition.
  • Highly skewed trees can expose inefficiencies in recursive validation approaches.

Approaches

Brute Force Approach

The most straightforward solution is to consider every node as a possible subtree root and check whether that subtree forms a valid BST.

For each node:

  1. Validate whether the subtree rooted at that node is a BST.
  2. If it is, count the number of nodes in that subtree.
  3. Track the maximum subtree size found.

To validate a BST correctly, we must ensure that every node satisfies a valid value range inherited from its ancestors. This requires a DFS traversal with lower and upper bounds.

The issue is that we repeatedly traverse the same nodes many times. For example, when checking the subtree rooted at a node, we traverse its descendants. Then when checking the child subtree, we traverse many of those same descendants again.

In the worst case, especially for skewed trees, this leads to O(n^2) time complexity.

Optimal Approach

The key insight is that BST validity can be determined bottom-up.

For a node to be the root of a valid BST:

  • Its left subtree must already be a BST.
  • Its right subtree must already be a BST.
  • The maximum value in the left subtree must be smaller than the current node.
  • The minimum value in the right subtree must be greater than the current node.

This suggests a postorder traversal because we need information from children before processing the parent.

For every subtree, we compute four pieces of information:

  • Whether the subtree is a BST.
  • The size of the subtree if it is a BST.
  • The minimum value inside the subtree.
  • The maximum value inside the subtree.

Using this information, each parent can determine in constant time whether its entire subtree forms a BST.

Since every node is processed exactly once, the solution runs in O(n) time.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(h) Repeatedly validates overlapping subtrees
Optimal O(n) O(h) Single postorder traversal with subtree metadata

Here, h represents the height of the tree.

Algorithm Walkthrough

Optimal Bottom-Up DFS

  1. Perform a postorder traversal of the tree.

We process the left subtree first, then the right subtree, and finally the current node. This order is necessary because the current node depends on information from both children. 2. Define the information returned for each subtree.

For every node, the DFS function returns:

  • Whether the subtree is a valid BST.
  • The size of the subtree if it is a BST.
  • The minimum value in the subtree.
  • The maximum value in the subtree.

This information is sufficient for the parent to validate BST conditions. 3. Handle the base case for null nodes.

An empty subtree is considered a valid BST.

We return:

  • is_bst = True
  • size = 0
  • min_value = +infinity
  • max_value = -infinity

These extreme values make comparisons work naturally for leaf nodes. 4. Recursively process the left and right children.

Each recursive call gives us complete information about the child subtree. 5. Check whether the current node forms a BST.

The current subtree is a BST only if:

  • Left subtree is a BST.
  • Right subtree is a BST.
  • left_max < node.val < right_min

These conditions guarantee BST correctness for the entire subtree. 6. Compute the current subtree information.

If the subtree is valid:

  • Size becomes left_size + right_size + 1
  • Minimum becomes min(left_min, node.val)
  • Maximum becomes max(right_max, node.val)

We also update the global answer with the new subtree size. 7. Handle invalid BST cases.

If the subtree is not a BST:

  • Mark it invalid.
  • The exact min and max values no longer matter because ancestors cannot use this subtree as part of a BST.
  1. Continue upward until the root is processed.

The global maximum size tracked during traversal is the final answer.

Why it works

The algorithm maintains a strong invariant:

For every node, the DFS function correctly reports whether the subtree rooted at that node is a BST, along with its size and value boundaries.

Because each parent receives fully validated information from its children, it can correctly determine whether combining them with the current node still satisfies BST properties. Since the traversal processes nodes bottom-up, every subtree is evaluated exactly once and all BST conditions are checked correctly.

Python Solution

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right

from typing import Optional, Tuple

class Solution:
    def largestBSTSubtree(self, root: Optional[TreeNode]) -> int:
        largest_size = 0

        def dfs(node: Optional[TreeNode]) -> Tuple[bool, int, int, int]:
            nonlocal largest_size

            if not node:
                return True, 0, float("inf"), float("-inf")

            left_is_bst, left_size, left_min, left_max = dfs(node.left)
            right_is_bst, right_size, right_min, right_max = dfs(node.right)

            if (
                left_is_bst
                and right_is_bst
                and left_max < node.val < right_min
            ):
                current_size = left_size + right_size + 1
                largest_size = max(largest_size, current_size)

                current_min = min(left_min, node.val)
                current_max = max(right_max, node.val)

                return True, current_size, current_min, current_max

            return False, 0, 0, 0

        dfs(root)
        return largest_size

The implementation uses a nested DFS function to process subtrees recursively.

The DFS function returns four values describing the subtree rooted at the current node. These values allow the parent node to determine whether combining both children with the current node still forms a valid BST.

The base case handles null nodes by treating them as valid BSTs of size zero. Using positive and negative infinity for the min and max values simplifies comparisons for leaf nodes.

After recursively processing both children, the algorithm checks the BST conditions:

  • Left subtree must be valid.
  • Right subtree must be valid.
  • Maximum value in the left subtree must be smaller than the current node.
  • Minimum value in the right subtree must be greater than the current node.

If all conditions hold, the subtree is valid and we compute its size and boundary values. Otherwise, the subtree is marked invalid.

The variable largest_size tracks the largest BST discovered anywhere in the tree.

Go Solution

/**
 * Definition for a binary tree node.
 * type TreeNode struct {
 *     Val int
 *     Left *TreeNode
 *     Right *TreeNode
 * }
 */

import "math"

type Info struct {
	isBST bool
	size  int
	minV  int
	maxV  int
}

func largestBSTSubtree(root *TreeNode) int {
	largest := 0

	var dfs func(node *TreeNode) Info

	dfs = func(node *TreeNode) Info {
		if node == nil {
			return Info{
				isBST: true,
				size:  0,
				minV:  math.MaxInt,
				maxV:  math.MinInt,
			}
		}

		left := dfs(node.Left)
		right := dfs(node.Right)

		if left.isBST &&
			right.isBST &&
			left.maxV < node.Val &&
			node.Val < right.minV {

			currentSize := left.size + right.size + 1

			if currentSize > largest {
				largest = currentSize
			}

			currentMin := min(left.minV, node.Val)
			currentMax := max(right.maxV, node.Val)

			return Info{
				isBST: true,
				size:  currentSize,
				minV:  currentMin,
				maxV:  currentMax,
			}
		}

		return Info{
			isBST: false,
			size:  0,
			minV:  0,
			maxV:  0,
		}
	}

	dfs(root)

	return largest
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func max(a, b int) int {
	if a > b {
		return a
	}
	return b
}

The Go implementation follows the same logic as the Python version but uses a struct named Info to package subtree metadata.

Go does not have tuples, so the struct stores:

  • BST validity
  • Subtree size
  • Minimum value
  • Maximum value

The recursion uses nil checks instead of Python's truthiness checks.

The implementation uses math.MaxInt and math.MinInt as sentinel values for empty subtrees.

Worked Examples

Example 1

Input:

        10
       /  \
      5    15
     / \     \
    1   8     7

We process the tree bottom-up.

Node Left Result Right Result BST Valid? Subtree Size Largest
1 valid empty valid empty Yes 1 1
8 valid empty valid empty Yes 1 1
5 max(left)=1, min(right)=8 both valid Yes 3 3
7 valid empty valid empty Yes 1 3
15 right min=7 15 < 7 fails No 0 3
10 right subtree invalid fails No 0 3

Final answer: 3

The subtree rooted at 5 is the largest valid BST.

Example 2

Input:

         4
       /   \
      2     7
     / \   /
    2   3 5
   /
  2
 /
1

Processing bottom-up:

Node BST Valid? Size
1 Yes 1
2 (bottom) Yes 2
2 (upper) No, duplicate value on left 0
3 Yes 1
5 Yes 1
7 Yes 2
4 Left subtree invalid 0

Largest BST size is 2.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Every node is processed exactly once
Space O(h) Recursive call stack uses tree height

The algorithm visits each node a single time and performs constant work at each visit, giving linear time complexity.

The auxiliary space comes from recursion depth. In a balanced tree, the height is O(log n). In the worst case, such as a completely skewed tree, the height becomes O(n).

Test Cases

# Definition for testing
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Solution implementation copied for testing
from typing import Optional, Tuple

class Solution:
    def largestBSTSubtree(self, root: Optional[TreeNode]) -> int:
        largest_size = 0

        def dfs(node):
            nonlocal largest_size

            if not node:
                return True, 0, float("inf"), float("-inf")

            left_is_bst, left_size, left_min, left_max = dfs(node.left)
            right_is_bst, right_size, right_min, right_max = dfs(node.right)

            if (
                left_is_bst
                and right_is_bst
                and left_max < node.val < right_min
            ):
                current_size = left_size + right_size + 1
                largest_size = max(largest_size, current_size)

                return (
                    True,
                    current_size,
                    min(left_min, node.val),
                    max(right_max, node.val),
                )

            return False, 0, 0, 0

        dfs(root)
        return largest_size

solution = Solution()

# Example 1
root1 = TreeNode(
    10,
    TreeNode(5, TreeNode(1), TreeNode(8)),
    TreeNode(15, None, TreeNode(7))
)
assert solution.largestBSTSubtree(root1) == 3  # largest BST rooted at 5

# Example 2
root2 = TreeNode(
    4,
    TreeNode(2, TreeNode(2, TreeNode(1)), TreeNode(3)),
    TreeNode(7, TreeNode(5), None)
)
assert solution.largestBSTSubtree(root2) == 2  # duplicate breaks BST

# Empty tree
assert solution.largestBSTSubtree(None) == 0  # no nodes

# Single node
root3 = TreeNode(1)
assert solution.largestBSTSubtree(root3) == 1  # single node is BST

# Entire tree is BST
root4 = TreeNode(
    4,
    TreeNode(2, TreeNode(1), TreeNode(3)),
    TreeNode(6, TreeNode(5), TreeNode(7))
)
assert solution.largestBSTSubtree(root4) == 7  # whole tree valid

# Completely invalid root but valid children
root5 = TreeNode(
    5,
    TreeNode(6),
    TreeNode(4)
)
assert solution.largestBSTSubtree(root5) == 1  # only leaf nodes valid

# Left subtree invalid because of deep violation
root6 = TreeNode(
    10,
    TreeNode(5, TreeNode(1), TreeNode(12)),
    TreeNode(15)
)
assert solution.largestBSTSubtree(root6) == 3  # subtree rooted at 5

# Duplicate values
root7 = TreeNode(
    2,
    TreeNode(2),
    TreeNode(2)
)
assert solution.largestBSTSubtree(root7) == 1  # duplicates invalidate BST

# Skewed increasing tree
root8 = TreeNode(
    1,
    None,
    TreeNode(
        2,
        None,
        TreeNode(
            3,
            None,
            TreeNode(4)
        )
    )
)
assert solution.largestBSTSubtree(root8) == 4  # valid skewed BST
Test Why
Example 1 Verifies standard invalid subtree detection
Example 2 Verifies duplicate handling
Empty tree Ensures base case correctness
Single node Confirms smallest non-empty BST
Entire tree valid Confirms full-tree BST detection
Invalid root with valid children Ensures local invalidity handled correctly
Deep subtree violation Verifies recursive propagation
Duplicate values Confirms strict inequality enforcement
Skewed tree Tests recursion depth and linear structure

Edge Cases

Empty Tree

An empty tree is a special case because there are no nodes at all. A naive implementation might accidentally return 1 or fail due to null pointer access. The implementation handles this naturally through the DFS base case, where a null node returns a valid BST of size 0. Therefore, the final answer remains 0.

Duplicate Values

BST definitions in this problem require strict inequality. This means equal values are not allowed on either side. A common mistake is using <= or >= comparisons, which incorrectly permits duplicates. The implementation explicitly checks:

left_max < node.val < right_min

This guarantees duplicates invalidate the subtree.

Deep Violations Inside Subtrees

Sometimes the immediate children appear valid, but a deeper descendant violates BST ordering. For example:

    10
   /
  5
   \
   12

The node 12 violates the BST property because it belongs in the right subtree of 10, not the left.

A naive parent-child comparison would miss this issue. The algorithm avoids this bug by propagating subtree minimum and maximum values upward. This ensures every node validates the entire subtree range, not just immediate children.

Highly Skewed Trees

A tree that behaves like a linked list can create worst-case recursion depth. The algorithm still remains correct because each node is processed once. Time complexity stays linear, although recursion depth becomes O(n) in the worst case.