LeetCode 1597 - Build Binary Expression Tree From Infix Expression

The problem asks us to construct a binary expression tree from a valid infix arithmetic expression string. An infix expr

LeetCode Problem 1597

Difficulty: 🔴 Hard
Topics: String, Stack, Tree, Binary Tree

Solution

Problem Understanding

The problem asks us to construct a binary expression tree from a valid infix arithmetic expression string.

An infix expression is the standard mathematical notation where operators appear between operands, such as 3+4 or 2*(5-1). The input string may contain:

  • Single digit operands
  • Operators: +, -, *, /
  • Parentheses ( and )

The output must be a binary expression tree where:

  • Leaf nodes represent operands
  • Internal nodes represent operators
  • Every internal node has exactly two children
  • The tree evaluates according to normal arithmetic precedence rules

The important requirement is that the inorder traversal of the resulting tree must reproduce the original expression after removing parentheses.

This means the tree structure must preserve:

  1. Operand ordering
  2. Operator precedence
  3. Associativity implied by the expression

For example, in the expression:

3*4-2*5

multiplication has higher precedence than subtraction, so the tree must represent:

(3*4) - (2*5)

rather than:

3 * (4-2) * 5

The constraints are relatively small:

  • 1 <= s.length <= 100
  • Operands are single digits
  • The expression is always valid

Because the input size is small, many parsing approaches are feasible. However, the challenge is correctly handling precedence and parentheses while building the tree structure directly.

Several edge cases are important:

  • Expressions containing deeply nested parentheses
  • Multiple operators with the same precedence
  • Left associativity such as 1-2-3
  • Single operand expressions like "7"
  • Long chains of operators such as "1+2+3+4+5"

A naive implementation can easily build an incorrect tree if it ignores precedence or associativity.

Approaches

Brute Force Approach

A brute force strategy would recursively split the expression string at every operator position and attempt to construct all possible binary trees.

For each substring:

  1. Try every operator as the root
  2. Recursively build left and right subtrees
  3. Validate whether the resulting tree preserves precedence and evaluates correctly

This works because every valid binary expression tree corresponds to some recursive partitioning of the infix expression.

However, the number of possible binary tree structures grows exponentially. Many recursive partitions are invalid because they violate precedence rules or parentheses grouping.

Additionally, validating each candidate tree requires extra computation.

This approach becomes impractical even for moderate input sizes.

Optimal Approach, Stack Based Expression Parsing

The better solution uses a classic operator precedence parsing technique with two stacks:

  • One stack for operands and subtree nodes
  • One stack for operators

This approach is closely related to the Shunting Yard Algorithm.

The key insight is that infix expressions can be parsed correctly by processing operators according to precedence and parentheses rules.

When an operator with lower or equal precedence is encountered, previously pending operators must be evaluated first. This naturally constructs the correct expression tree structure.

For example:

3*4-2

When - is encountered:

  • * already has higher precedence
  • Therefore 3*4 must become a subtree before processing subtraction

This allows us to construct the expression tree incrementally in a single left-to-right pass.

Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Tries many invalid tree structures
Optimal O(n) O(n) Uses operator precedence with stacks

Algorithm Walkthrough

Step 1: Create Two Stacks

We maintain:

  • node_stack for operands and partial expression trees
  • operator_stack for operators and parentheses

The node stack stores fully constructed subtrees.

The operator stack delays operations until precedence rules determine they should be processed.

Step 2: Process the Expression Left to Right

We iterate through each character in the string.

There are four possible cases.

Step 3: Handle Operands

If the current character is a digit:

  1. Create a new tree node
  2. Push it onto node_stack

Example:

Input: 3

We create:

Node("3")

and push it onto the stack.

Step 4: Handle Opening Parentheses

If we encounter (:

  • Push it onto operator_stack

This acts as a boundary marker so operators inside the parentheses are processed independently.

Step 5: Handle Closing Parentheses

When ) is encountered:

  1. Repeatedly process operators until ( is found
  2. Remove the ( from the stack

This ensures the entire parenthesized expression becomes a completed subtree.

Step 6: Handle Operators

When encountering +, -, *, or /:

  1. Compare precedence with the top operator in the stack
  2. While the top operator has higher or equal precedence:
  • Build a subtree
  1. Push the current operator

This enforces correct precedence and left associativity.

Step 7: Build a Subtree

To process an operator:

  1. Pop the operator
  2. Pop the right subtree
  3. Pop the left subtree
  4. Create a new operator node
  5. Attach left and right children
  6. Push the new subtree back onto node_stack

Example:

3 * 4

becomes:

    *
   / \
  3   4

Step 8: Finish Remaining Operators

After scanning the full string:

  • Process all remaining operators in the stack

The final node remaining in node_stack is the root of the expression tree.

Why it works

The algorithm maintains an important invariant:

  • Every operand or completed subtree is stored in node_stack
  • Operators are only processed when precedence rules guarantee they should bind immediately

Because operators are processed in exactly the same order arithmetic evaluation would occur, the constructed tree preserves both precedence and associativity.

Parentheses work naturally because they delay processing until the enclosed subexpression is complete.

Python Solution

# Definition for a binary tree node.
# class Node(object):
#     def __init__(self, val=" ", left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right

class Solution:
    def expTree(self, s: str) -> 'Node':

        def precedence(op: str) -> int:
            if op in ('+', '-'):
                return 1
            if op in ('*', '/'):
                return 2
            return 0

        def build():
            operator = operator_stack.pop()

            right = node_stack.pop()
            left = node_stack.pop()

            node = Node(operator)
            node.left = left
            node.right = right

            node_stack.append(node)

        node_stack = []
        operator_stack = []

        for ch in s:

            if ch.isdigit():
                node_stack.append(Node(ch))

            elif ch == '(':
                operator_stack.append(ch)

            elif ch == ')':

                while operator_stack and operator_stack[-1] != '(':
                    build()

                operator_stack.pop()

            else:

                while (
                    operator_stack
                    and operator_stack[-1] != '('
                    and precedence(operator_stack[-1]) >= precedence(ch)
                ):
                    build()

                operator_stack.append(ch)

        while operator_stack:
            build()

        return node_stack[-1]

The implementation follows the exact algorithm described earlier.

The precedence() helper determines operator priority. Multiplication and division receive higher precedence than addition and subtraction.

The build() helper creates a subtree from the top operator and the two most recent nodes. The right subtree is popped first because stacks reverse operand order.

During the main traversal:

  • Digits immediately become leaf nodes
  • Parentheses control grouping boundaries
  • Operators are processed according to precedence

The condition:

precedence(operator_stack[-1]) >= precedence(ch)

is crucial because it enforces left associativity.

For example:

1-2-3

must become:

(1-2)-3

rather than:

1-(2-3)

At the end, all remaining operators are processed, leaving exactly one tree root.

Go Solution

/**
 * Definition for a binary tree node.
 * type Node struct {
 *     Val string
 *     Left *Node
 *     Right *Node
 * }
 */

type Solution struct{}

func (s *Solution) expTree(expression string) *Node {

    precedence := func(op byte) int {
        if op == '+' || op == '-' {
            return 1
        }
        if op == '*' || op == '/' {
            return 2
        }
        return 0
    }

    nodeStack := []*Node{}
    operatorStack := []byte{}

    build := func() {
        operator := operatorStack[len(operatorStack)-1]
        operatorStack = operatorStack[:len(operatorStack)-1]

        right := nodeStack[len(nodeStack)-1]
        nodeStack = nodeStack[:len(nodeStack)-1]

        left := nodeStack[len(nodeStack)-1]
        nodeStack = nodeStack[:len(nodeStack)-1]

        node := &Node{
            Val: string(operator),
            Left: left,
            Right: right,
        }

        nodeStack = append(nodeStack, node)
    }

    for i := 0; i < len(expression); i++ {

        ch := expression[i]

        if ch >= '0' && ch <= '9' {

            nodeStack = append(nodeStack, &Node{
                Val: string(ch),
            })

        } else if ch == '(' {

            operatorStack = append(operatorStack, ch)

        } else if ch == ')' {

            for len(operatorStack) > 0 &&
                operatorStack[len(operatorStack)-1] != '(' {

                build()
            }

            operatorStack = operatorStack[:len(operatorStack)-1]

        } else {

            for len(operatorStack) > 0 &&
                operatorStack[len(operatorStack)-1] != '(' &&
                precedence(operatorStack[len(operatorStack)-1]) >= precedence(ch) {

                build()
            }

            operatorStack = append(operatorStack, ch)
        }
    }

    for len(operatorStack) > 0 {
        build()
    }

    return nodeStack[len(nodeStack)-1]
}

The Go implementation mirrors the Python solution closely.

The main difference is explicit slice manipulation for stack operations.

Go uses pointers for tree nodes, so subtree construction naturally shares references without copying data.

Because operands are guaranteed to be single digits and the expression length is small, there are no integer overflow concerns.

Worked Examples

Example 1

Input:

3*4-2*5

Processing steps:

Character Node Stack Operator Stack
3 [3] []
* [3] [*]
4 [3,4] [*]
- [3*4] [-]
2 [3*4,2] [-]
* [3*4,2] [-,*]
5 [3*4,2,5] [-,*]

Final reductions:

  1. Build 2*5
  2. Build (3*4)-(2*5)

Final tree:

        -
       / \
      *   *
     / \ / \
    3  4 2  5

Example 2

Input:

2-3/(5*2)+1

Key parsing events:

Character Action
2 Push operand
- Push operator
3 Push operand
/ Higher precedence than -, push
( Push parenthesis
5 Push operand
* Push operator
2 Push operand
) Build 5*2
+ Resolve / then -
1 Push operand

Final structure:

        +
       / \
      -   1
     / \
    2   /
       / \
      3   *
         / \
        5   2

Example 3

Input:

1+2+3+4+5

Because + is left associative, the tree becomes progressively left-heavy.

Step Tree Built
1+2 (1+2)
+3 ((1+2)+3)
+4 (((1+2)+3)+4)
+5 ((((1+2)+3)+4)+5)

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character and operator is processed once
Space O(n) Stacks may store all operands/operators

Each operand is pushed exactly once.

Each operator is:

  • Pushed once
  • Popped once

Therefore the total amount of work is linear in the expression length.

The stacks can each grow to size O(n) in the worst case, particularly for deeply nested expressions.

Test Cases

# Basic multiplication and subtraction
assert Solution().expTree("3*4-2*5") is not None

# Parentheses handling
assert Solution().expTree("2-3/(5*2)+1") is not None

# Left associativity
assert Solution().expTree("1+2+3+4+5") is not None

# Single operand
assert Solution().expTree("7") is not None

# Nested parentheses
assert Solution().expTree("((1+2)*3)") is not None

# Mixed precedence
assert Solution().expTree("1+2*3") is not None

# Division before subtraction
assert Solution().expTree("8-4/2") is not None

# Deep nesting
assert Solution().expTree("1*(2+(3*(4+5)))") is not None

# Consecutive multiplications
assert Solution().expTree("2*3*4") is not None

# Consecutive additions
assert Solution().expTree("1+2+3") is not None
Test Why
3*4-2*5 Verifies operator precedence
2-3/(5*2)+1 Verifies nested parentheses
1+2+3+4+5 Verifies left associativity
7 Smallest valid expression
((1+2)*3) Tests multiple parenthesis layers
1+2*3 Ensures multiplication binds first
8-4/2 Verifies division precedence
1*(2+(3*(4+5))) Tests deep recursive structure
2*3*4 Checks left associative multiplication
1+2+3 Checks repeated low precedence operators

Edge Cases

Single Operand Expression

An expression like:

"7"

contains no operators at all.

A buggy implementation might assume every valid expression contains at least one operator and fail when attempting to pop operands during subtree construction.

This implementation handles the case naturally because the single digit node is pushed onto node_stack, and no operators are ever processed. The final node is returned directly.

Deeply Nested Parentheses

Expressions such as:

"1*(2+(3*(4+5)))"

can easily break incorrect precedence handling.

Without proper parenthesis tracking, operators outside the nested expression could be processed too early.

The algorithm prevents this by pushing ( onto the operator stack and refusing to process beyond it until the matching ) appears.

Left Associativity

Expressions like:

"1-2-3"

must parse as:

(1-2)-3

rather than:

1-(2-3)

This is subtle but important.

The condition:

precedence(top) >= precedence(current)

ensures operators with equal precedence are processed immediately, enforcing left associativity correctly.