LeetCode 431 - Encode N-ary Tree to Binary Tree

This problem asks us to design a reversible transformation between two different tree structures: - An N-ary tree, where each node can have any number of children - A binary tree, where each node has at most two children The important requirement is not how the encoding looks…

LeetCode Problem 431

Difficulty: 🔴 Hard
Topics: Tree, Depth-First Search, Breadth-First Search, Design, Binary Tree

Solution

Problem Understanding

This problem asks us to design a reversible transformation between two different tree structures:

  • An N-ary tree, where each node can have any number of children
  • A binary tree, where each node has at most two children

The important requirement is not how the encoding looks internally, but that the process is lossless. After encoding an N-ary tree into a binary tree and then decoding it back, we must recover the exact original N-ary structure.

The input N-ary tree is represented using level-order traversal with null separators between groups of children. For example:

[1,null,3,2,4,null,5,6]

means:

        1
      / | \
     3  2  4
    / \
   5   6

The encoder converts this structure into a binary tree. The decoder must then reconstruct the original N-ary tree exactly.

The problem intentionally does not constrain the encoding format. This means we are free to design any mapping between the two tree types, as long as:

  1. Every N-ary tree can be encoded
  2. Every encoded binary tree can be decoded back correctly
  3. The algorithm is stateless, meaning we cannot rely on global variables or persistent external storage

The constraints are important:

  • Up to 10^4 nodes
  • Tree height up to 1000

This tells us:

  • We need linear-time traversal algorithms
  • Recursive solutions are acceptable in principle, but very deep recursion could approach recursion limits in Python
  • Memory usage should remain proportional to the tree size

Several edge cases are especially important:

  • Empty tree (root = [])
  • Nodes with zero children
  • Extremely skewed trees
  • Trees where nodes have many children
  • Deep trees approaching height 1000

A naive implementation that loses child ordering or cannot distinguish sibling relationships would fail to reconstruct the original structure correctly.

Approaches

Brute Force Approach

A brute-force idea would be to serialize the entire N-ary tree into an auxiliary representation such as:

  • adjacency lists
  • nested arrays
  • preorder traversal with delimiters

Then we could store that serialized information somewhere inside the binary tree nodes.

For example, we might try:

  • storing child counts explicitly
  • introducing artificial delimiter nodes
  • reconstructing the tree from traversal metadata

While such methods can work, they are unnecessarily complicated and often waste memory. Some brute-force approaches also violate the spirit of the problem because they rely on external state or redundant metadata.

Additionally, approaches that repeatedly scan children or reconstruct lists inefficiently can degrade toward quadratic behavior.

Key Insight

The optimal solution uses the classic Left-Child Right-Sibling representation.

The key observation is that an N-ary tree can naturally be represented using only two pointers per node:

  • The left pointer stores the node's first child
  • The right pointer stores the node's next sibling

This works because:

  • Every node's children form an ordered list
  • A linked-list style sibling chain can represent that list
  • The first child gives access to the entire child sequence

For example:

N-ary:
      1
    / | \
   2  3  4

Binary representation:
      1
     /
    2
     \
      3
       \
        4

The left pointer descends into children, and the right pointer moves across siblings.

This representation is elegant because:

  • No extra metadata is needed
  • Encoding and decoding are both linear
  • The transformation is perfectly reversible

Comparison of Approaches

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) May repeatedly scan child structures or store redundant metadata
Optimal O(n) O(h) Uses left-child right-sibling representation efficiently

Algorithm Walkthrough

Encoding Algorithm

  1. Start with the root of the N-ary tree.

If the root is None, return None immediately because an empty tree encodes to an empty binary tree. 2. Create a binary tree node corresponding to the current N-ary node.

The node values remain identical. Only the pointer structure changes. 3. Store the first child of the N-ary node in the binary node's left pointer.

This is the crucial transformation. The left child becomes the entry point into the node's children list. 4. Link all remaining children using right pointers.

Suppose an N-ary node has children:

[c1, c2, c3, c4]

Then the binary structure becomes:

parent.left = c1
c1.right = c2
c2.right = c3
c3.right = c4

This creates a sibling chain. 5. Recursively encode each child subtree.

Each child itself may have children, so we apply the same transformation recursively.

Decoding Algorithm

  1. Start from the binary tree root.

If it is None, return None. 2. Create an N-ary node with the same value. 3. Move to the binary node's left pointer.

This pointer represents the first child in the original N-ary tree. 4. Traverse the sibling chain using right pointers.

Every node encountered represents another child of the current N-ary node. 5. Recursively decode each child subtree.

Append each reconstructed child into the node's children list. 6. Continue until the sibling chain ends.

A None right pointer means there are no more siblings.

Why it works

The algorithm preserves two essential relationships:

  • Parent-child relationships are stored through left
  • Sibling ordering is stored through right

Every N-ary node's ordered child list becomes a linked chain in the binary tree. Since both relationships are preserved exactly, decoding can reconstruct the original tree without ambiguity.

The representation is bijective, meaning every encoded binary tree corresponds to exactly one original N-ary tree.

Python Solution

from typing import Optional, List

# Definition for a Node.
class Node:
    def __init__(self, val: Optional[int] = None, children: Optional[List['Node']] = None):
        self.val = val
        self.children = children if children is not None else []

# Definition for a binary tree node.
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Codec:
    # Encodes an n-ary tree to a binary tree.
    def encode(self, root: 'Optional[Node]') -> Optional[TreeNode]:
        if root is None:
            return None

        binary_root = TreeNode(root.val)

        if not root.children:
            return binary_root

        # Encode first child as left child
        binary_root.left = self.encode(root.children[0])

        current_binary = binary_root.left

        # Remaining children become right siblings
        for child in root.children[1:]:
            current_binary.right = self.encode(child)
            current_binary = current_binary.right

        return binary_root

    # Decodes your binary tree to an n-ary tree.
    def decode(self, data: Optional[TreeNode]) -> 'Optional[Node]':
        if data is None:
            return None

        nary_root = Node(data.val, [])

        current_binary = data.left

        # Traverse sibling chain
        while current_binary:
            child_node = self.decode(current_binary)
            nary_root.children.append(child_node)
            current_binary = current_binary.right

        return nary_root

The implementation directly follows the left-child right-sibling transformation.

The encode method creates a binary node for every N-ary node. The first child is attached to the binary node's left pointer. The remaining children are chained through right pointers.

The decode method reverses this process. It traverses the left child first, then walks through the sibling chain using right pointers to reconstruct the original children list.

The recursive structure is especially clean because each subtree is encoded and decoded independently.

Go Solution

/**
 * Definition for a Node.
 * type Node struct {
 *     Val int
 *     Children []*Node
 * }
 */

/**
 * Definition for a binary tree node.
 * type TreeNode struct {
 *     Val int
 *     Left *TreeNode
 *     Right *TreeNode
 * }
 */

type Codec struct {
    
}

func Constructor() *Codec {
    return &Codec{}
}

func (this *Codec) encode(root *Node) *TreeNode {
    if root == nil {
        return nil
    }

    binaryRoot := &TreeNode{Val: root.Val}

    if len(root.Children) == 0 {
        return binaryRoot
    }

    // First child becomes left child
    binaryRoot.Left = this.encode(root.Children[0])

    currentBinary := binaryRoot.Left

    // Remaining children become right siblings
    for i := 1; i < len(root.Children); i++ {
        currentBinary.Right = this.encode(root.Children[i])
        currentBinary = currentBinary.Right
    }

    return binaryRoot
}

func (this *Codec) decode(root *TreeNode) *Node {
    if root == nil {
        return nil
    }

    naryRoot := &Node{
        Val: root.Val,
        Children: []*Node{},
    }

    currentBinary := root.Left

    for currentBinary != nil {
        child := this.decode(currentBinary)
        naryRoot.Children = append(naryRoot.Children, child)
        currentBinary = currentBinary.Right
    }

    return naryRoot
}

/**
 * Your Codec object will be instantiated and called as such:
 * obj := Constructor();
 * bst := obj.encode(root);
 * ans := obj.decode(bst);
 */

The Go implementation mirrors the Python solution closely.

The primary language-specific differences are:

  • Go uses nil instead of None
  • Children are stored in slices
  • Struct pointers are used explicitly
  • Tree nodes are allocated with &TreeNode{} and &Node{}

No integer overflow concerns exist because node values are bounded by 10^4.

Worked Examples

Example 1

Input:

[1,null,3,2,4,null,5,6]

Original N-ary tree:

        1
      / | \
     3  2  4
    / \
   5   6

Encoding Process

Step Current Node Binary Representation
1 1 Create binary node 1
2 Children of 1 3 becomes left child
3 Remaining siblings 2 becomes right of 3
4 Continue siblings 4 becomes right of 2
5 Process node 3 5 becomes left child of 3
6 Sibling of 5 6 becomes right of 5

Final binary tree:

        1
       /
      3
     / \
    5   2
     \   \
      6   4

Decoding Process

Starting from binary node 1:

  • Left child 3 becomes first N-ary child
  • Right sibling chain reconstructs 2 and 4
  • Node 3 reconstructs children 5 and 6

Recovered tree:

        1
      / | \
     3  2  4
    / \
   5   6

Example 2

Input:

[1,null,2,3,4,5,null,null,6,7,null,8,null,9,10,null,null,11,null,12,null,13,null,null,14]

The algorithm behaves identically regardless of branching factor.

Each node:

  • stores its first child in left
  • links siblings through right

The decoder walks these same relationships to rebuild the original hierarchy.

Example 3

Input:

[]

Encoding:

None -> None

Decoding:

None -> None

This is the simplest edge case and validates null handling.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each node is visited exactly once during encode and once during decode
Space O(h) Recursive call stack depends on tree height

The algorithm is linear because every node participates in constant work during traversal.

The auxiliary space comes only from recursion depth. In the worst case of a highly skewed tree, the height can become O(n). For balanced trees, the recursion depth is much smaller.

Test Cases

from collections import deque

def serialize_nary(root):
    if not root:
        return []

    result = []
    queue = deque([root])

    while queue:
        level_size = len(queue)

        for _ in range(level_size):
            node = queue.popleft()
            result.append(node.val)

            for child in node.children:
                queue.append(child)

            result.append(None)

    return result

codec = Codec()

# Test 1: Empty tree
root = None
encoded = codec.encode(root)
decoded = codec.decode(encoded)
assert decoded is None  # verifies empty tree handling

# Test 2: Single node
root = Node(1, [])
decoded = codec.decode(codec.encode(root))
assert decoded.val == 1
assert decoded.children == []  # verifies leaf node handling

# Test 3: Simple example
root = Node(1, [
    Node(3, [
        Node(5),
        Node(6)
    ]),
    Node(2),
    Node(4)
])

decoded = codec.decode(codec.encode(root))

assert decoded.val == 1
assert len(decoded.children) == 3
assert decoded.children[0].val == 3
assert decoded.children[1].val == 2
assert decoded.children[2].val == 4
assert decoded.children[0].children[0].val == 5
assert decoded.children[0].children[1].val == 6

# Test 4: Deep chain
root = Node(1, [
    Node(2, [
        Node(3, [
            Node(4)
        ])
    ])
])

decoded = codec.decode(codec.encode(root))

assert decoded.children[0].children[0].children[0].val == 4
# verifies deep recursion handling

# Test 5: Wide node
root = Node(1, [
    Node(2),
    Node(3),
    Node(4),
    Node(5),
    Node(6)
])

decoded = codec.decode(codec.encode(root))

assert len(decoded.children) == 5
assert [child.val for child in decoded.children] == [2, 3, 4, 5, 6]
# verifies sibling chain correctness

# Test 6: Mixed structure
root = Node(10, [
    Node(20, [
        Node(40),
        Node(50)
    ]),
    Node(30)
])

decoded = codec.decode(codec.encode(root))

assert decoded.val == 10
assert decoded.children[0].val == 20
assert decoded.children[1].val == 30
assert decoded.children[0].children[0].val == 40
assert decoded.children[0].children[1].val == 50
# verifies mixed branching structures
Test Why
Empty tree Verifies null handling
Single node Ensures leaf nodes work correctly
Standard example Validates core encoding/decoding logic
Deep chain Tests recursion depth behavior
Wide node Verifies sibling linking correctness
Mixed structure Ensures nested and sibling relationships both work

Edge Cases

Empty Tree

The input may be None, representing an empty N-ary tree. This is an important case because recursive algorithms can accidentally dereference null pointers if they assume a node always exists.

The implementation handles this immediately at the beginning of both encode and decode:

if root is None:
    return None

This guarantees safe handling of empty input.

Nodes With No Children

Leaf nodes are another subtle case. If the encoder incorrectly assumes every node has children, it may attempt invalid indexing operations such as:

root.children[0]

when the list is empty.

The implementation explicitly checks:

if not root.children:
    return binary_root

This ensures leaf nodes produce binary nodes with no left child.

Large Number of Siblings

A node may have many children. Incorrect implementations can lose ordering information or overwrite sibling pointers accidentally.

For example:

1 -> [2,3,4,5,6]

must preserve the exact child order.

The solution maintains order by constructing a right-sibling chain sequentially:

current_binary.right = self.encode(child)
current_binary = current_binary.right

During decoding, traversing the right chain reconstructs the children in the original order.

Deep Trees

The tree height can reach 1000, which creates potential recursion depth concerns.

The recursive approach still works within the intended problem constraints, but the recursion stack grows proportionally to tree height. Each recursive call processes exactly one subtree, so no unnecessary stack growth occurs.

Because the algorithm only stores one recursion path at a time, memory usage remains efficient even for deep trees.