LeetCode 126 - Word Ladder II

This problem asks us to find all shortest transformation sequences from beginWord to endWord, where each transformation changes exactly one character and every intermediate word must exist in wordList. We can think of the problem as navigating through a graph.

LeetCode Problem 126

Difficulty: 🔴 Hard
Topics: Hash Table, String, Backtracking, Breadth-First Search

Solution

Problem Understanding

This problem asks us to find all shortest transformation sequences from beginWord to endWord, where each transformation changes exactly one character and every intermediate word must exist in wordList.

We can think of the problem as navigating through a graph. Each word is a node, and two words are connected if they differ by exactly one character. Starting from beginWord, we want to reach endWord using the minimum number of transformations. However, unlike the simpler Word Ladder problem, we are not asked only for the shortest distance. We must return every possible shortest path.

For example, consider:

hit -> hot -> dot -> dog -> cog

and

hit -> hot -> lot -> log -> cog

Both paths reach "cog" using the same minimum number of transformations, so both must be included in the answer.

The input consists of:

  • beginWord, the starting word.
  • endWord, the target word.
  • wordList, a dictionary of valid intermediate words.

The output is a list of sequences, where each sequence represents a valid shortest transformation from beginWord to endWord.

The constraints tell us several important things:

  • Word lengths are small, at most 5.
  • The number of words is at most 500.
  • All words are lowercase English letters.
  • The total number of shortest sequences does not exceed 10^5.

Although 500 words is not extremely large, the number of possible transformation paths can grow exponentially. A naive search that explores every possible path would quickly become impractical.

A particularly important guarantee is that wordList contains unique words. Another important detail is that beginWord does not need to exist in wordList.

There are several edge cases to consider:

  • If endWord is not in wordList, no valid transformation is possible.
  • Multiple shortest paths may exist and all must be returned.
  • Cycles can occur if we revisit words carelessly.
  • Longer valid paths should not be included if shorter ones exist.
  • beginWord and endWord are guaranteed to be different.

Approaches

Brute Force Approach

A straightforward idea is to treat this as a backtracking problem. Starting from beginWord, we try every possible one-letter transformation that exists in the dictionary, recursively exploring all valid paths until we reach endWord.

To avoid infinite loops, we maintain a visited set for the current path. Whenever we reach endWord, we compare the path length against the shortest path found so far. If the current path is shorter, we replace the results. If it ties the shortest length, we add it to the result list.

This approach is correct because it explores every possible valid transformation sequence. Eventually, all shortest paths are discovered.

However, it is extremely inefficient. Since each word may branch into many neighbors, the search tree grows exponentially. In the worst case, we explore an enormous number of unnecessary paths before determining the shortest one.

The key inefficiency is that brute force does not prioritize shorter paths. It explores long paths even when a shorter route already exists.

Key Insight for an Optimal Solution

The critical observation is that Breadth First Search (BFS) naturally discovers shortest paths in an unweighted graph.

Since every transformation costs exactly one step, the word graph is an unweighted graph. BFS explores nodes level by level, meaning the first time we reach endWord, we know we are at the shortest distance.

However, there is a complication. We do not want just one shortest path, we want all shortest paths.

Instead of storing a single predecessor for each word, we store all valid parents discovered at the same shortest level. Then, after BFS finishes, we perform DFS backtracking from endWord back to beginWord to reconstruct every shortest sequence.

This gives us the best of both worlds:

  • BFS guarantees shortest distance.
  • DFS reconstructs every shortest path.
Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Explores every transformation path
Optimal O(N × L × 26 + P) O(N + P) BFS for shortest paths, DFS for reconstruction

Where:

  • N is the number of words.
  • L is word length.
  • P is the total size of all shortest sequences.

Algorithm Walkthrough

Step 1: Validate the Input

First, we convert wordList into a hash set for constant time lookup.

If endWord does not exist in this set, we immediately return an empty list. Since every intermediate word must come from wordList, reaching the destination becomes impossible.

Step 2: Prepare BFS Structures

We initialize a queue for BFS starting from beginWord.

We also maintain:

  • parents, a hash map where each word stores all predecessor words that can reach it through a shortest path.
  • visited, a set of globally visited words.
  • found_end, a boolean flag indicating whether we reached endWord.

The parents map is the most important structure. Unlike traditional BFS where we store only one parent, here we store multiple parents because several shortest paths may converge to the same word.

For example:

dog -> cog
log -> cog

Both "dog" and "log" are parents of "cog".

Step 3: Run BFS Level by Level

We process the queue one level at a time.

For every word:

  1. Change each character position.
  2. Try replacing it with all 26 lowercase letters.
  3. Generate a candidate transformed word.
  4. Check if the transformed word exists in the dictionary.

If the transformed word is valid:

  • If it has not been visited globally, add it to the current level.
  • Record the current word as a parent.

Importantly, we delay marking nodes as globally visited until the entire BFS level finishes.

This prevents us from accidentally removing alternative shortest paths discovered at the same depth.

For example:

dot -> dog
lot -> log
dog -> cog
log -> cog

If "cog" were marked visited immediately after "dog" discovers it, "log" would lose the opportunity to register as another shortest parent.

Step 4: Stop After Shortest Level

Once endWord is found, we stop BFS after finishing the current level.

We do not continue deeper because BFS guarantees that any deeper level would produce longer paths.

Step 5: Backtrack Using DFS

Now we reconstruct all shortest sequences.

Starting from endWord, we recursively move backward using the parents map until reaching beginWord.

Because reconstruction happens backwards, we reverse the path before storing it.

For example:

cog -> dog -> dot -> hot -> hit

becomes:

hit -> hot -> dot -> dog -> cog

Why it works

The correctness comes from the BFS property that nodes are explored in increasing order of distance. Therefore, the first level that reaches endWord must represent the minimum transformation length.

By storing all valid parents discovered at the same BFS depth, we preserve every shortest route. DFS reconstruction then enumerates exactly those shortest paths and nothing longer.

Python Solution

from typing import List
from collections import defaultdict, deque

class Solution:
    def findLadders(
        self,
        beginWord: str,
        endWord: str,
        wordList: List[str]
    ) -> List[List[str]]:

        word_set = set(wordList)

        if endWord not in word_set:
            return []

        parents = defaultdict(list)
        queue = deque([beginWord])
        visited = {beginWord}
        found_end = False

        while queue and not found_end:
            level_visited = set()

            for _ in range(len(queue)):
                current_word = queue.popleft()
                word_chars = list(current_word)

                for i in range(len(word_chars)):
                    original_char = word_chars[i]

                    for char_code in range(26):
                        new_char = chr(ord('a') + char_code)

                        if new_char == original_char:
                            continue

                        word_chars[i] = new_char
                        next_word = ''.join(word_chars)

                        if next_word not in word_set:
                            continue

                        if next_word not in visited:
                            if next_word not in level_visited:
                                queue.append(next_word)
                                level_visited.add(next_word)

                            parents[next_word].append(current_word)

                            if next_word == endWord:
                                found_end = True

                    word_chars[i] = original_char

            visited.update(level_visited)

        if not found_end:
            return []

        result = []
        path = [endWord]

        def backtrack(word: str) -> None:
            if word == beginWord:
                result.append(path[::-1])
                return

            for parent in parents[word]:
                path.append(parent)
                backtrack(parent)
                path.pop()

        backtrack(endWord)

        return result

The implementation starts by converting wordList into a set for efficient membership checks.

The BFS phase builds the parents graph. For every word, we generate neighboring words by modifying each character position with all possible lowercase letters. Whenever a valid transformation is found, we store the relationship in parents.

The use of level_visited is especially important. It ensures that multiple shortest parents discovered during the same BFS level are preserved.

Once BFS completes, the recursive backtrack function reconstructs every shortest path. Starting from endWord, it follows all parent relationships until beginWord is reached.

Go Solution

package main

func findLadders(beginWord string, endWord string, wordList []string) [][]string {
	wordSet := make(map[string]bool)

	for _, word := range wordList {
		wordSet[word] = true
	}

	if !wordSet[endWord] {
		return [][]string{}
	}

	parents := make(map[string][]string)

	queue := []string{beginWord}
	visited := map[string]bool{
		beginWord: true,
	}

	foundEnd := false

	for len(queue) > 0 && !foundEnd {
		levelVisited := make(map[string]bool)
		levelSize := len(queue)

		for i := 0; i < levelSize; i++ {
			currentWord := queue[0]
			queue = queue[1:]

			wordChars := []byte(currentWord)

			for j := 0; j < len(wordChars); j++ {
				originalChar := wordChars[j]

				for c := byte('a'); c <= 'z'; c++ {
					if c == originalChar {
						continue
					}

					wordChars[j] = c
					nextWord := string(wordChars)

					if !wordSet[nextWord] {
						continue
					}

					if !visited[nextWord] {
						if !levelVisited[nextWord] {
							queue = append(queue, nextWord)
							levelVisited[nextWord] = true
						}

						parents[nextWord] = append(
							parents[nextWord],
							currentWord,
						)

						if nextWord == endWord {
							foundEnd = true
						}
					}
				}

				wordChars[j] = originalChar
			}
		}

		for word := range levelVisited {
			visited[word] = true
		}
	}

	if !foundEnd {
		return [][]string{}
	}

	result := [][]string{}
	path := []string{endWord}

	var backtrack func(string)

	backtrack = func(word string) {
		if word == beginWord {
			reversed := make([]string, len(path))

			for i := range path {
				reversed[i] = path[len(path)-1-i]
			}

			result = append(result, reversed)
			return
		}

		for _, parent := range parents[word] {
			path = append(path, parent)
			backtrack(parent)
			path = path[:len(path)-1]
		}
	}

	backtrack(endWord)

	return result
}

The Go implementation closely mirrors the Python version. Since Go does not provide a built in deque, a slice is used as a queue.

Maps are used for constant time lookups:

  • wordSet replaces the Python set.
  • parents stores predecessor relationships.
  • visited and levelVisited track BFS progress.

When reconstructing paths, Go requires manual reversal because slices do not support Python style slicing.

Worked Examples

Example 1

Input:

beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]

BFS Process

Level Queue New Parents
0 ["hit"] hot ← hit
1 ["hot"] dot ← hot, lot ← hot
2 ["dot","lot"] dog ← dot, log ← lot
3 ["dog","log"] cog ← dog, cog ← log

Parent graph becomes:

hot <- hit
dot <- hot
lot <- hot
dog <- dot
log <- lot
cog <- dog, log

DFS Reconstruction

Starting from "cog":

Path 1:

cog -> dog -> dot -> hot -> hit

Reversed:

hit -> hot -> dot -> dog -> cog

Path 2:

cog -> log -> lot -> hot -> hit

Reversed:

hit -> hot -> lot -> log -> cog

Final output:

[
    ["hit", "hot", "dot", "dog", "cog"],
    ["hit", "hot", "lot", "log", "cog"]
]

Example 2

Input:

beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]

Since "cog" is missing from wordList, we immediately return:

[]

No BFS is needed.

Complexity Analysis

Measure Complexity Explanation
Time O(N × L × 26 + P) BFS explores words and generates neighbors, DFS outputs shortest paths
Space O(N + P) Parent graph, visited sets, queue, and result storage

Here:

  • N is the number of words.
  • L is word length.
  • P is the total size of all shortest sequences.

For each word processed in BFS, we attempt L × 26 transformations. Since each valid word is visited at most once in BFS, this part remains efficient.

The DFS phase depends on how many shortest paths exist. Since the problem explicitly bounds the total output size to 10^5, reconstruction remains manageable.

Test Cases

solution = Solution()

# Example 1
result = solution.findLadders(
    "hit",
    "cog",
    ["hot", "dot", "dog", "lot", "log", "cog"]
)
expected = [
    ["hit", "hot", "dot", "dog", "cog"],
    ["hit", "hot", "lot", "log", "cog"]
]
assert sorted(result) == sorted(expected)  # multiple shortest paths

# Example 2
assert solution.findLadders(
    "hit",
    "cog",
    ["hot", "dot", "dog", "lot", "log"]
) == []  # endWord missing

# Direct one-step transformation
assert solution.findLadders(
    "hit",
    "hot",
    ["hot"]
) == [["hit", "hot"]]  # immediate neighbor

# Multiple shortest routes
result = solution.findLadders(
    "red",
    "tax",
    ["ted", "tex", "red", "tax", "tad", "den", "rex", "pee"]
)
expected = [
    ["red", "ted", "tad", "tax"],
    ["red", "ted", "tex", "tax"],
    ["red", "rex", "tex", "tax"]
]
assert sorted(result) == sorted(expected)  # several shortest paths

# No valid transformation
assert solution.findLadders(
    "abc",
    "xyz",
    ["abd", "abx", "xbc"]
) == []  # disconnected graph

# Single word dictionary
assert solution.findLadders(
    "aaa",
    "bbb",
    ["bbb"]
) == []  # cannot transform

# Shortest path only
result = solution.findLadders(
    "aaa",
    "bbb",
    ["aab", "abb", "bbb", "aac", "acc", "ccc"]
)
assert result == [["aaa", "aab", "abb", "bbb"]]  # ignore longer path
Test Why
Standard example Validates multiple shortest paths
Missing endWord Ensures early return
One-step transformation Tests smallest valid ladder
Multiple shortest routes Confirms parent tracking correctness
Disconnected graph Ensures impossible cases return empty
Minimal dictionary Tests boundary-like behavior
Longer alternative path Verifies only shortest sequences are returned

Edge Cases

endWord Does Not Exist in wordList

This is one of the most common failure cases. A naive implementation might still run BFS and waste time searching for unreachable transformations.

The implementation handles this immediately with:

if endWord not in word_set:
    return []

This guarantees fast termination.

Multiple Shortest Parents at the Same Level

A subtle bug occurs when a node is marked visited too early.

Suppose:

dog -> cog
log -> cog

If "cog" becomes globally visited immediately after "dog" discovers it, "log" cannot register itself as another valid parent.

The solution fixes this by delaying visitation using level_visited. Nodes are marked globally visited only after the entire BFS layer completes.

Longer Valid Paths Exist

There may be many valid transformation sequences, but only the shortest ones should be returned.

For example:

aaa -> aab -> abb -> bbb
aaa -> aac -> acc -> ccc -> bbb

The second path is valid but longer.

Because BFS explores level by level and stops after the first level reaching endWord, longer paths are never included in the parent graph.

Cycles in the Transformation Graph

Word transformations can naturally create cycles.

For example:

hot -> dot
dot -> hot

Without careful visitation tracking, DFS or BFS could loop forever.

The visited set prevents revisiting already processed BFS levels, ensuring the graph remains acyclic with respect to shortest path construction.