LeetCode 127 - Word Ladder
This problem asks us to determine the length of the shortest transformation sequence between two words, beginWord and endWord, under a strict transformation rule.
Difficulty: 🔴 Hard
Topics: Hash Table, String, Breadth-First Search
Solution
Problem Understanding
This problem asks us to determine the length of the shortest transformation sequence between two words, beginWord and endWord, under a strict transformation rule.
A valid transformation changes exactly one character at a time, and every intermediate word must exist inside the provided wordList. The beginWord itself does not need to appear in the dictionary, but every word after it in the transformation sequence must be valid according to the dictionary.
The goal is not to generate all possible sequences. Instead, we only care about the minimum number of words in the shortest valid path from beginWord to endWord.
For example, if:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
One valid transformation is:
hit → hot → dot → dog → cog
This sequence contains 5 words, so the answer is 5.
The problem becomes impossible if endWord is not present in wordList, because every transformed word must belong to the dictionary. In that case, the answer is immediately 0.
The constraints reveal important information about the scale of the problem:
- Word length is at most
10 wordListcontains up to5000unique words- All words have equal length
- Only lowercase English letters are used
A naive graph construction that compares every pair of words would become expensive at this scale. Since every word can potentially connect to many others, we need a more efficient strategy for finding neighboring words.
The key observation is that this problem is fundamentally a shortest path problem in an unweighted graph. Each word is a node, and two words share an edge if they differ by exactly one letter.
Several edge cases can trip up a naive implementation. If endWord is missing from the dictionary, the answer must be 0. If there is no possible chain connecting the two words, we also return 0. Another subtle case is when many words differ by only one letter, which can create a large branching factor and make inefficient neighbor generation prohibitively expensive.
Approaches
Brute Force Approach
The brute force idea is to explicitly model the problem as a graph.
We treat every word as a node and compare every pair of words to determine whether they differ by exactly one character. If they do, we connect them with an edge. We must also include beginWord in this graph.
After building the graph, we run Breadth-First Search (BFS) starting from beginWord. Since BFS explores nodes level by level, the first time we encounter endWord, we know we have found the shortest transformation sequence.
This approach is correct because every edge represents a single valid transformation, and BFS guarantees the shortest path in an unweighted graph.
However, graph construction is expensive. If there are N words and each word has length L, checking every pair requires:
O(N² × L)
For 5000 words, this becomes too slow.
Optimal Approach, BFS with Pattern Mapping
Instead of comparing every pair of words, we can generate neighbors efficiently.
The key insight is that two words differ by one letter if they share a common intermediate pattern.
For example:
hot → *ot, h*t, ho*
dot → *ot, d*t, do*
lot → *ot, l*t, lo*
Notice that "hot", "dot", and "lot" all share the pattern *ot, meaning they are one transformation apart.
We preprocess all words and map each wildcard pattern to matching words.
Example:
*ot → [hot, dot, lot]
h*t → [hot]
do* → [dot, dog]
Now, instead of checking every word during BFS, we can efficiently retrieve neighbors using these patterns.
At each BFS step:
- Generate wildcard patterns for the current word.
- Use the pattern map to retrieve neighbors.
- Visit unprocessed words.
- Continue level-by-level until reaching
endWord.
This reduces neighbor lookup dramatically and allows BFS to remain efficient even for larger dictionaries.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(N² × L) | O(N²) | Explicitly builds the entire graph |
| Optimal | O(N × L²) | O(N × L) | Uses wildcard pattern mapping with BFS |
Where:
N= number of wordsL= word length
Algorithm Walkthrough
Optimal Algorithm, BFS with Wildcard Pattern Mapping
- Convert
wordListinto a set and verifyendWordexists
Before doing any work, check whether endWord is inside the dictionary. If it is missing, return 0 immediately because a valid transformation is impossible.
2. Build the wildcard pattern map
For every word in the dictionary, generate all possible wildcard versions by replacing one character at a time with '*'.
Example:
hot → *ot, h*t, ho*
Store these patterns inside a hash map:
pattern → list of words
This preprocessing step allows us to efficiently find neighboring words later. 3. Initialize BFS
Create a queue containing:
(beginWord, 1)
The number 1 represents the transformation sequence length, because beginWord itself counts as part of the sequence.
Also create a visited set to avoid revisiting words and creating cycles.
4. Process the queue level by level
While the queue is not empty:
- Remove the front word.
- If it matches
endWord, return the current sequence length. - Generate wildcard patterns for the current word.
- Retrieve neighboring words from the pattern map.
- Add unseen neighbors to the queue with distance
+1.
- Mark words as visited immediately
As soon as a word enters the queue, mark it visited.
This prevents duplicate processing and guarantees BFS explores the shortest route first.
6. Return 0 if BFS ends
If the queue becomes empty without reaching endWord, no valid transformation exists.
Why it works
The correctness relies on the core property of Breadth-First Search.
BFS explores nodes level by level. Every BFS level represents words reachable using the same number of transformations. Since every valid transformation has equal cost, the first time we encounter endWord, it must be through the shortest possible sequence.
The wildcard pattern map guarantees we efficiently discover all valid neighboring words without missing any possibilities.
Python Solution
from collections import defaultdict, deque
from typing import List
class Solution:
def ladderLength(
self,
beginWord: str,
endWord: str,
wordList: List[str]
) -> int:
if endWord not in wordList:
return 0
word_length = len(beginWord)
pattern_map = defaultdict(list)
for word in wordList:
for i in range(word_length):
pattern = word[:i] + "*" + word[i + 1:]
pattern_map[pattern].append(word)
queue = deque([(beginWord, 1)])
visited = {beginWord}
while queue:
current_word, steps = queue.popleft()
if current_word == endWord:
return steps
for i in range(word_length):
pattern = (
current_word[:i]
+ "*"
+ current_word[i + 1:]
)
for neighbor in pattern_map[pattern]:
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, steps + 1))
pattern_map[pattern] = []
return 0
The implementation starts with an early exit condition. If endWord is missing from the dictionary, we immediately return 0.
Next, we preprocess the dictionary into a wildcard pattern map. For every word, we create all possible single-character wildcard transformations and map them to the original word.
The BFS queue stores tuples of:
(current_word, sequence_length)
This allows us to track how many transformations have been made so far.
The visited set prevents cycles and duplicate work. We mark words as visited immediately after enqueueing them, ensuring each word is processed only once.
A useful optimization appears here:
pattern_map[pattern] = []
Once a wildcard pattern has been processed, we clear its neighbors. This avoids repeatedly traversing the same adjacency list during future BFS steps and improves efficiency.
When endWord is encountered, we immediately return the transformation length. If BFS finishes without finding it, the answer is 0.
Go Solution
func ladderLength(beginWord string, endWord string, wordList []string) int {
wordExists := false
for _, word := range wordList {
if word == endWord {
wordExists = true
break
}
}
if !wordExists {
return 0
}
wordLength := len(beginWord)
patternMap := make(map[string][]string)
for _, word := range wordList {
for i := 0; i < wordLength; i++ {
pattern := word[:i] + "*" + word[i+1:]
patternMap[pattern] = append(
patternMap[pattern],
word,
)
}
}
type Pair struct {
word string
steps int
}
queue := []Pair{
{beginWord, 1},
}
visited := map[string]bool{
beginWord: true,
}
for len(queue) > 0 {
current := queue[0]
queue = queue[1:]
if current.word == endWord {
return current.steps
}
for i := 0; i < wordLength; i++ {
pattern := current.word[:i] +
"*" +
current.word[i+1:]
neighbors := patternMap[pattern]
for _, neighbor := range neighbors {
if !visited[neighbor] {
visited[neighbor] = true
queue = append(queue, Pair{
word: neighbor,
steps: current.steps + 1,
})
}
}
patternMap[pattern] = []string{}
}
}
return 0
}
The Go solution follows the same algorithmic structure as Python but uses Go-native data structures.
Since Go does not have a built-in queue, a slice is used to simulate BFS queue behavior. We remove elements using:
queue = queue[1:]
The visited structure is implemented using:
map[string]bool
Go strings are immutable, so substring operations remain safe and efficient for this problem size.
No integer overflow concerns exist here because the maximum path length is bounded by the number of words, which is at most 5000.
Worked Examples
Example 1
Input:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
Wildcard map:
| Pattern | Words |
|---|---|
*ot |
hot, dot, lot |
h*t |
hot |
ho* |
hot |
d*t |
dot |
do* |
dot, dog |
*og |
dog, log, cog |
Initial queue:
| Queue | Visited |
|---|---|
[(hit,1)] |
{hit} |
Step 1
Process:
hit
Patterns:
*it
h*t
hi*
Neighbor found:
hot
Queue becomes:
| Queue | Visited |
|---|---|
[(hot,2)] |
{hit, hot} |
Step 2
Process:
hot
Neighbors:
dot
lot
Queue becomes:
| Queue | Visited |
|---|---|
[(dot,3), (lot,3)] |
{hit, hot, dot, lot} |
Step 3
Process:
dot
Neighbor:
dog
Queue:
| Queue | Visited |
|---|---|
[(lot,3), (dog,4)] |
{hit, hot, dot, lot, dog} |
Step 4
Process:
lot
Neighbor:
log
Queue:
| Queue | Visited |
|---|---|
[(dog,4), (log,4)] |
{hit, hot, dot, lot, dog, log} |
Step 5
Process:
dog
Neighbor:
cog
Queue:
| Queue | Visited |
|---|---|
[(log,4), (cog,5)] |
{hit, hot, dot, lot, dog, log, cog} |
Step 6
Process:
cog
Reached target.
Answer:
5
Example 2
Input:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]
Since:
"cog" not in wordList
We immediately return:
0
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N × L²) | Each word generates L patterns, each pattern creation costs O(L) |
| Space | O(N × L) | Pattern map, queue, and visited set |
Let N be the number of words and L be the word length.
Building the wildcard map requires generating L patterns for each word, and each substring operation costs O(L), resulting in O(N × L²) preprocessing time.
During BFS, each word is visited at most once, and pattern lists are cleared after use, preventing repeated work. This keeps traversal efficient and preserves the same asymptotic complexity.
Test Cases
solution = Solution()
# Example 1
assert solution.ladderLength(
"hit",
"cog",
["hot", "dot", "dog", "lot", "log", "cog"]
) == 5 # Standard valid transformation
# Example 2
assert solution.ladderLength(
"hit",
"cog",
["hot", "dot", "dog", "lot", "log"]
) == 0 # endWord missing
# Direct transformation
assert solution.ladderLength(
"hit",
"hot",
["hot"]
) == 2 # Single transformation
# No valid path despite endWord existing
assert solution.ladderLength(
"hit",
"cog",
["hot", "dot", "tod", "hog", "cog"]
) == 0 # Disconnected graph
# Multiple shortest paths
assert solution.ladderLength(
"hit",
"cog",
["hot", "dot", "dog", "lot", "log", "cog"]
) == 5 # Several shortest routes
# Smallest valid dictionary
assert solution.ladderLength(
"a",
"c",
["a", "b", "c"]
) == 2 # Single-character words
# Larger branching factor
assert solution.ladderLength(
"red",
"tax",
["ted", "tex", "red", "tax", "tad", "den", "rex", "pee"]
) == 4 # Multiple branching choices
| Test | Why |
|---|---|
| Example 1 | Verifies standard shortest transformation |
| Example 2 | Verifies immediate failure when endWord is absent |
| Direct transformation | Tests one-step conversion |
| No valid path | Ensures disconnected graph returns 0 |
| Multiple shortest paths | Confirms BFS finds minimum distance |
| Smallest valid dictionary | Tests minimal word length |
| Larger branching factor | Verifies BFS correctness under many choices |
Edge Cases
endWord Does Not Exist in the Dictionary
This is the most important edge case. Since every transformed word must belong to wordList, reaching an endWord outside the dictionary is impossible.
A naive implementation might still run BFS unnecessarily, wasting time. Our implementation explicitly checks this condition at the start:
if endWord not in wordList:
return 0
This guarantees immediate termination.
No Valid Transformation Path Exists
Even if endWord exists, there may be no chain connecting beginWord to it.
For example:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","tod","hog","cog"]
The graph is disconnected, so BFS eventually exhausts all possibilities and returns 0.
Our implementation naturally handles this by finishing traversal and returning:
0
when the queue becomes empty.
Many Similar Words Causing Excessive Reprocessing
A dictionary with many overlapping patterns can cause repeated neighbor scanning.
For example:
hot, dot, lot, pot, cot, got
all share:
*ot
Without optimization, the same adjacency list might be scanned repeatedly for every BFS node.
Our implementation avoids this inefficiency by clearing processed pattern entries:
pattern_map[pattern] = []
This ensures each wildcard group is traversed at most once, significantly improving performance for dense graphs.