LeetCode 720 - Longest Word in Dictionary
This problem gives us an array of lowercase English words and asks us to find the longest word that can be constructed one character at a time using other words from the same dictionary. More specifically, suppose we have a candidate word like "world".
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Trie, Sorting
Solution
LeetCode 720, Longest Word in Dictionary
Problem Understanding
This problem gives us an array of lowercase English words and asks us to find the longest word that can be constructed one character at a time using other words from the same dictionary.
More specifically, suppose we have a candidate word like "world". For this word to be valid, every prefix formed by removing characters from the end must also exist in the dictionary:
"w""wo""wor""worl""world"
All of these must appear in the input array.
The problem is not asking for the longest word overall. Instead, it asks for the longest word whose entire chain of prefixes also exists in the dictionary.
If multiple words satisfy the condition and have the same maximum length, we return the lexicographically smallest one. Lexicographical order means normal dictionary order, so "apple" comes before "apply" because 'e' < 'y'.
The input consists of:
words, an array of lowercase strings- Each word length is between
1and30 - The number of words is at most
1000
The constraints are relatively small, which means several approaches are feasible. However, we still want an efficient and clean solution.
An important detail is that prefixes must themselves be complete words in the dictionary. A substring existing inside another word does not matter. Only prefix chains matter.
Several edge cases are important to think about:
- A word may exist without its required prefixes
- Multiple longest candidates may exist
- The dictionary may contain only one-letter words
- No valid chain longer than length
1may exist - Words may appear in arbitrary order
For example:
["banana", "ban", "ba", "b"]
Even though "banana" exists, it is invalid because prefixes like "bana" and "banan" do not exist.
The problem guarantees that all words contain only lowercase English letters, which simplifies comparison and storage.
Approaches
Brute Force Approach
A straightforward solution is to examine every word individually and verify whether all of its prefixes exist in the dictionary.
We can first place all words into a hash set for efficient lookup. Then, for each word:
- Generate every prefix from length
1up tolen(word) - 1 - Check whether each prefix exists in the set
- If every prefix exists, the word is valid
- Keep track of the best valid word found so far
This approach works correctly because a word is valid exactly when all of its prefixes are present.
However, the method repeatedly rebuilds prefixes for many words. While still acceptable under these constraints, we can improve the organization and elegance of the solution.
Key Insight for the Optimal Solution
The critical observation is that a word is valid only if its immediate predecessor prefix is already valid.
For example:
"ap"is valid if"a"exists"app"is valid if"ap"is valid"appl"is valid if"app"is valid
This naturally suggests processing words in sorted order.
If we sort words lexicographically, shorter prefixes appear before longer extensions:
"a", "ap", "app", "appl", "apple"
We can maintain a set of valid buildable words:
- A one-character word is automatically valid
- A longer word is valid if
word[:-1]already exists in the valid set
This avoids checking every prefix repeatedly. Instead, each word only checks its immediate predecessor.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n × m²) | O(n) | Checks every prefix of every word independently |
| Optimal | O(n log n + n × m) | O(n) | Uses sorting and incremental validation |
Where:
n= number of wordsm= maximum word length
Algorithm Walkthrough
Optimal Algorithm
- Sort the words lexicographically.
Sorting ensures that shorter prefixes are processed before longer words that depend on them. For example, "app" will appear before "apple".
2. Create a set called buildable.
This set stores all words that can successfully be constructed one character at a time.
3. Initialize an empty string answer.
This variable keeps track of the best valid word found so far. 4. Iterate through each word in sorted order.
For every word, determine whether it can be built from smaller valid words.
5. If the word length is 1, it is automatically buildable.
A single character word has no smaller prefixes to validate.
6. Otherwise, check whether word[:-1] exists in buildable.
This works because if the immediate prefix is buildable, then all earlier prefixes must also already be valid. 7. If the word is valid:
-
Add it to
buildable -
Compare it against
answer -
Update
answerif: -
the word is longer, or
-
the word has equal length but is lexicographically smaller
- After processing all words, return
answer.
Why it works
The algorithm maintains the invariant that every word inside buildable can be constructed one character at a time from other valid words.
Because words are processed in sorted order, when we evaluate a word like "apple", its required predecessor "appl" has already been processed. If "appl" is buildable, then by induction all earlier prefixes are also buildable.
Therefore, checking only word[:-1] is sufficient to guarantee the entire prefix chain exists.
Python Solution
from typing import List
class Solution:
def longestWord(self, words: List[str]) -> str:
words.sort()
buildable = set()
answer = ""
for word in words:
if len(word) == 1 or word[:-1] in buildable:
buildable.add(word)
if len(word) > len(answer):
answer = word
return answer
The implementation begins by sorting the input list lexicographically. This ordering is important because prefixes naturally appear before their longer extensions.
The buildable set stores all words that satisfy the construction rule. Whenever we process a word, we check whether it is valid:
- Single-character words are always valid
- Longer words require their immediate prefix to already exist in
buildable
Once a word is confirmed valid, we insert it into the set.
The answer variable tracks the longest valid word encountered so far. Because the array is lexicographically sorted beforehand, ties automatically favor the smaller lexicographical word. Therefore, we only need to update answer when we find a strictly longer word.
This creates a concise and efficient solution.
Go Solution
package main
import (
"sort"
)
func longestWord(words []string) string {
sort.Strings(words)
buildable := make(map[string]bool)
answer := ""
for _, word := range words {
if len(word) == 1 || buildable[word[:len(word)-1]] {
buildable[word] = true
if len(word) > len(answer) {
answer = word
}
}
}
return answer
}
The Go implementation follows the same logic as the Python version.
The main difference is that Go uses a map[string]bool instead of a Python set. The expression:
buildable[word[:len(word)-1]]
checks whether the prefix already exists.
Go strings support slicing by index, which allows easy extraction of the immediate prefix.
Since the input constraints are small, there are no concerns about integer overflow or excessive memory usage.
Worked Examples
Example 1
Input:
["w","wo","wor","worl","world"]
After sorting:
["w","wo","wor","worl","world"]
| Current Word | Prefix Needed | Valid? | Buildable Set | Answer |
|---|---|---|---|---|
| "w" | none | Yes | {"w"} | "w" |
| "wo" | "w" | Yes | {"w","wo"} | "wo" |
| "wor" | "wo" | Yes | {"w","wo","wor"} | "wor" |
| "worl" | "wor" | Yes | {"w","wo","wor","worl"} | "worl" |
| "world" | "worl" | Yes | {"w","wo","wor","worl","world"} | "world" |
Final answer:
"world"
Example 2
Input:
["a","banana","app","appl","ap","apply","apple"]
After sorting:
["a","ap","app","apple","appl","apply","banana"]
| Current Word | Prefix Needed | Valid? | Buildable Set | Answer |
|---|---|---|---|---|
| "a" | none | Yes | {"a"} | "a" |
| "ap" | "a" | Yes | {"a","ap"} | "ap" |
| "app" | "ap" | Yes | {"a","ap","app"} | "app" |
| "apple" | "appl" | No | unchanged | "app" |
| "appl" | "app" | Yes | {"a","ap","app","appl"} | "appl" |
| "apply" | "appl" | Yes | {"a","ap","app","appl","apply"} | "apply" |
| "banana" | "banan" | No | unchanged | "apply" |
At first glance, it seems "apple" was skipped because "appl" had not yet been processed. This is why sorting order matters carefully.
A better sorted order is:
["a","ap","app","appl","apple","apply","banana"]
Then:
| Current Word | Prefix Needed | Valid? | Answer |
|---|---|---|---|
| "a" | none | Yes | "a" |
| "ap" | "a" | Yes | "ap" |
| "app" | "ap" | Yes | "app" |
| "appl" | "app" | Yes | "appl" |
| "apple" | "appl" | Yes | "apple" |
| "apply" | "appl" | Yes | "apple" |
Both "apple" and "apply" are length 5, but "apple" is lexicographically smaller, so it remains the answer.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n + n × m) | Sorting dominates initially, then each word performs one prefix lookup |
| Space | O(n) | The buildable set stores valid words |
The sorting step costs O(n log n).
After sorting, each word performs:
- One substring operation
- One hash lookup
- Possibly one insertion
Each substring operation costs up to O(m) where m is the maximum word length.
Therefore, the total processing cost is O(n × m) after sorting.
The auxiliary space is O(n) because the set may store every word.
Test Cases
from typing import List
class Solution:
def longestWord(self, words: List[str]) -> str:
words.sort()
buildable = set()
answer = ""
for word in words:
if len(word) == 1 or word[:-1] in buildable:
buildable.add(word)
if len(word) > len(answer):
answer = word
return answer
solution = Solution()
assert solution.longestWord(
["w", "wo", "wor", "worl", "world"]
) == "world" # basic incremental chain
assert solution.longestWord(
["a", "banana", "app", "appl", "ap", "apply", "apple"]
) == "apple" # lexicographical tie breaking
assert solution.longestWord(
["a", "b", "ba", "bca", "bda", "bdca"]
) == "ba" # longer words missing prefixes
assert solution.longestWord(
["x"]
) == "x" # single word input
assert solution.longestWord(
["abc", "ab", "a"]
) == "abc" # complete prefix chain exists
assert solution.longestWord(
["abc"]
) == "" # missing required prefixes
assert solution.longestWord(
["a", "ar", "art", "arti", "artic", "articl", "article"]
) == "article" # long valid chain
assert solution.longestWord(
["a", "banana", "ban", "b"]
) == "a" # invalid long words ignored
assert solution.longestWord(
["k", "ki", "kir", "kira", "kiran", "kiro"]
) == "kiran" # longest valid chain wins
assert solution.longestWord(
["a", "ap", "app", "appl", "apple", "apply"]
) == "apple" # equal length lexicographical comparison
Test Case Summary
| Test | Why |
|---|---|
["w","wo","wor","worl","world"] |
Standard successful build chain |
["a","banana","app","appl","ap","apply","apple"] |
Lexicographical tie handling |
["a","b","ba","bca","bda","bdca"] |
Invalid longer chains |
["x"] |
Smallest possible input |
["abc","ab","a"] |
Full prefix hierarchy |
["abc"] |
No valid prefixes |
["a","ar","art","arti","artic","articl","article"] |
Long valid sequence |
["a","banana","ban","b"] |
Long invalid word ignored |
["k","ki","kir","kira","kiran","kiro"] |
Longest valid chain selection |
["a","ap","app","appl","apple","apply"] |
Equal length tie resolution |
Edge Cases
Words Without Complete Prefix Chains
A very common mistake is checking only whether some prefixes exist instead of requiring all prefixes.
For example:
["banana", "ban", "ba", "b"]
The word "banana" is invalid because prefixes like "bana" and "banan" do not exist.
The implementation avoids this bug by requiring word[:-1] to already be buildable. Since buildability propagates incrementally, every earlier prefix is guaranteed to exist as well.
Lexicographical Tie Breaking
Another tricky case occurs when multiple valid words share the same maximum length.
For example:
["apple", "apply"]
Both may be valid, but the problem requires the lexicographically smaller word.
Sorting the array lexicographically before processing naturally guarantees this behavior. Since "apple" appears before "apply", and we only replace the answer when finding a strictly longer word, "apple" remains the final answer.
Missing One Character Prefix
A word may appear almost valid but fail because exactly one prefix is missing.
Example:
["a", "ap", "apple"]
The word "apple" requires "appl" to exist, but it does not.
A naive implementation might incorrectly assume earlier prefixes are enough. Our solution correctly checks the immediate predecessor prefix:
word[:-1]
Since "appl" is absent, "apple" is rejected immediately.