LeetCode 720 - Longest Word in Dictionary

This problem gives us an array of lowercase English words and asks us to find the longest word that can be constructed one character at a time using other words from the same dictionary. More specifically, suppose we have a candidate word like "world".

LeetCode Problem 720

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Trie, Sorting

Solution

LeetCode 720, Longest Word in Dictionary

Problem Understanding

This problem gives us an array of lowercase English words and asks us to find the longest word that can be constructed one character at a time using other words from the same dictionary.

More specifically, suppose we have a candidate word like "world". For this word to be valid, every prefix formed by removing characters from the end must also exist in the dictionary:

  • "w"
  • "wo"
  • "wor"
  • "worl"
  • "world"

All of these must appear in the input array.

The problem is not asking for the longest word overall. Instead, it asks for the longest word whose entire chain of prefixes also exists in the dictionary.

If multiple words satisfy the condition and have the same maximum length, we return the lexicographically smallest one. Lexicographical order means normal dictionary order, so "apple" comes before "apply" because 'e' < 'y'.

The input consists of:

  • words, an array of lowercase strings
  • Each word length is between 1 and 30
  • The number of words is at most 1000

The constraints are relatively small, which means several approaches are feasible. However, we still want an efficient and clean solution.

An important detail is that prefixes must themselves be complete words in the dictionary. A substring existing inside another word does not matter. Only prefix chains matter.

Several edge cases are important to think about:

  • A word may exist without its required prefixes
  • Multiple longest candidates may exist
  • The dictionary may contain only one-letter words
  • No valid chain longer than length 1 may exist
  • Words may appear in arbitrary order

For example:

["banana", "ban", "ba", "b"]

Even though "banana" exists, it is invalid because prefixes like "bana" and "banan" do not exist.

The problem guarantees that all words contain only lowercase English letters, which simplifies comparison and storage.

Approaches

Brute Force Approach

A straightforward solution is to examine every word individually and verify whether all of its prefixes exist in the dictionary.

We can first place all words into a hash set for efficient lookup. Then, for each word:

  1. Generate every prefix from length 1 up to len(word) - 1
  2. Check whether each prefix exists in the set
  3. If every prefix exists, the word is valid
  4. Keep track of the best valid word found so far

This approach works correctly because a word is valid exactly when all of its prefixes are present.

However, the method repeatedly rebuilds prefixes for many words. While still acceptable under these constraints, we can improve the organization and elegance of the solution.

Key Insight for the Optimal Solution

The critical observation is that a word is valid only if its immediate predecessor prefix is already valid.

For example:

  • "ap" is valid if "a" exists
  • "app" is valid if "ap" is valid
  • "appl" is valid if "app" is valid

This naturally suggests processing words in sorted order.

If we sort words lexicographically, shorter prefixes appear before longer extensions:

"a", "ap", "app", "appl", "apple"

We can maintain a set of valid buildable words:

  • A one-character word is automatically valid
  • A longer word is valid if word[:-1] already exists in the valid set

This avoids checking every prefix repeatedly. Instead, each word only checks its immediate predecessor.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n × m²) O(n) Checks every prefix of every word independently
Optimal O(n log n + n × m) O(n) Uses sorting and incremental validation

Where:

  • n = number of words
  • m = maximum word length

Algorithm Walkthrough

Optimal Algorithm

  1. Sort the words lexicographically.

Sorting ensures that shorter prefixes are processed before longer words that depend on them. For example, "app" will appear before "apple". 2. Create a set called buildable.

This set stores all words that can successfully be constructed one character at a time. 3. Initialize an empty string answer.

This variable keeps track of the best valid word found so far. 4. Iterate through each word in sorted order.

For every word, determine whether it can be built from smaller valid words. 5. If the word length is 1, it is automatically buildable.

A single character word has no smaller prefixes to validate. 6. Otherwise, check whether word[:-1] exists in buildable.

This works because if the immediate prefix is buildable, then all earlier prefixes must also already be valid. 7. If the word is valid:

  • Add it to buildable

  • Compare it against answer

  • Update answer if:

  • the word is longer, or

  • the word has equal length but is lexicographically smaller

  1. After processing all words, return answer.

Why it works

The algorithm maintains the invariant that every word inside buildable can be constructed one character at a time from other valid words.

Because words are processed in sorted order, when we evaluate a word like "apple", its required predecessor "appl" has already been processed. If "appl" is buildable, then by induction all earlier prefixes are also buildable.

Therefore, checking only word[:-1] is sufficient to guarantee the entire prefix chain exists.

Python Solution

from typing import List

class Solution:
    def longestWord(self, words: List[str]) -> str:
        words.sort()

        buildable = set()
        answer = ""

        for word in words:
            if len(word) == 1 or word[:-1] in buildable:
                buildable.add(word)

                if len(word) > len(answer):
                    answer = word

        return answer

The implementation begins by sorting the input list lexicographically. This ordering is important because prefixes naturally appear before their longer extensions.

The buildable set stores all words that satisfy the construction rule. Whenever we process a word, we check whether it is valid:

  • Single-character words are always valid
  • Longer words require their immediate prefix to already exist in buildable

Once a word is confirmed valid, we insert it into the set.

The answer variable tracks the longest valid word encountered so far. Because the array is lexicographically sorted beforehand, ties automatically favor the smaller lexicographical word. Therefore, we only need to update answer when we find a strictly longer word.

This creates a concise and efficient solution.

Go Solution

package main

import (
	"sort"
)

func longestWord(words []string) string {
	sort.Strings(words)

	buildable := make(map[string]bool)
	answer := ""

	for _, word := range words {
		if len(word) == 1 || buildable[word[:len(word)-1]] {
			buildable[word] = true

			if len(word) > len(answer) {
				answer = word
			}
		}
	}

	return answer
}

The Go implementation follows the same logic as the Python version.

The main difference is that Go uses a map[string]bool instead of a Python set. The expression:

buildable[word[:len(word)-1]]

checks whether the prefix already exists.

Go strings support slicing by index, which allows easy extraction of the immediate prefix.

Since the input constraints are small, there are no concerns about integer overflow or excessive memory usage.

Worked Examples

Example 1

Input:
["w","wo","wor","worl","world"]

After sorting:

["w","wo","wor","worl","world"]
Current Word Prefix Needed Valid? Buildable Set Answer
"w" none Yes {"w"} "w"
"wo" "w" Yes {"w","wo"} "wo"
"wor" "wo" Yes {"w","wo","wor"} "wor"
"worl" "wor" Yes {"w","wo","wor","worl"} "worl"
"world" "worl" Yes {"w","wo","wor","worl","world"} "world"

Final answer:

"world"

Example 2

Input:
["a","banana","app","appl","ap","apply","apple"]

After sorting:

["a","ap","app","apple","appl","apply","banana"]
Current Word Prefix Needed Valid? Buildable Set Answer
"a" none Yes {"a"} "a"
"ap" "a" Yes {"a","ap"} "ap"
"app" "ap" Yes {"a","ap","app"} "app"
"apple" "appl" No unchanged "app"
"appl" "app" Yes {"a","ap","app","appl"} "appl"
"apply" "appl" Yes {"a","ap","app","appl","apply"} "apply"
"banana" "banan" No unchanged "apply"

At first glance, it seems "apple" was skipped because "appl" had not yet been processed. This is why sorting order matters carefully.

A better sorted order is:

["a","ap","app","appl","apple","apply","banana"]

Then:

Current Word Prefix Needed Valid? Answer
"a" none Yes "a"
"ap" "a" Yes "ap"
"app" "ap" Yes "app"
"appl" "app" Yes "appl"
"apple" "appl" Yes "apple"
"apply" "appl" Yes "apple"

Both "apple" and "apply" are length 5, but "apple" is lexicographically smaller, so it remains the answer.

Complexity Analysis

Measure Complexity Explanation
Time O(n log n + n × m) Sorting dominates initially, then each word performs one prefix lookup
Space O(n) The buildable set stores valid words

The sorting step costs O(n log n).

After sorting, each word performs:

  • One substring operation
  • One hash lookup
  • Possibly one insertion

Each substring operation costs up to O(m) where m is the maximum word length.

Therefore, the total processing cost is O(n × m) after sorting.

The auxiliary space is O(n) because the set may store every word.

Test Cases

from typing import List

class Solution:
    def longestWord(self, words: List[str]) -> str:
        words.sort()

        buildable = set()
        answer = ""

        for word in words:
            if len(word) == 1 or word[:-1] in buildable:
                buildable.add(word)

                if len(word) > len(answer):
                    answer = word

        return answer

solution = Solution()

assert solution.longestWord(
    ["w", "wo", "wor", "worl", "world"]
) == "world"  # basic incremental chain

assert solution.longestWord(
    ["a", "banana", "app", "appl", "ap", "apply", "apple"]
) == "apple"  # lexicographical tie breaking

assert solution.longestWord(
    ["a", "b", "ba", "bca", "bda", "bdca"]
) == "ba"  # longer words missing prefixes

assert solution.longestWord(
    ["x"]
) == "x"  # single word input

assert solution.longestWord(
    ["abc", "ab", "a"]
) == "abc"  # complete prefix chain exists

assert solution.longestWord(
    ["abc"]
) == ""  # missing required prefixes

assert solution.longestWord(
    ["a", "ar", "art", "arti", "artic", "articl", "article"]
) == "article"  # long valid chain

assert solution.longestWord(
    ["a", "banana", "ban", "b"]
) == "a"  # invalid long words ignored

assert solution.longestWord(
    ["k", "ki", "kir", "kira", "kiran", "kiro"]
) == "kiran"  # longest valid chain wins

assert solution.longestWord(
    ["a", "ap", "app", "appl", "apple", "apply"]
) == "apple"  # equal length lexicographical comparison

Test Case Summary

Test Why
["w","wo","wor","worl","world"] Standard successful build chain
["a","banana","app","appl","ap","apply","apple"] Lexicographical tie handling
["a","b","ba","bca","bda","bdca"] Invalid longer chains
["x"] Smallest possible input
["abc","ab","a"] Full prefix hierarchy
["abc"] No valid prefixes
["a","ar","art","arti","artic","articl","article"] Long valid sequence
["a","banana","ban","b"] Long invalid word ignored
["k","ki","kir","kira","kiran","kiro"] Longest valid chain selection
["a","ap","app","appl","apple","apply"] Equal length tie resolution

Edge Cases

Words Without Complete Prefix Chains

A very common mistake is checking only whether some prefixes exist instead of requiring all prefixes.

For example:

["banana", "ban", "ba", "b"]

The word "banana" is invalid because prefixes like "bana" and "banan" do not exist.

The implementation avoids this bug by requiring word[:-1] to already be buildable. Since buildability propagates incrementally, every earlier prefix is guaranteed to exist as well.

Lexicographical Tie Breaking

Another tricky case occurs when multiple valid words share the same maximum length.

For example:

["apple", "apply"]

Both may be valid, but the problem requires the lexicographically smaller word.

Sorting the array lexicographically before processing naturally guarantees this behavior. Since "apple" appears before "apply", and we only replace the answer when finding a strictly longer word, "apple" remains the final answer.

Missing One Character Prefix

A word may appear almost valid but fail because exactly one prefix is missing.

Example:

["a", "ap", "apple"]

The word "apple" requires "appl" to exist, but it does not.

A naive implementation might incorrectly assume earlier prefixes are enough. Our solution correctly checks the immediate predecessor prefix:

word[:-1]

Since "appl" is absent, "apple" is rejected immediately.