LeetCode 524 - Longest Word in Dictionary through Deleting
The problem gives us a source string s and a list of candidate words called dictionary. We must determine which dictionary word can be formed by deleting characters from s without changing the relative order of the remaining characters.
Difficulty: 🟡 Medium
Topics: Array, Two Pointers, String, Sorting
Solution
Problem Understanding
The problem gives us a source string s and a list of candidate words called dictionary. We must determine which dictionary word can be formed by deleting characters from s without changing the relative order of the remaining characters.
This means we are looking for a subsequence relationship. A word is valid if we can scan through s from left to right and match all characters of the word in order. The characters do not need to be contiguous, but their ordering must remain the same.
For example, if s = "abpcplea" and the candidate word is "apple", we can form it by selecting the characters:
apple
from the original string while skipping unrelated characters.
The problem does not simply ask for any valid word. It asks for the best word according to two rules:
- Prefer the longest valid word.
- If multiple valid words have the same length, choose the lexicographically smallest one.
Lexicographical order is standard dictionary order. For example, "apple" is smaller than "apply" because the first differing character is 'e', which comes before 'y'.
The constraints are relatively small:
s.length <= 1000dictionary.length <= 1000- Each dictionary word length <= 1000`
These limits strongly suggest that checking each dictionary word individually is feasible. A solution with roughly O(dictionary_size × string_length) complexity is acceptable.
Several edge cases are important:
- No dictionary word can be formed, in which case we return
"". - Multiple valid words have the same maximum length, requiring lexicographical comparison.
- Very short strings, including single-character inputs.
- Dictionary words longer than
s, which can never be valid subsequences. - Duplicate or highly similar words that may expose tie-breaking bugs.
Approaches
Brute Force Approach
The brute-force idea is to generate every possible subsequence of s and check whether each subsequence exists in the dictionary.
A string of length n has 2^n possible subsequences because every character can either be included or excluded. After generating all subsequences, we could compare them against the dictionary and select the best valid answer according to the problem rules.
This approach is correct because it exhaustively explores all possible deletions from s. Any valid dictionary word must appear among the generated subsequences.
However, this method becomes completely impractical even for moderate input sizes. Since s.length can be up to 1000, generating 2^1000 subsequences is impossible.
Optimal Approach
The key observation is that we do not need to generate subsequences of s. Instead, we can reverse the process and test whether each dictionary word is a subsequence of s.
To check whether a word is a subsequence, we use the two pointers technique:
- One pointer scans
s - One pointer scans the candidate word
Whenever characters match, we advance both pointers. Otherwise, we only advance the pointer in s.
If we successfully consume all characters in the candidate word, then it is a valid subsequence.
While checking each dictionary word, we maintain the current best answer. We update it whenever:
- The new word is longer, or
- The lengths are equal and the new word is lexicographically smaller
This solution is efficient because each subsequence check runs in linear time relative to the lengths of the strings involved.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n) | O(2^n) | Generates all subsequences of s |
| Optimal | O(d × (n + m)) | O(1) | Checks each dictionary word using two pointers |
Here:
n = len(s)d = len(dictionary)m = average dictionary word length
Algorithm Walkthrough
- Initialize an empty string called
bestWord.
This variable stores the best valid dictionary word found so far.
2. Iterate through every word in dictionary.
We must examine every candidate because any word could potentially be the longest valid subsequence.
3. For each word, check whether it is a subsequence of s.
Use two pointers:
ipoints to characters insjpoints to characters in the candidate word
Traverse s from left to right.
4. Whenever s[i] == word[j], advance both pointers.
This means we successfully matched one more character of the candidate word.
5. Otherwise, advance only the pointer in s.
We are effectively deleting that character from consideration.
6. After finishing the scan, check whether j == len(word).
If true, every character in the candidate word was matched in order, so the word is a valid subsequence.
7. Compare the valid word with bestWord.
Update bestWord if:
- The current word is longer, or
- The lengths are equal and the current word is lexicographically smaller
- After processing all dictionary words, return
bestWord.
Why it works
The algorithm works because the two pointer subsequence check exactly models the allowed deletion process. By scanning s once and matching characters in order, we verify whether a dictionary word can be formed without rearranging characters.
Since every dictionary word is checked independently, no valid candidate is missed. The update rules guarantee that the stored answer is always the longest valid word, with lexicographical order used as the tie breaker.
Python Solution
from typing import List
class Solution:
def findLongestWord(self, s: str, dictionary: List[str]) -> str:
def is_subsequence(word: str, source: str) -> bool:
word_index = 0
source_index = 0
while word_index < len(word) and source_index < len(source):
if word[word_index] == source[source_index]:
word_index += 1
source_index += 1
return word_index == len(word)
best_word = ""
for word in dictionary:
if is_subsequence(word, s):
if len(word) > len(best_word):
best_word = word
elif len(word) == len(best_word) and word < best_word:
best_word = word
return best_word
The implementation begins with a helper function called is_subsequence. This function performs the two pointer scan between the candidate word and the source string s.
The variable word_index tracks progress through the dictionary word, while source_index scans through s.
Whenever characters match, we advance word_index, meaning one more character of the candidate has been successfully formed.
At the end of the scan, if word_index reached the full length of the candidate word, then every character was matched in order.
The main loop processes each dictionary word one at a time. Whenever a valid subsequence is found, the code compares it with the current best answer.
The update conditions exactly implement the problem requirements:
- Prefer longer words
- Use lexicographical ordering for ties
The final value of best_word is returned after all candidates are checked.
Go Solution
func findLongestWord(s string, dictionary []string) string {
isSubsequence := func(word string, source string) bool {
wordIndex := 0
sourceIndex := 0
for wordIndex < len(word) && sourceIndex < len(source) {
if word[wordIndex] == source[sourceIndex] {
wordIndex++
}
sourceIndex++
}
return wordIndex == len(word)
}
bestWord := ""
for _, word := range dictionary {
if isSubsequence(word, s) {
if len(word) > len(bestWord) {
bestWord = word
} else if len(word) == len(bestWord) && word < bestWord {
bestWord = word
}
}
}
return bestWord
}
The Go implementation follows the same logic as the Python version. A local helper function checks subsequence validity using two pointers.
One difference is that Go strings are indexed as byte slices. Since the problem guarantees lowercase English letters only, direct byte indexing is completely safe.
The empty string "" naturally handles the case where no valid dictionary word exists.
No additional memory allocations or complex data structures are required, so the implementation remains efficient and straightforward.
Worked Examples
Example 1
Input:
s = "abpcplea"
dictionary = ["ale","apple","monkey","plea"]
We process each dictionary word one at a time.
Checking "ale"
| Step | s Character | Word Character | Match | word_index |
|---|---|---|---|---|
| 1 | a | a | Yes | 1 |
| 2 | b | l | No | 1 |
| 3 | p | l | No | 1 |
| 4 | c | l | No | 1 |
| 5 | p | l | No | 1 |
| 6 | l | l | Yes | 2 |
| 7 | e | e | Yes | 3 |
All characters matched, so "ale" is valid.
Current best word:
"ale"
Checking "apple"
| Step | s Character | Word Character | Match | word_index |
|---|---|---|---|---|
| 1 | a | a | Yes | 1 |
| 2 | b | p | No | 1 |
| 3 | p | p | Yes | 2 |
| 4 | c | p | No | 2 |
| 5 | p | p | Yes | 3 |
| 6 | l | l | Yes | 4 |
| 7 | e | e | Yes | 5 |
"apple" is valid and longer than "ale".
Current best word:
"apple"
Checking "monkey"
The first character 'm' never appears in s.
Result:
Invalid subsequence
Checking "plea"
The word is valid, but its length is smaller than "apple".
Final answer:
"apple"
Example 2
Input:
s = "abpcplea"
dictionary = ["a","b","c"]
Checking "a"
The first character matches immediately.
Current best word:
"a"
Checking "b"
"b" is also valid, but "a" and "b" have equal length.
Lexicographically:
"a" < "b"
So "a" remains the answer.
Checking "c"
Same reasoning applies.
Final answer:
"a"
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(d × (n + m)) | Each dictionary word is checked against s using two pointers |
| Space | O(1) | Only pointer variables and the answer string are used |
Here:
dis the number of dictionary wordsnis the length ofsmis the average dictionary word length
For each dictionary word, we scan through s at most once. Since the constraints are only up to 1000, this solution easily fits within acceptable limits.
Test Cases
solution = Solution()
assert solution.findLongestWord(
"abpcplea",
["ale", "apple", "monkey", "plea"]
) == "apple" # basic example with longest valid word
assert solution.findLongestWord(
"abpcplea",
["a", "b", "c"]
) == "a" # lexicographical tie breaking
assert solution.findLongestWord(
"abc",
["d", "e", "f"]
) == "" # no valid subsequence exists
assert solution.findLongestWord(
"abcde",
["a", "ab", "abc"]
) == "abc" # longest subsequence chosen
assert solution.findLongestWord(
"bab",
["ba", "ab", "a", "b"]
) == "ab" # equal length tie resolved lexicographically
assert solution.findLongestWord(
"aaaaa",
["aaa", "aaaa", "aaaaa"]
) == "aaaaa" # exact full-string match
assert solution.findLongestWord(
"abc",
["abcd", "abcde"]
) == "" # dictionary words longer than source
assert solution.findLongestWord(
"xyz",
["x", "xy", "xyz"]
) == "xyz" # entire string is subsequence
assert solution.findLongestWord(
"leetcode",
["leet", "code", "leetcode"]
) == "leetcode" # full word available
assert solution.findLongestWord(
"a",
["a", "b"]
) == "a" # smallest possible valid input
| Test | Why |
|---|---|
"abpcplea" with "apple" |
Standard example |
["a","b","c"] |
Lexicographical tie handling |
| No valid words | Ensures empty string return |
| Increasing subsequence sizes | Confirms longest word selection |
"ba" vs "ab" |
Tie-breaking correctness |
| Repeated characters | Tests repeated matching logic |
| Words longer than source | Confirms impossible candidates are rejected |
| Full string match | Validates exact subsequence handling |
"leetcode" exact match |
Ensures complete match works |
| Single-character input | Boundary condition |
Edge Cases
No Valid Dictionary Word
A common bug is forgetting to handle the situation where no dictionary word can be formed from s. In this case, the correct answer is the empty string.
The implementation handles this naturally because best_word starts as "" and is only updated when a valid subsequence is found.
Multiple Words With Same Length
Tie-breaking is easy to implement incorrectly. Suppose both "ab" and "ba" are valid subsequences with equal length. The problem requires returning the lexicographically smaller word.
The implementation explicitly checks:
elif len(word) == len(best_word) and word < best_word:
This guarantees the correct dictionary ordering behavior.
Dictionary Words Longer Than s
A dictionary word longer than the source string can never be a subsequence. Some implementations attempt unnecessary processing or produce index errors.
The two pointer logic handles this safely. Since the scan over s finishes before all characters in the candidate word are matched, the function correctly returns False.
Repeated Characters
Strings with repeated characters can expose subtle pointer bugs. For example:
s = "aaaaa"
word = "aaaa"
The implementation advances the candidate pointer only when characters match, ensuring repeated characters are matched one at a time in the correct order.