LeetCode 524 - Longest Word in Dictionary through Deleting

The problem gives us a source string s and a list of candidate words called dictionary. We must determine which dictionary word can be formed by deleting characters from s without changing the relative order of the remaining characters.

LeetCode Problem 524

Difficulty: 🟡 Medium
Topics: Array, Two Pointers, String, Sorting

Solution

Problem Understanding

The problem gives us a source string s and a list of candidate words called dictionary. We must determine which dictionary word can be formed by deleting characters from s without changing the relative order of the remaining characters.

This means we are looking for a subsequence relationship. A word is valid if we can scan through s from left to right and match all characters of the word in order. The characters do not need to be contiguous, but their ordering must remain the same.

For example, if s = "abpcplea" and the candidate word is "apple", we can form it by selecting the characters:

  • a
  • p
  • p
  • l
  • e

from the original string while skipping unrelated characters.

The problem does not simply ask for any valid word. It asks for the best word according to two rules:

  1. Prefer the longest valid word.
  2. If multiple valid words have the same length, choose the lexicographically smallest one.

Lexicographical order is standard dictionary order. For example, "apple" is smaller than "apply" because the first differing character is 'e', which comes before 'y'.

The constraints are relatively small:

  • s.length <= 1000
  • dictionary.length <= 1000
  • Each dictionary word length <= 1000`

These limits strongly suggest that checking each dictionary word individually is feasible. A solution with roughly O(dictionary_size × string_length) complexity is acceptable.

Several edge cases are important:

  • No dictionary word can be formed, in which case we return "".
  • Multiple valid words have the same maximum length, requiring lexicographical comparison.
  • Very short strings, including single-character inputs.
  • Dictionary words longer than s, which can never be valid subsequences.
  • Duplicate or highly similar words that may expose tie-breaking bugs.

Approaches

Brute Force Approach

The brute-force idea is to generate every possible subsequence of s and check whether each subsequence exists in the dictionary.

A string of length n has 2^n possible subsequences because every character can either be included or excluded. After generating all subsequences, we could compare them against the dictionary and select the best valid answer according to the problem rules.

This approach is correct because it exhaustively explores all possible deletions from s. Any valid dictionary word must appear among the generated subsequences.

However, this method becomes completely impractical even for moderate input sizes. Since s.length can be up to 1000, generating 2^1000 subsequences is impossible.

Optimal Approach

The key observation is that we do not need to generate subsequences of s. Instead, we can reverse the process and test whether each dictionary word is a subsequence of s.

To check whether a word is a subsequence, we use the two pointers technique:

  • One pointer scans s
  • One pointer scans the candidate word

Whenever characters match, we advance both pointers. Otherwise, we only advance the pointer in s.

If we successfully consume all characters in the candidate word, then it is a valid subsequence.

While checking each dictionary word, we maintain the current best answer. We update it whenever:

  • The new word is longer, or
  • The lengths are equal and the new word is lexicographically smaller

This solution is efficient because each subsequence check runs in linear time relative to the lengths of the strings involved.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n) O(2^n) Generates all subsequences of s
Optimal O(d × (n + m)) O(1) Checks each dictionary word using two pointers

Here:

  • n = len(s)
  • d = len(dictionary)
  • m = average dictionary word length

Algorithm Walkthrough

  1. Initialize an empty string called bestWord.

This variable stores the best valid dictionary word found so far. 2. Iterate through every word in dictionary.

We must examine every candidate because any word could potentially be the longest valid subsequence. 3. For each word, check whether it is a subsequence of s.

Use two pointers:

  • i points to characters in s
  • j points to characters in the candidate word

Traverse s from left to right. 4. Whenever s[i] == word[j], advance both pointers.

This means we successfully matched one more character of the candidate word. 5. Otherwise, advance only the pointer in s.

We are effectively deleting that character from consideration. 6. After finishing the scan, check whether j == len(word).

If true, every character in the candidate word was matched in order, so the word is a valid subsequence. 7. Compare the valid word with bestWord.

Update bestWord if:

  • The current word is longer, or
  • The lengths are equal and the current word is lexicographically smaller
  1. After processing all dictionary words, return bestWord.

Why it works

The algorithm works because the two pointer subsequence check exactly models the allowed deletion process. By scanning s once and matching characters in order, we verify whether a dictionary word can be formed without rearranging characters.

Since every dictionary word is checked independently, no valid candidate is missed. The update rules guarantee that the stored answer is always the longest valid word, with lexicographical order used as the tie breaker.

Python Solution

from typing import List

class Solution:
    def findLongestWord(self, s: str, dictionary: List[str]) -> str:
        
        def is_subsequence(word: str, source: str) -> bool:
            word_index = 0
            source_index = 0

            while word_index < len(word) and source_index < len(source):
                if word[word_index] == source[source_index]:
                    word_index += 1

                source_index += 1

            return word_index == len(word)

        best_word = ""

        for word in dictionary:
            if is_subsequence(word, s):
                if len(word) > len(best_word):
                    best_word = word
                elif len(word) == len(best_word) and word < best_word:
                    best_word = word

        return best_word

The implementation begins with a helper function called is_subsequence. This function performs the two pointer scan between the candidate word and the source string s.

The variable word_index tracks progress through the dictionary word, while source_index scans through s.

Whenever characters match, we advance word_index, meaning one more character of the candidate has been successfully formed.

At the end of the scan, if word_index reached the full length of the candidate word, then every character was matched in order.

The main loop processes each dictionary word one at a time. Whenever a valid subsequence is found, the code compares it with the current best answer.

The update conditions exactly implement the problem requirements:

  • Prefer longer words
  • Use lexicographical ordering for ties

The final value of best_word is returned after all candidates are checked.

Go Solution

func findLongestWord(s string, dictionary []string) string {

	isSubsequence := func(word string, source string) bool {
		wordIndex := 0
		sourceIndex := 0

		for wordIndex < len(word) && sourceIndex < len(source) {
			if word[wordIndex] == source[sourceIndex] {
				wordIndex++
			}

			sourceIndex++
		}

		return wordIndex == len(word)
	}

	bestWord := ""

	for _, word := range dictionary {
		if isSubsequence(word, s) {

			if len(word) > len(bestWord) {
				bestWord = word
			} else if len(word) == len(bestWord) && word < bestWord {
				bestWord = word
			}
		}
	}

	return bestWord
}

The Go implementation follows the same logic as the Python version. A local helper function checks subsequence validity using two pointers.

One difference is that Go strings are indexed as byte slices. Since the problem guarantees lowercase English letters only, direct byte indexing is completely safe.

The empty string "" naturally handles the case where no valid dictionary word exists.

No additional memory allocations or complex data structures are required, so the implementation remains efficient and straightforward.

Worked Examples

Example 1

Input:
s = "abpcplea"
dictionary = ["ale","apple","monkey","plea"]

We process each dictionary word one at a time.

Checking "ale"

Step s Character Word Character Match word_index
1 a a Yes 1
2 b l No 1
3 p l No 1
4 c l No 1
5 p l No 1
6 l l Yes 2
7 e e Yes 3

All characters matched, so "ale" is valid.

Current best word:

"ale"

Checking "apple"

Step s Character Word Character Match word_index
1 a a Yes 1
2 b p No 1
3 p p Yes 2
4 c p No 2
5 p p Yes 3
6 l l Yes 4
7 e e Yes 5

"apple" is valid and longer than "ale".

Current best word:

"apple"

Checking "monkey"

The first character 'm' never appears in s.

Result:

Invalid subsequence

Checking "plea"

The word is valid, but its length is smaller than "apple".

Final answer:

"apple"

Example 2

Input:
s = "abpcplea"
dictionary = ["a","b","c"]

Checking "a"

The first character matches immediately.

Current best word:

"a"

Checking "b"

"b" is also valid, but "a" and "b" have equal length.

Lexicographically:

"a" < "b"

So "a" remains the answer.

Checking "c"

Same reasoning applies.

Final answer:

"a"

Complexity Analysis

Measure Complexity Explanation
Time O(d × (n + m)) Each dictionary word is checked against s using two pointers
Space O(1) Only pointer variables and the answer string are used

Here:

  • d is the number of dictionary words
  • n is the length of s
  • m is the average dictionary word length

For each dictionary word, we scan through s at most once. Since the constraints are only up to 1000, this solution easily fits within acceptable limits.

Test Cases

solution = Solution()

assert solution.findLongestWord(
    "abpcplea",
    ["ale", "apple", "monkey", "plea"]
) == "apple"  # basic example with longest valid word

assert solution.findLongestWord(
    "abpcplea",
    ["a", "b", "c"]
) == "a"  # lexicographical tie breaking

assert solution.findLongestWord(
    "abc",
    ["d", "e", "f"]
) == ""  # no valid subsequence exists

assert solution.findLongestWord(
    "abcde",
    ["a", "ab", "abc"]
) == "abc"  # longest subsequence chosen

assert solution.findLongestWord(
    "bab",
    ["ba", "ab", "a", "b"]
) == "ab"  # equal length tie resolved lexicographically

assert solution.findLongestWord(
    "aaaaa",
    ["aaa", "aaaa", "aaaaa"]
) == "aaaaa"  # exact full-string match

assert solution.findLongestWord(
    "abc",
    ["abcd", "abcde"]
) == ""  # dictionary words longer than source

assert solution.findLongestWord(
    "xyz",
    ["x", "xy", "xyz"]
) == "xyz"  # entire string is subsequence

assert solution.findLongestWord(
    "leetcode",
    ["leet", "code", "leetcode"]
) == "leetcode"  # full word available

assert solution.findLongestWord(
    "a",
    ["a", "b"]
) == "a"  # smallest possible valid input
Test Why
"abpcplea" with "apple" Standard example
["a","b","c"] Lexicographical tie handling
No valid words Ensures empty string return
Increasing subsequence sizes Confirms longest word selection
"ba" vs "ab" Tie-breaking correctness
Repeated characters Tests repeated matching logic
Words longer than source Confirms impossible candidates are rejected
Full string match Validates exact subsequence handling
"leetcode" exact match Ensures complete match works
Single-character input Boundary condition

Edge Cases

No Valid Dictionary Word

A common bug is forgetting to handle the situation where no dictionary word can be formed from s. In this case, the correct answer is the empty string.

The implementation handles this naturally because best_word starts as "" and is only updated when a valid subsequence is found.

Multiple Words With Same Length

Tie-breaking is easy to implement incorrectly. Suppose both "ab" and "ba" are valid subsequences with equal length. The problem requires returning the lexicographically smaller word.

The implementation explicitly checks:

elif len(word) == len(best_word) and word < best_word:

This guarantees the correct dictionary ordering behavior.

Dictionary Words Longer Than s

A dictionary word longer than the source string can never be a subsequence. Some implementations attempt unnecessary processing or produce index errors.

The two pointer logic handles this safely. Since the scan over s finishes before all characters in the candidate word are matched, the function correctly returns False.

Repeated Characters

Strings with repeated characters can expose subtle pointer bugs. For example:

s = "aaaaa"
word = "aaaa"

The implementation advances the candidate pointer only when characters match, ensuring repeated characters are matched one at a time in the correct order.