LeetCode 966 - Vowel Spellchecker
The problem asks us to build a spellchecker with three levels of matching priority. For every query word, we must search the wordlist and return the best matching word according to a strict precedence order. The first and highest priority rule is an exact match.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String
Solution
Problem Understanding
The problem asks us to build a spellchecker with three levels of matching priority. For every query word, we must search the wordlist and return the best matching word according to a strict precedence order.
The first and highest priority rule is an exact match. If the query exactly matches a word in the wordlist, including the same capitalization, we immediately return that exact word.
If there is no exact match, we try a case-insensitive match. This means words are considered equal if their lowercase forms are the same. When multiple words satisfy this rule, we must return the first one that appeared in the original wordlist.
If there is still no match, we try a vowel-error match. In this rule, vowels are treated as interchangeable. After converting both words to lowercase, every vowel character (a, e, i, o, u) is considered equivalent. For example:
-
"KiTe"and"keti"match because: -
lowercase forms are
"kite"and"keti" -
replacing vowels with a placeholder produces the same normalized form
Again, if multiple words match, we return the first occurrence from the original wordlist.
If none of the rules produce a match, we return an empty string.
The input consists of:
wordlist, the dictionary of valid wordsqueries, the list of words we must correct
The output is a list where each query is replaced by the best matching word according to the precedence rules.
The constraints are relatively small:
- Up to 5000 words
- Each word length up to 7
Even though the input size is not enormous, a naive implementation that repeatedly scans the entire wordlist for every query can still become inefficient. Since each query may require several kinds of comparisons, preprocessing the dictionary into efficient lookup structures is the natural optimization.
There are several important edge cases:
- Multiple words may map to the same lowercase form, such as
"KiTe"and"kite". We must keep the first occurrence. - Multiple words may map to the same vowel-normalized form. Again, the first occurrence must be preserved.
- Exact matches must always override case-insensitive or vowel matches.
- Queries with missing or extra characters cannot match, even if vowels differ.
- Words with no vowels should still work correctly.
Approaches
Brute Force Approach
The simplest solution is to process each query independently and scan the entire wordlist multiple times.
For every query:
- First scan for an exact match.
- If none exists, scan again for a case-insensitive match.
- If none exists, scan again for a vowel-normalized match.
To implement vowel matching, we can normalize words by:
- converting to lowercase
- replacing every vowel with a common placeholder such as
'*'
For example:
"KiTe"→"k*t*""keti"→"k*t*"
This brute-force approach is correct because it explicitly checks the rules in priority order and returns the first valid match.
However, it is inefficient because every query may require scanning the entire wordlist multiple times. With up to 5000 queries and 5000 dictionary words, this becomes unnecessarily expensive.
Optimal Approach
The key observation is that the matching rules depend on deterministic transformations of the words.
We can preprocess the wordlist into three lookup structures:
- A set for exact matches
- A hash map from lowercase word → first matching dictionary word
- A hash map from vowel-normalized word → first matching dictionary word
Then each query can be answered in constant time lookups.
The crucial detail is preserving the first occurrence from the original wordlist. We achieve this by only inserting into the hash maps if the key does not already exist.
The preprocessing transforms repeated scanning into efficient hash lookups.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(Q × W × L) | O(1) | Repeatedly scans the wordlist for every query |
| Optimal | O(W × L + Q × L) | O(W × L) | Uses preprocessing with hash maps for fast lookup |
Where:
Q= number of queriesW= number of words in wordlistL= maximum word length
Algorithm Walkthrough
Step 1: Create an Exact Match Set
We store every word from wordlist in a hash set.
This allows constant time checking for the highest-priority rule, exact case-sensitive matches.
For example:
{"KiTe", "kite", "hare", "Hare"}
Step 2: Build a Case-Insensitive Map
For every word in the wordlist:
- Convert it to lowercase
- Store:
- lowercase word → original word
We only insert if the lowercase key does not already exist.
This preserves the first occurrence rule.
Example:
| Word | Lowercase Key | Stored Value |
|---|---|---|
| KiTe | kite | KiTe |
| kite | kite | ignored |
| hare | hare | hare |
| Hare | hare | ignored |
Final map:
{
"kite": "KiTe",
"hare": "hare"
}
Step 3: Build a Vowel-Normalized Map
We define a normalization function:
- Convert the word to lowercase
- Replace every vowel with
'*'
Examples:
| Word | Normalized |
|---|---|
| KiTe | k_t_ |
| keto | k_t_ |
| Hare | h_r_ |
Again, we only store the first occurrence.
Example map:
{
"k*t*": "KiTe",
"h*r*": "hare"
}
Step 4: Process Each Query
For every query:
- Check exact match
- Check lowercase match
- Check vowel-normalized match
- Otherwise return
""
The order is critical because the problem specifies strict precedence rules.
Step 5: Collect Results
Append the chosen result for each query into the answer list.
Why it works
The algorithm works because every matching rule corresponds to a deterministic transformation:
- Exact matching uses the original word
- Case-insensitive matching uses lowercase normalization
- Vowel-error matching uses vowel normalization
By preprocessing the dictionary into hash maps keyed by these normalized forms, we can directly retrieve the correct first matching word. Since queries are checked in strict priority order, the returned result always follows the required precedence rules.
Python Solution
from typing import List
class Solution:
def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
vowels = set("aeiou")
def normalize(word: str) -> str:
chars = []
for ch in word.lower():
if ch in vowels:
chars.append("*")
else:
chars.append(ch)
return "".join(chars)
exact_words = set(wordlist)
case_map = {}
vowel_map = {}
for word in wordlist:
lower_word = word.lower()
normalized_word = normalize(word)
if lower_word not in case_map:
case_map[lower_word] = word
if normalized_word not in vowel_map:
vowel_map[normalized_word] = word
result = []
for query in queries:
if query in exact_words:
result.append(query)
continue
lower_query = query.lower()
if lower_query in case_map:
result.append(case_map[lower_query])
continue
normalized_query = normalize(query)
if normalized_query in vowel_map:
result.append(vowel_map[normalized_query])
continue
result.append("")
return result
The implementation begins by defining a normalization helper function. This function converts a word to lowercase and replaces every vowel with '*'. Using a shared placeholder ensures that all vowel variations map to the same normalized representation.
The exact_words set supports constant time exact matching.
The case_map dictionary stores the first word associated with each lowercase form. The condition:
if lower_word not in case_map:
is essential because the problem requires preserving the first matching word from the original wordlist.
The vowel_map dictionary works similarly, but uses the vowel-normalized representation as the key.
When processing queries, the implementation checks the rules in exact precedence order:
- Exact match
- Case-insensitive match
- Vowel-error match
As soon as a match is found, the corresponding word is appended to the result list.
Go Solution
package main
import "strings"
func spellchecker(wordlist []string, queries []string) []string {
vowels := map[byte]bool{
'a': true,
'e': true,
'i': true,
'o': true,
'u': true,
}
normalize := func(word string) string {
word = strings.ToLower(word)
chars := []byte(word)
for i := 0; i < len(chars); i++ {
if vowels[chars[i]] {
chars[i] = '*'
}
}
return string(chars)
}
exactWords := make(map[string]bool)
caseMap := make(map[string]string)
vowelMap := make(map[string]string)
for _, word := range wordlist {
exactWords[word] = true
lowerWord := strings.ToLower(word)
normalizedWord := normalize(word)
if _, exists := caseMap[lowerWord]; !exists {
caseMap[lowerWord] = word
}
if _, exists := vowelMap[normalizedWord]; !exists {
vowelMap[normalizedWord] = word
}
}
result := make([]string, 0, len(queries))
for _, query := range queries {
if exactWords[query] {
result = append(result, query)
continue
}
lowerQuery := strings.ToLower(query)
if word, exists := caseMap[lowerQuery]; exists {
result = append(result, word)
continue
}
normalizedQuery := normalize(query)
if word, exists := vowelMap[normalizedQuery]; exists {
result = append(result, word)
continue
}
result = append(result, "")
}
return result
}
The Go solution follows the same algorithmic structure as the Python version.
A few Go-specific details are worth noting:
- Go does not have a built-in set type, so
map[string]boolis used for exact matches. - Strings are immutable in Go, so normalization converts the string into a byte slice before modifying characters.
- The
existsidiom is used to preserve first occurrences in the hash maps. - The result slice is initialized with capacity equal to the number of queries for efficiency.
Worked Examples
Example 1
wordlist = ["KiTe","kite","hare","Hare"]
queries = [
"kite","Kite","KiTe","Hare",
"HARE","Hear","hear","keti",
"keet","keto"
]
Preprocessing
Exact Set
| Word |
|---|
| KiTe |
| kite |
| hare |
| Hare |
Case Map
| Lowercase | Stored Word |
|---|---|
| kite | KiTe |
| hare | hare |
Vowel Map
| Normalized | Stored Word |
|---|---|
| k_t_ | KiTe |
| h_r_ | hare |
Query Processing
| Query | Exact Match | Case Match | Vowel Match | Result |
|---|---|---|---|---|
| kite | yes | - | - | kite |
| Kite | no | KiTe | - | KiTe |
| KiTe | yes | - | - | KiTe |
| Hare | yes | - | - | Hare |
| HARE | no | hare | - | hare |
| Hear | no | no | no | "" |
| hear | no | no | no | "" |
| keti | no | no | KiTe | KiTe |
| keet | no | no | no | "" |
| keto | no | no | KiTe | KiTe |
Final output:
["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]
Example 2
wordlist = ["yellow"]
queries = ["YellOw"]
Preprocessing
Exact Set
{"yellow"}
Case Map
| Lowercase | Stored Word |
|---|---|
| yellow | yellow |
Vowel Map
| Normalized | Stored Word |
|---|---|
| y_ll_w | yellow |
Query Processing
| Query | Exact Match | Case Match | Result |
|---|---|---|---|
| YellOw | no | yellow | yellow |
Final output:
["yellow"]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(W × L + Q × L) | Each word and query is processed a constant number of times |
| Space | O(W × L) | Hash maps and sets store transformed versions of the words |
The preprocessing phase processes every dictionary word once, and each normalization takes O(L) time where L is the word length.
Each query also performs a constant number of hash lookups and at most one normalization operation, resulting in O(L) time per query.
The space usage comes from storing:
- the exact set
- the lowercase map
- the vowel-normalized map
All together, these structures store at most one transformed representation per word.
Test Cases
from typing import List
class Solution:
def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
vowels = set("aeiou")
def normalize(word: str) -> str:
chars = []
for ch in word.lower():
if ch in vowels:
chars.append("*")
else:
chars.append(ch)
return "".join(chars)
exact_words = set(wordlist)
case_map = {}
vowel_map = {}
for word in wordlist:
lower_word = word.lower()
normalized_word = normalize(word)
if lower_word not in case_map:
case_map[lower_word] = word
if normalized_word not in vowel_map:
vowel_map[normalized_word] = word
result = []
for query in queries:
if query in exact_words:
result.append(query)
elif query.lower() in case_map:
result.append(case_map[query.lower()])
elif normalize(query) in vowel_map:
result.append(vowel_map[normalize(query)])
else:
result.append("")
return result
sol = Solution()
# Provided example 1
assert sol.spellchecker(
["KiTe","kite","hare","Hare"],
["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"]
) == ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]
# Provided example 2
assert sol.spellchecker(
["yellow"],
["YellOw"]
) == ["yellow"]
# Exact match should override all others
assert sol.spellchecker(
["KiTe"],
["KiTe"]
) == ["KiTe"]
# Case insensitive match
assert sol.spellchecker(
["Yellow"],
["yellow"]
) == ["Yellow"]
# Vowel error match
assert sol.spellchecker(
["YellOw"],
["yollow"]
) == ["YellOw"]
# No match due to missing letters
assert sol.spellchecker(
["YellOw"],
["yllw"]
) == [""]
# No match due to extra letters
assert sol.spellchecker(
["YellOw"],
["yeellow"]
) == [""]
# First occurrence must be preserved
assert sol.spellchecker(
["KiTe", "kite"],
["KITE"]
) == ["KiTe"]
# Multiple vowel variants
assert sol.spellchecker(
["banana"],
["benene", "binini", "bonono"]
) == ["banana", "banana", "banana"]
# Words without vowels
assert sol.spellchecker(
["rhythm"],
["RHYTHM"]
) == ["rhythm"]
# Empty result for unrelated word
assert sol.spellchecker(
["apple"],
["orange"]
) == [""]
print("All tests passed!")
| Test | Why |
|---|---|
| Provided example 1 | Validates all precedence rules together |
| Provided example 2 | Validates case-insensitive matching |
| Exact match override | Ensures highest priority rule works |
| Case insensitive match | Confirms lowercase normalization |
| Vowel error match | Confirms vowel normalization |
| Missing letters | Ensures length differences do not match |
| Extra letters | Prevents incorrect partial matches |
| First occurrence preservation | Validates insertion logic in maps |
| Multiple vowel variants | Confirms all vowels are interchangeable |
| Words without vowels | Ensures consonant-only words work |
| Unrelated word | Ensures empty string is returned correctly |
Edge Cases
Multiple Words Mapping to the Same Lowercase Form
A common source of bugs is overwriting earlier entries in the case-insensitive map.
For example:
wordlist = ["KiTe", "kite"]
Both normalize to "kite" in lowercase form. The problem requires returning the first occurrence, which is "KiTe".
The implementation handles this correctly by only inserting into the map if the key does not already exist.
Exact Match Must Override Other Matches
Consider:
wordlist = ["KiTe"]
query = "KiTe"
This query also matches under case-insensitive and vowel-error rules, but the exact match rule has higher priority.
The implementation checks the exact match set first, guaranteeing the correct precedence order.
Words with Different Lengths
A naive implementation might incorrectly treat words with missing or extra letters as vowel matches.
For example:
wordlist = ["YellOw"]
query = "yllw"
Even though the consonants align somewhat, the words are not equivalent because characters are missing.
The normalization function preserves word length and consonant positions, so these words produce different normalized forms and correctly fail to match.