LeetCode 966 - Vowel Spellchecker

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String

Solution

Problem Understanding

The problem asks us to build a spellchecker with three levels of matching priority. For every query word, we must search the wordlist and return the best matching word according to a strict precedence order.

The first and highest priority rule is an exact match. If the query exactly matches a word in the wordlist, including the same capitalization, we immediately return that exact word.

If there is no exact match, we try a case-insensitive match. This means words are considered equal if their lowercase forms are the same. When multiple words satisfy this rule, we must return the first one that appeared in the original wordlist.

If there is still no match, we try a vowel-error match. In this rule, vowels are treated as interchangeable. After converting both words to lowercase, every vowel character (a, e, i, o, u) is considered equivalent. For example:

"KiTe" and "keti" match because:
lowercase forms are "kite" and "keti"
replacing vowels with a placeholder produces the same normalized form

Again, if multiple words match, we return the first occurrence from the original wordlist.

If none of the rules produce a match, we return an empty string.

The input consists of:

wordlist, the dictionary of valid words
queries, the list of words we must correct

The output is a list where each query is replaced by the best matching word according to the precedence rules.

The constraints are relatively small:

Up to 5000 words
Each word length up to 7

Even though the input size is not enormous, a naive implementation that repeatedly scans the entire wordlist for every query can still become inefficient. Since each query may require several kinds of comparisons, preprocessing the dictionary into efficient lookup structures is the natural optimization.

There are several important edge cases:

Multiple words may map to the same lowercase form, such as "KiTe" and "kite". We must keep the first occurrence.
Multiple words may map to the same vowel-normalized form. Again, the first occurrence must be preserved.
Exact matches must always override case-insensitive or vowel matches.
Queries with missing or extra characters cannot match, even if vowels differ.
Words with no vowels should still work correctly.

Approaches

Brute Force Approach

The simplest solution is to process each query independently and scan the entire wordlist multiple times.

For every query:

First scan for an exact match.
If none exists, scan again for a case-insensitive match.
If none exists, scan again for a vowel-normalized match.

To implement vowel matching, we can normalize words by:

converting to lowercase
replacing every vowel with a common placeholder such as '*'

For example:

"KiTe" → "k*t*"
"keti" → "k*t*"

This brute-force approach is correct because it explicitly checks the rules in priority order and returns the first valid match.

However, it is inefficient because every query may require scanning the entire wordlist multiple times. With up to 5000 queries and 5000 dictionary words, this becomes unnecessarily expensive.

Optimal Approach

The key observation is that the matching rules depend on deterministic transformations of the words.

We can preprocess the wordlist into three lookup structures:

A set for exact matches
A hash map from lowercase word → first matching dictionary word
A hash map from vowel-normalized word → first matching dictionary word

Then each query can be answered in constant time lookups.

The crucial detail is preserving the first occurrence from the original wordlist. We achieve this by only inserting into the hash maps if the key does not already exist.

The preprocessing transforms repeated scanning into efficient hash lookups.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(Q × W × L)	O(1)	Repeatedly scans the wordlist for every query
Optimal	O(W × L + Q × L)	O(W × L)	Uses preprocessing with hash maps for fast lookup

Where:

Q = number of queries
W = number of words in wordlist
L = maximum word length

Algorithm Walkthrough

Step 1: Create an Exact Match Set

We store every word from wordlist in a hash set.

This allows constant time checking for the highest-priority rule, exact case-sensitive matches.

For example:

{"KiTe", "kite", "hare", "Hare"}

Step 2: Build a Case-Insensitive Map

For every word in the wordlist:

Convert it to lowercase
Store:

lowercase word → original word

We only insert if the lowercase key does not already exist.

This preserves the first occurrence rule.

Example:

Word	Lowercase Key	Stored Value
KiTe	kite	KiTe
kite	kite	ignored
hare	hare	hare
Hare	hare	ignored

Final map:

{
    "kite": "KiTe",
    "hare": "hare"
}

Step 3: Build a Vowel-Normalized Map

We define a normalization function:

Convert the word to lowercase
Replace every vowel with '*'

Examples:

Word	Normalized
KiTe	k_t_
keto	k_t_
Hare	h_r_

Again, we only store the first occurrence.

Example map:

{
    "k*t*": "KiTe",
    "h*r*": "hare"
}

Step 4: Process Each Query

For every query:

Check exact match
Check lowercase match
Check vowel-normalized match
Otherwise return ""

The order is critical because the problem specifies strict precedence rules.

Step 5: Collect Results

Append the chosen result for each query into the answer list.

Why it works

The algorithm works because every matching rule corresponds to a deterministic transformation:

Exact matching uses the original word
Case-insensitive matching uses lowercase normalization
Vowel-error matching uses vowel normalization

By preprocessing the dictionary into hash maps keyed by these normalized forms, we can directly retrieve the correct first matching word. Since queries are checked in strict priority order, the returned result always follows the required precedence rules.

Python Solution

from typing import List

class Solution:
    def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
        vowels = set("aeiou")

        def normalize(word: str) -> str:
            chars = []
            
            for ch in word.lower():
                if ch in vowels:
                    chars.append("*")
                else:
                    chars.append(ch)

            return "".join(chars)

        exact_words = set(wordlist)

        case_map = {}
        vowel_map = {}

        for word in wordlist:
            lower_word = word.lower()
            normalized_word = normalize(word)

            if lower_word not in case_map:
                case_map[lower_word] = word

            if normalized_word not in vowel_map:
                vowel_map[normalized_word] = word

        result = []

        for query in queries:
            if query in exact_words:
                result.append(query)
                continue

            lower_query = query.lower()

            if lower_query in case_map:
                result.append(case_map[lower_query])
                continue

            normalized_query = normalize(query)

            if normalized_query in vowel_map:
                result.append(vowel_map[normalized_query])
                continue

            result.append("")

        return result

The implementation begins by defining a normalization helper function. This function converts a word to lowercase and replaces every vowel with '*'. Using a shared placeholder ensures that all vowel variations map to the same normalized representation.

The exact_words set supports constant time exact matching.

The case_map dictionary stores the first word associated with each lowercase form. The condition:

if lower_word not in case_map:

is essential because the problem requires preserving the first matching word from the original wordlist.

The vowel_map dictionary works similarly, but uses the vowel-normalized representation as the key.

When processing queries, the implementation checks the rules in exact precedence order:

Exact match
Case-insensitive match
Vowel-error match

As soon as a match is found, the corresponding word is appended to the result list.

Go Solution

package main

import "strings"

func spellchecker(wordlist []string, queries []string) []string {
	vowels := map[byte]bool{
		'a': true,
		'e': true,
		'i': true,
		'o': true,
		'u': true,
	}

	normalize := func(word string) string {
		word = strings.ToLower(word)

		chars := []byte(word)

		for i := 0; i < len(chars); i++ {
			if vowels[chars[i]] {
				chars[i] = '*'
			}
		}

		return string(chars)
	}

	exactWords := make(map[string]bool)

	caseMap := make(map[string]string)
	vowelMap := make(map[string]string)

	for _, word := range wordlist {
		exactWords[word] = true

		lowerWord := strings.ToLower(word)
		normalizedWord := normalize(word)

		if _, exists := caseMap[lowerWord]; !exists {
			caseMap[lowerWord] = word
		}

		if _, exists := vowelMap[normalizedWord]; !exists {
			vowelMap[normalizedWord] = word
		}
	}

	result := make([]string, 0, len(queries))

	for _, query := range queries {
		if exactWords[query] {
			result = append(result, query)
			continue
		}

		lowerQuery := strings.ToLower(query)

		if word, exists := caseMap[lowerQuery]; exists {
			result = append(result, word)
			continue
		}

		normalizedQuery := normalize(query)

		if word, exists := vowelMap[normalizedQuery]; exists {
			result = append(result, word)
			continue
		}

		result = append(result, "")
	}

	return result
}

The Go solution follows the same algorithmic structure as the Python version.

A few Go-specific details are worth noting:

Go does not have a built-in set type, so map[string]bool is used for exact matches.
Strings are immutable in Go, so normalization converts the string into a byte slice before modifying characters.
The exists idiom is used to preserve first occurrences in the hash maps.
The result slice is initialized with capacity equal to the number of queries for efficiency.

Worked Examples

Example 1

wordlist = ["KiTe","kite","hare","Hare"]

queries = [
    "kite","Kite","KiTe","Hare",
    "HARE","Hear","hear","keti",
    "keet","keto"
]

Preprocessing

Exact Set

Word
KiTe
kite
hare
Hare

Case Map

Lowercase	Stored Word
kite	KiTe
hare	hare

Vowel Map

Normalized	Stored Word
k_t_	KiTe
h_r_	hare

Query Processing

Query	Exact Match	Case Match	Vowel Match	Result
kite	yes	-	-	kite
Kite	no	KiTe	-	KiTe
KiTe	yes	-	-	KiTe
Hare	yes	-	-	Hare
HARE	no	hare	-	hare
Hear	no	no	no	""
hear	no	no	no	""
keti	no	no	KiTe	KiTe
keet	no	no	no	""
keto	no	no	KiTe	KiTe

Final output:

["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

Example 2

wordlist = ["yellow"]
queries = ["YellOw"]

Preprocessing

Exact Set

{"yellow"}

Case Map

Lowercase	Stored Word
yellow	yellow

Vowel Map

Normalized	Stored Word
y_ll_w	yellow

Query Processing

Query	Exact Match	Case Match	Result
YellOw	no	yellow	yellow

Final output:

["yellow"]

Complexity Analysis

Measure	Complexity	Explanation
Time	O(W × L + Q × L)	Each word and query is processed a constant number of times
Space	O(W × L)	Hash maps and sets store transformed versions of the words

The preprocessing phase processes every dictionary word once, and each normalization takes O(L) time where L is the word length.

Each query also performs a constant number of hash lookups and at most one normalization operation, resulting in O(L) time per query.

The space usage comes from storing:

the exact set
the lowercase map
the vowel-normalized map

All together, these structures store at most one transformed representation per word.

Test Cases

from typing import List

class Solution:
    def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
        vowels = set("aeiou")

        def normalize(word: str) -> str:
            chars = []

            for ch in word.lower():
                if ch in vowels:
                    chars.append("*")
                else:
                    chars.append(ch)

            return "".join(chars)

        exact_words = set(wordlist)

        case_map = {}
        vowel_map = {}

        for word in wordlist:
            lower_word = word.lower()
            normalized_word = normalize(word)

            if lower_word not in case_map:
                case_map[lower_word] = word

            if normalized_word not in vowel_map:
                vowel_map[normalized_word] = word

        result = []

        for query in queries:
            if query in exact_words:
                result.append(query)
            elif query.lower() in case_map:
                result.append(case_map[query.lower()])
            elif normalize(query) in vowel_map:
                result.append(vowel_map[normalize(query)])
            else:
                result.append("")

        return result

sol = Solution()

# Provided example 1
assert sol.spellchecker(
    ["KiTe","kite","hare","Hare"],
    ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"]
) == ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

# Provided example 2
assert sol.spellchecker(
    ["yellow"],
    ["YellOw"]
) == ["yellow"]

# Exact match should override all others
assert sol.spellchecker(
    ["KiTe"],
    ["KiTe"]
) == ["KiTe"]

# Case insensitive match
assert sol.spellchecker(
    ["Yellow"],
    ["yellow"]
) == ["Yellow"]

# Vowel error match
assert sol.spellchecker(
    ["YellOw"],
    ["yollow"]
) == ["YellOw"]

# No match due to missing letters
assert sol.spellchecker(
    ["YellOw"],
    ["yllw"]
) == [""]

# No match due to extra letters
assert sol.spellchecker(
    ["YellOw"],
    ["yeellow"]
) == [""]

# First occurrence must be preserved
assert sol.spellchecker(
    ["KiTe", "kite"],
    ["KITE"]
) == ["KiTe"]

# Multiple vowel variants
assert sol.spellchecker(
    ["banana"],
    ["benene", "binini", "bonono"]
) == ["banana", "banana", "banana"]

# Words without vowels
assert sol.spellchecker(
    ["rhythm"],
    ["RHYTHM"]
) == ["rhythm"]

# Empty result for unrelated word
assert sol.spellchecker(
    ["apple"],
    ["orange"]
) == [""]

print("All tests passed!")

Test	Why
Provided example 1	Validates all precedence rules together
Provided example 2	Validates case-insensitive matching
Exact match override	Ensures highest priority rule works
Case insensitive match	Confirms lowercase normalization
Vowel error match	Confirms vowel normalization
Missing letters	Ensures length differences do not match
Extra letters	Prevents incorrect partial matches
First occurrence preservation	Validates insertion logic in maps
Multiple vowel variants	Confirms all vowels are interchangeable
Words without vowels	Ensures consonant-only words work
Unrelated word	Ensures empty string is returned correctly

Edge Cases

Multiple Words Mapping to the Same Lowercase Form

A common source of bugs is overwriting earlier entries in the case-insensitive map.

For example:

wordlist = ["KiTe", "kite"]

Both normalize to "kite" in lowercase form. The problem requires returning the first occurrence, which is "KiTe".

The implementation handles this correctly by only inserting into the map if the key does not already exist.

Exact Match Must Override Other Matches

Consider:

wordlist = ["KiTe"]
query = "KiTe"

This query also matches under case-insensitive and vowel-error rules, but the exact match rule has higher priority.

The implementation checks the exact match set first, guaranteeing the correct precedence order.

Words with Different Lengths

A naive implementation might incorrectly treat words with missing or extra letters as vowel matches.

For example:

wordlist = ["YellOw"]
query = "yllw"

Even though the consonants align somewhat, the words are not equivalent because characters are missing.

The normalization function preserves word length and consonant positions, so these words produce different normalized forms and correctly fail to match.