LeetCode 966 - Vowel Spellchecker

The problem asks us to build a spellchecker with three levels of matching priority. For every query word, we must search the wordlist and return the best matching word according to a strict precedence order. The first and highest priority rule is an exact match.

LeetCode Problem 966

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String

Solution

Problem Understanding

The problem asks us to build a spellchecker with three levels of matching priority. For every query word, we must search the wordlist and return the best matching word according to a strict precedence order.

The first and highest priority rule is an exact match. If the query exactly matches a word in the wordlist, including the same capitalization, we immediately return that exact word.

If there is no exact match, we try a case-insensitive match. This means words are considered equal if their lowercase forms are the same. When multiple words satisfy this rule, we must return the first one that appeared in the original wordlist.

If there is still no match, we try a vowel-error match. In this rule, vowels are treated as interchangeable. After converting both words to lowercase, every vowel character (a, e, i, o, u) is considered equivalent. For example:

  • "KiTe" and "keti" match because:

  • lowercase forms are "kite" and "keti"

  • replacing vowels with a placeholder produces the same normalized form

Again, if multiple words match, we return the first occurrence from the original wordlist.

If none of the rules produce a match, we return an empty string.

The input consists of:

  • wordlist, the dictionary of valid words
  • queries, the list of words we must correct

The output is a list where each query is replaced by the best matching word according to the precedence rules.

The constraints are relatively small:

  • Up to 5000 words
  • Each word length up to 7

Even though the input size is not enormous, a naive implementation that repeatedly scans the entire wordlist for every query can still become inefficient. Since each query may require several kinds of comparisons, preprocessing the dictionary into efficient lookup structures is the natural optimization.

There are several important edge cases:

  • Multiple words may map to the same lowercase form, such as "KiTe" and "kite". We must keep the first occurrence.
  • Multiple words may map to the same vowel-normalized form. Again, the first occurrence must be preserved.
  • Exact matches must always override case-insensitive or vowel matches.
  • Queries with missing or extra characters cannot match, even if vowels differ.
  • Words with no vowels should still work correctly.

Approaches

Brute Force Approach

The simplest solution is to process each query independently and scan the entire wordlist multiple times.

For every query:

  1. First scan for an exact match.
  2. If none exists, scan again for a case-insensitive match.
  3. If none exists, scan again for a vowel-normalized match.

To implement vowel matching, we can normalize words by:

  • converting to lowercase
  • replacing every vowel with a common placeholder such as '*'

For example:

  • "KiTe""k*t*"
  • "keti""k*t*"

This brute-force approach is correct because it explicitly checks the rules in priority order and returns the first valid match.

However, it is inefficient because every query may require scanning the entire wordlist multiple times. With up to 5000 queries and 5000 dictionary words, this becomes unnecessarily expensive.

Optimal Approach

The key observation is that the matching rules depend on deterministic transformations of the words.

We can preprocess the wordlist into three lookup structures:

  1. A set for exact matches
  2. A hash map from lowercase word → first matching dictionary word
  3. A hash map from vowel-normalized word → first matching dictionary word

Then each query can be answered in constant time lookups.

The crucial detail is preserving the first occurrence from the original wordlist. We achieve this by only inserting into the hash maps if the key does not already exist.

The preprocessing transforms repeated scanning into efficient hash lookups.

Approach Time Complexity Space Complexity Notes
Brute Force O(Q × W × L) O(1) Repeatedly scans the wordlist for every query
Optimal O(W × L + Q × L) O(W × L) Uses preprocessing with hash maps for fast lookup

Where:

  • Q = number of queries
  • W = number of words in wordlist
  • L = maximum word length

Algorithm Walkthrough

Step 1: Create an Exact Match Set

We store every word from wordlist in a hash set.

This allows constant time checking for the highest-priority rule, exact case-sensitive matches.

For example:

{"KiTe", "kite", "hare", "Hare"}

Step 2: Build a Case-Insensitive Map

For every word in the wordlist:

  1. Convert it to lowercase
  2. Store:
  • lowercase word → original word

We only insert if the lowercase key does not already exist.

This preserves the first occurrence rule.

Example:

Word Lowercase Key Stored Value
KiTe kite KiTe
kite kite ignored
hare hare hare
Hare hare ignored

Final map:

{
    "kite": "KiTe",
    "hare": "hare"
}

Step 3: Build a Vowel-Normalized Map

We define a normalization function:

  1. Convert the word to lowercase
  2. Replace every vowel with '*'

Examples:

Word Normalized
KiTe k_t_
keto k_t_
Hare h_r_

Again, we only store the first occurrence.

Example map:

{
    "k*t*": "KiTe",
    "h*r*": "hare"
}

Step 4: Process Each Query

For every query:

  1. Check exact match
  2. Check lowercase match
  3. Check vowel-normalized match
  4. Otherwise return ""

The order is critical because the problem specifies strict precedence rules.

Step 5: Collect Results

Append the chosen result for each query into the answer list.

Why it works

The algorithm works because every matching rule corresponds to a deterministic transformation:

  • Exact matching uses the original word
  • Case-insensitive matching uses lowercase normalization
  • Vowel-error matching uses vowel normalization

By preprocessing the dictionary into hash maps keyed by these normalized forms, we can directly retrieve the correct first matching word. Since queries are checked in strict priority order, the returned result always follows the required precedence rules.

Python Solution

from typing import List

class Solution:
    def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
        vowels = set("aeiou")

        def normalize(word: str) -> str:
            chars = []
            
            for ch in word.lower():
                if ch in vowels:
                    chars.append("*")
                else:
                    chars.append(ch)

            return "".join(chars)

        exact_words = set(wordlist)

        case_map = {}
        vowel_map = {}

        for word in wordlist:
            lower_word = word.lower()
            normalized_word = normalize(word)

            if lower_word not in case_map:
                case_map[lower_word] = word

            if normalized_word not in vowel_map:
                vowel_map[normalized_word] = word

        result = []

        for query in queries:
            if query in exact_words:
                result.append(query)
                continue

            lower_query = query.lower()

            if lower_query in case_map:
                result.append(case_map[lower_query])
                continue

            normalized_query = normalize(query)

            if normalized_query in vowel_map:
                result.append(vowel_map[normalized_query])
                continue

            result.append("")

        return result

The implementation begins by defining a normalization helper function. This function converts a word to lowercase and replaces every vowel with '*'. Using a shared placeholder ensures that all vowel variations map to the same normalized representation.

The exact_words set supports constant time exact matching.

The case_map dictionary stores the first word associated with each lowercase form. The condition:

if lower_word not in case_map:

is essential because the problem requires preserving the first matching word from the original wordlist.

The vowel_map dictionary works similarly, but uses the vowel-normalized representation as the key.

When processing queries, the implementation checks the rules in exact precedence order:

  1. Exact match
  2. Case-insensitive match
  3. Vowel-error match

As soon as a match is found, the corresponding word is appended to the result list.

Go Solution

package main

import "strings"

func spellchecker(wordlist []string, queries []string) []string {
	vowels := map[byte]bool{
		'a': true,
		'e': true,
		'i': true,
		'o': true,
		'u': true,
	}

	normalize := func(word string) string {
		word = strings.ToLower(word)

		chars := []byte(word)

		for i := 0; i < len(chars); i++ {
			if vowels[chars[i]] {
				chars[i] = '*'
			}
		}

		return string(chars)
	}

	exactWords := make(map[string]bool)

	caseMap := make(map[string]string)
	vowelMap := make(map[string]string)

	for _, word := range wordlist {
		exactWords[word] = true

		lowerWord := strings.ToLower(word)
		normalizedWord := normalize(word)

		if _, exists := caseMap[lowerWord]; !exists {
			caseMap[lowerWord] = word
		}

		if _, exists := vowelMap[normalizedWord]; !exists {
			vowelMap[normalizedWord] = word
		}
	}

	result := make([]string, 0, len(queries))

	for _, query := range queries {
		if exactWords[query] {
			result = append(result, query)
			continue
		}

		lowerQuery := strings.ToLower(query)

		if word, exists := caseMap[lowerQuery]; exists {
			result = append(result, word)
			continue
		}

		normalizedQuery := normalize(query)

		if word, exists := vowelMap[normalizedQuery]; exists {
			result = append(result, word)
			continue
		}

		result = append(result, "")
	}

	return result
}

The Go solution follows the same algorithmic structure as the Python version.

A few Go-specific details are worth noting:

  • Go does not have a built-in set type, so map[string]bool is used for exact matches.
  • Strings are immutable in Go, so normalization converts the string into a byte slice before modifying characters.
  • The exists idiom is used to preserve first occurrences in the hash maps.
  • The result slice is initialized with capacity equal to the number of queries for efficiency.

Worked Examples

Example 1

wordlist = ["KiTe","kite","hare","Hare"]

queries = [
    "kite","Kite","KiTe","Hare",
    "HARE","Hear","hear","keti",
    "keet","keto"
]

Preprocessing

Exact Set

Word
KiTe
kite
hare
Hare

Case Map

Lowercase Stored Word
kite KiTe
hare hare

Vowel Map

Normalized Stored Word
k_t_ KiTe
h_r_ hare

Query Processing

Query Exact Match Case Match Vowel Match Result
kite yes - - kite
Kite no KiTe - KiTe
KiTe yes - - KiTe
Hare yes - - Hare
HARE no hare - hare
Hear no no no ""
hear no no no ""
keti no no KiTe KiTe
keet no no no ""
keto no no KiTe KiTe

Final output:

["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

Example 2

wordlist = ["yellow"]
queries = ["YellOw"]

Preprocessing

Exact Set

{"yellow"}

Case Map

Lowercase Stored Word
yellow yellow

Vowel Map

Normalized Stored Word
y_ll_w yellow

Query Processing

Query Exact Match Case Match Result
YellOw no yellow yellow

Final output:

["yellow"]

Complexity Analysis

Measure Complexity Explanation
Time O(W × L + Q × L) Each word and query is processed a constant number of times
Space O(W × L) Hash maps and sets store transformed versions of the words

The preprocessing phase processes every dictionary word once, and each normalization takes O(L) time where L is the word length.

Each query also performs a constant number of hash lookups and at most one normalization operation, resulting in O(L) time per query.

The space usage comes from storing:

  • the exact set
  • the lowercase map
  • the vowel-normalized map

All together, these structures store at most one transformed representation per word.

Test Cases

from typing import List

class Solution:
    def spellchecker(self, wordlist: List[str], queries: List[str]) -> List[str]:
        vowels = set("aeiou")

        def normalize(word: str) -> str:
            chars = []

            for ch in word.lower():
                if ch in vowels:
                    chars.append("*")
                else:
                    chars.append(ch)

            return "".join(chars)

        exact_words = set(wordlist)

        case_map = {}
        vowel_map = {}

        for word in wordlist:
            lower_word = word.lower()
            normalized_word = normalize(word)

            if lower_word not in case_map:
                case_map[lower_word] = word

            if normalized_word not in vowel_map:
                vowel_map[normalized_word] = word

        result = []

        for query in queries:
            if query in exact_words:
                result.append(query)
            elif query.lower() in case_map:
                result.append(case_map[query.lower()])
            elif normalize(query) in vowel_map:
                result.append(vowel_map[normalize(query)])
            else:
                result.append("")

        return result

sol = Solution()

# Provided example 1
assert sol.spellchecker(
    ["KiTe","kite","hare","Hare"],
    ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"]
) == ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

# Provided example 2
assert sol.spellchecker(
    ["yellow"],
    ["YellOw"]
) == ["yellow"]

# Exact match should override all others
assert sol.spellchecker(
    ["KiTe"],
    ["KiTe"]
) == ["KiTe"]

# Case insensitive match
assert sol.spellchecker(
    ["Yellow"],
    ["yellow"]
) == ["Yellow"]

# Vowel error match
assert sol.spellchecker(
    ["YellOw"],
    ["yollow"]
) == ["YellOw"]

# No match due to missing letters
assert sol.spellchecker(
    ["YellOw"],
    ["yllw"]
) == [""]

# No match due to extra letters
assert sol.spellchecker(
    ["YellOw"],
    ["yeellow"]
) == [""]

# First occurrence must be preserved
assert sol.spellchecker(
    ["KiTe", "kite"],
    ["KITE"]
) == ["KiTe"]

# Multiple vowel variants
assert sol.spellchecker(
    ["banana"],
    ["benene", "binini", "bonono"]
) == ["banana", "banana", "banana"]

# Words without vowels
assert sol.spellchecker(
    ["rhythm"],
    ["RHYTHM"]
) == ["rhythm"]

# Empty result for unrelated word
assert sol.spellchecker(
    ["apple"],
    ["orange"]
) == [""]

print("All tests passed!")
Test Why
Provided example 1 Validates all precedence rules together
Provided example 2 Validates case-insensitive matching
Exact match override Ensures highest priority rule works
Case insensitive match Confirms lowercase normalization
Vowel error match Confirms vowel normalization
Missing letters Ensures length differences do not match
Extra letters Prevents incorrect partial matches
First occurrence preservation Validates insertion logic in maps
Multiple vowel variants Confirms all vowels are interchangeable
Words without vowels Ensures consonant-only words work
Unrelated word Ensures empty string is returned correctly

Edge Cases

Multiple Words Mapping to the Same Lowercase Form

A common source of bugs is overwriting earlier entries in the case-insensitive map.

For example:

wordlist = ["KiTe", "kite"]

Both normalize to "kite" in lowercase form. The problem requires returning the first occurrence, which is "KiTe".

The implementation handles this correctly by only inserting into the map if the key does not already exist.

Exact Match Must Override Other Matches

Consider:

wordlist = ["KiTe"]
query = "KiTe"

This query also matches under case-insensitive and vowel-error rules, but the exact match rule has higher priority.

The implementation checks the exact match set first, guaranteeing the correct precedence order.

Words with Different Lengths

A naive implementation might incorrectly treat words with missing or extra letters as vowel matches.

For example:

wordlist = ["YellOw"]
query = "yllw"

Even though the consonants align somewhat, the words are not equivalent because characters are missing.

The normalization function preserves word length and consonant positions, so these words produce different normalized forms and correctly fail to match.