LeetCode 288 - Unique Word Abbreviation
The problem asks us to design a data structure that determines whether a word's abbreviation is unique within a given dictionary.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Design
Solution
Problem Understanding
The problem asks us to design a data structure that determines whether a word's abbreviation is unique within a given dictionary.
An abbreviation is defined using three parts:
- The first character
- The number of characters between the first and last character
- The last character
For example:
"dog"becomes"d1g""internationalization"becomes"i18n""it"remains"it"because words with length less than or equal to 2 are not shortened
The ValidWordAbbr class is initialized with a dictionary of words. After initialization, we must answer queries using isUnique(word).
A word is considered unique if either:
- No dictionary word shares the same abbreviation
- Every dictionary word that shares the abbreviation is exactly the same word
This second condition is extremely important. If the dictionary contains "cake" and we query "cake", the answer should still be true as long as no other distinct word shares the abbreviation "c2e".
The input size gives us important design constraints:
- Up to
3 * 10^4dictionary words - Up to
5000calls toisUnique
A naive scan of the entire dictionary for every query would become expensive. Since abbreviation computation is simple and repeated often, preprocessing the dictionary into a hash-based structure is the natural optimization.
Several edge cases are important:
- Duplicate words may appear in the dictionary
- Very short words of length
1or2abbreviate to themselves - Multiple different words can map to the same abbreviation
- Query words may or may not already exist in the dictionary
Correct handling of duplicates is especially important. If "deer" appears multiple times, it should still behave as a single unique word associated with abbreviation "d2r".
Approaches
Brute Force Approach
The brute-force solution stores the entire dictionary and, for every isUnique(word) query, scans through all dictionary words.
For each dictionary word:
- Compute its abbreviation
- Compute the query word's abbreviation
- Compare the abbreviations
- If the abbreviations match but the words differ, return
false
If the scan finishes without finding a conflicting word, return true.
This approach is correct because it directly implements the problem definition. Every possible conflicting dictionary word is checked explicitly.
However, this solution is inefficient because each query requires scanning the full dictionary. With up to 30000 words and 5000 queries, the worst-case runtime becomes very large.
Optimal Approach
The key insight is that the only thing that matters for uniqueness is the relationship between:
- An abbreviation
- The set of words that produce it
Instead of repeatedly scanning the dictionary, we can preprocess all abbreviations into a hash map.
For every abbreviation, we store the set of dictionary words that map to it.
Then during isUnique(word):
- If the abbreviation does not exist in the map, the word is unique
- If the abbreviation exists and the associated set contains only the same word, the word is unique
- Otherwise, it is not unique
This transforms repeated expensive scans into constant-time hash lookups.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) per query | O(n) | Scans entire dictionary for every lookup |
| Optimal | O(n) preprocessing, O(1) per query | O(n) | Uses hash map from abbreviation to set of words |
Algorithm Walkthrough
Optimal Algorithm
- Create a helper function that computes the abbreviation of a word.
If the word length is less than or equal to 2, return the word itself because short words are never compressed.
Otherwise construct:
- first character
- number of middle characters
- last character
For example:
"door"→"d2r""cake"→"c2e"
- During initialization, build a hash map where:
- key = abbreviation
- value = set of words that share that abbreviation
This preprocessing step groups all dictionary words by abbreviation.
3. When isUnique(word) is called, compute the word's abbreviation.
4. Check whether the abbreviation exists in the hash map.
If it does not exist, no dictionary word shares the abbreviation, so return true.
5. If the abbreviation exists, retrieve the associated set of words.
6. The word is unique only if the set contains exactly one word and that word equals the query word.
This condition handles both:
- duplicate dictionary entries
- words already present in the dictionary
- Otherwise return
falsebecause another distinct word shares the abbreviation.
Why it works
The algorithm works because the hash map completely captures the relationship between abbreviations and dictionary words.
For any queried word, only words with the same abbreviation can possibly violate uniqueness. The map lets us access exactly those words in constant time.
The correctness invariant is:
For every abbreviation, the map stores the complete set of distinct dictionary words that generate it.
Therefore, checking whether the set is empty or contains only the queried word exactly matches the problem definition.
Python Solution
from typing import List
from collections import defaultdict
class ValidWordAbbr:
def __init__(self, dictionary: List[str]):
self.abbr_map = defaultdict(set)
for word in dictionary:
abbreviation = self._get_abbreviation(word)
self.abbr_map[abbreviation].add(word)
def _get_abbreviation(self, word: str) -> str:
if len(word) <= 2:
return word
return word[0] + str(len(word) - 2) + word[-1]
def isUnique(self, word: str) -> bool:
abbreviation = self._get_abbreviation(word)
if abbreviation not in self.abbr_map:
return True
matching_words = self.abbr_map[abbreviation]
return matching_words == {word}
# Your ValidWordAbbr object will be instantiated and called as such:
# obj = ValidWordAbbr(dictionary)
# param_1 = obj.isUnique(word)
The implementation begins by constructing abbr_map, a dictionary that maps each abbreviation to the set of words producing it.
The _get_abbreviation helper centralizes abbreviation logic so it is implemented consistently in both initialization and query handling.
Using a set is important because the dictionary may contain duplicate words. We only care about distinct words associated with an abbreviation.
In isUnique, we first compute the abbreviation for the query word. If the abbreviation does not exist in the map, the word is automatically unique.
If the abbreviation exists, we compare the associated set against {word}. This is a concise and reliable way to verify that the queried word is the only distinct word sharing the abbreviation.
Go Solution
package main
type ValidWordAbbr struct {
abbrMap map[string]map[string]bool
}
func Constructor(dictionary []string) ValidWordAbbr {
abbrMap := make(map[string]map[string]bool)
for _, word := range dictionary {
abbr := getAbbreviation(word)
if _, exists := abbrMap[abbr]; !exists {
abbrMap[abbr] = make(map[string]bool)
}
abbrMap[abbr][word] = true
}
return ValidWordAbbr{
abbrMap: abbrMap,
}
}
func getAbbreviation(word string) string {
if len(word) <= 2 {
return word
}
return string(word[0]) +
string(rune(len(word)-2+'0')) +
string(word[len(word)-1])
}
func (this *ValidWordAbbr) IsUnique(word string) bool {
abbr := getAbbreviation(word)
words, exists := this.abbrMap[abbr]
if !exists {
return true
}
return len(words) == 1 && words[word]
}
/**
* Your ValidWordAbbr object will be instantiated and called as such:
* obj := Constructor(dictionary);
* param_1 := obj.IsUnique(word);
*/
One important Go-specific detail is the representation of sets. Go does not have a built-in set type, so we use map[string]bool.
Another detail is abbreviation construction. Since the middle count may exceed one digit, a production implementation should ideally use strconv.Itoa(len(word) - 2) instead of direct rune conversion. A more robust version is shown below:
return string(word[0]) + strconv.Itoa(len(word)-2) + string(word[len(word)-1])
This handles multi-digit counts correctly, such as "internationalization" becoming "i18n".
Unlike Python's defaultdict, Go requires explicit map initialization before insertion.
Worked Examples
Example 1
Dictionary:
["deer", "door", "cake", "card"]
Initialization Phase
| Word | Abbreviation | Map State |
|---|---|---|
| deer | d2r | d2r → {deer} |
| door | d2r | d2r → {deer, door} |
| cake | c2e | c2e → {cake} |
| card | c2d | c2d → {card} |
Final map:
| Abbreviation | Words |
|---|---|
| d2r | {deer, door} |
| c2e | {cake} |
| c2d | {card} |
Query: isUnique("dear")
Abbreviation:
dear → d2r
Lookup:
| Abbreviation | Stored Words |
|---|---|
| d2r | {deer, door} |
Because the set contains words different from "dear", return:
false
Query: isUnique("cart")
Abbreviation:
cart → c2t
c2t does not exist in the map.
Return:
true
Query: isUnique("cane")
Abbreviation:
cane → c2e
Lookup:
| Abbreviation | Stored Words |
|---|---|
| c2e | {cake} |
Since "cake" differs from "cane", return:
false
Query: isUnique("make")
Abbreviation:
make → m2e
No such abbreviation exists.
Return:
true
Query: isUnique("cake")
Abbreviation:
cake → c2e
Lookup:
| Abbreviation | Stored Words |
|---|---|
| c2e | {cake} |
The only matching word is exactly "cake".
Return:
true
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + q) | O(n) preprocessing and O(1) average per query |
| Space | O(n) | Stores abbreviations and associated word sets |
Let:
n= number of dictionary wordsq= number ofisUniquequeries
Building the hash map requires visiting each dictionary word once, giving O(n) preprocessing time.
Each query performs:
- one abbreviation computation
- one hash lookup
- one small set comparison
All of these are constant-time operations on average.
The space complexity is O(n) because every distinct dictionary word may be stored in the hash map.
Test Cases
from typing import List
# Example test
obj = ValidWordAbbr(["deer", "door", "cake", "card"])
assert obj.isUnique("dear") == False # conflicting abbreviation
assert obj.isUnique("cart") == True # no matching abbreviation
assert obj.isUnique("cane") == False # conflicts with cake
assert obj.isUnique("make") == True # unique abbreviation
assert obj.isUnique("cake") == True # same word only
# Duplicate dictionary entries
obj = ValidWordAbbr(["hello", "hello"])
assert obj.isUnique("hello") == True # duplicates should not matter
assert obj.isUnique("hallo") == False # same abbreviation, different word
# Short words
obj = ValidWordAbbr(["it", "to"])
assert obj.isUnique("it") == True # exact same short word
assert obj.isUnique("at") == True # no abbreviation conflict
# Single-character words
obj = ValidWordAbbr(["a", "b"])
assert obj.isUnique("a") == True # exact match
assert obj.isUnique("c") == True # unique single char
# Multiple conflicts
obj = ValidWordAbbr(["deer", "door", "dare"])
assert obj.isUnique("dear") == False # several conflicting words
# Empty conflict set
obj = ValidWordAbbr([])
assert obj.isUnique("anything") == True # empty dictionary
# Long abbreviation
obj = ValidWordAbbr(["internationalization"])
assert obj.isUnique("interoperability") == True # different abbreviation
assert obj.isUnique("internationalization") == True # exact same word
| Test | Why |
|---|---|
| Standard example | Verifies core functionality |
| Duplicate entries | Ensures duplicates do not create false conflicts |
| Short words | Validates length ≤ 2 handling |
| Single-character words | Tests smallest valid input |
| Multiple conflicts | Ensures many words sharing abbreviation are handled |
| Empty dictionary | Verifies trivial uniqueness |
| Long words | Tests multi-digit abbreviation counts |
Edge Cases
One important edge case is duplicate dictionary entries. If "hello" appears multiple times, we should not treat it as conflicting with itself. A naive implementation using counts instead of distinct words might incorrectly return false. Using a set solves this naturally because duplicate insertions collapse into a single stored word.
Another important edge case involves very short words. Words with length 1 or 2 are not abbreviated according to the normal rule. A buggy implementation might incorrectly convert "it" into "i0t". The helper function explicitly returns the original word unchanged when the length is less than or equal to 2.
A third subtle case occurs when the queried word already exists in the dictionary. For example, if the dictionary contains only "cake", then isUnique("cake") should return true. The implementation handles this by verifying that the abbreviation's associated set equals exactly {word}. This ensures the queried word is the only distinct word sharing the abbreviation.
A final edge case involves abbreviations shared by many distinct words. For example:
deer → d2r
door → d2r
dear → d2r
A naive solution might stop after finding one matching word and incorrectly assume uniqueness. Our implementation stores the full set of conflicting words, ensuring all conflicts are accounted for correctly.