LeetCode 411 - Minimum Unique Word Abbreviation
This problem asks us to generate the shortest possible abbreviation for a given target word such that the abbreviation cannot also represent any word in the dictionary. An abbreviation replaces one or more non-adjacent substrings with their lengths.
Difficulty: 🔴 Hard
Topics: Array, String, Backtracking, Bit Manipulation
Solution
LeetCode 411 - Minimum Unique Word Abbreviation
Problem Understanding
This problem asks us to generate the shortest possible abbreviation for a given target word such that the abbreviation cannot also represent any word in the dictionary.
An abbreviation replaces one or more non-adjacent substrings with their lengths. For example, the word "apple" can become:
"5"because the whole word is replaced"a4"because"pple"is replaced"1p3"because the first character is replaced,"p"is kept, and the last three characters are replaced
The important rule is that replaced sections cannot be adjacent. That means abbreviations like "s55n" are invalid because the two numeric parts would represent adjacent omitted regions.
The input consists of:
target, the word we want to abbreviatedictionary, a list of words
The output must be:
- A valid abbreviation of
target - The shortest possible abbreviation
- An abbreviation that does not also match any dictionary word
Two words share an abbreviation if the kept characters appear in the same positions and the omitted segments can account for the remaining characters.
The constraints are extremely important here:
target.length <= 21log2(n) + m <= 21
The small target length strongly suggests that bitmasking over character positions is feasible. Since 2^21 is about two million, enumerating masks becomes possible with pruning and optimization.
Another key observation is that dictionary words with different lengths can never conflict with abbreviations of the target. Only dictionary words with the same length matter.
Important edge cases include:
- Empty dictionary, where the shortest abbreviation is simply the length of the target
- Dictionary words of different lengths, which should be ignored
- Multiple shortest answers, where returning any valid one is acceptable
- Target length of 1, where the answer may need to preserve the only character
- Cases where every highly compressed abbreviation conflicts, forcing us to preserve more letters
Approaches
Brute Force Approach
A brute force solution would generate every possible abbreviation of the target string and test whether each abbreviation conflicts with any dictionary word.
Since each character can either be kept or omitted, there are 2^m possible masks for a target of length m.
For every mask:
- Construct the abbreviation
- Compare it against every dictionary word
- Determine whether the abbreviation could also represent that dictionary word
This works because it exhaustively checks all possibilities and guarantees that the shortest valid abbreviation is found.
However, the implementation becomes very expensive. Checking abbreviation equivalence directly for every generated abbreviation involves repeated parsing and matching operations.
The brute force complexity grows rapidly because:
- There are
2^mmasks - Each mask may need checking against up to
ndictionary words - Each check may take
O(m)
This leads to roughly O(2^m * n * m) time.
Key Insight for the Optimal Solution
The crucial insight is that an abbreviation is uniquely determined by the positions we keep.
Suppose we keep characters at certain positions and abbreviate everything else. A dictionary word conflicts with our abbreviation if all kept positions contain the same characters as the target.
This naturally leads to bitmasking.
For each dictionary word of the same length:
- Compute a difference mask
- A bit is
1if the dictionary word differs from the target at that position
Now consider a candidate abbreviation mask:
1means we keep the character0means we abbreviate it
The abbreviation distinguishes the target from a dictionary word if:
candidate_mask & difference_mask != 0
This means at least one kept position exposes a differing character.
So the problem becomes:
Find the mask with minimum abbreviation length such that it intersects every dictionary difference mask.
This transforms the problem into a compact bitmask search problem.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^m * n * m) | O(m) | Generates all abbreviations and explicitly validates them |
| Optimal | O(2^m * n) | O(n) | Uses bitmask pruning and difference masks |
Algorithm Walkthrough
Step 1: Filter Dictionary Words
Only dictionary words with the same length as the target can possibly share an abbreviation.
For example:
target = "apple"
dictionary = ["blade", "orange"]
Only "blade" matters because "orange" has length 6.
This immediately reduces unnecessary work.
Step 2: Build Difference Masks
For every remaining dictionary word:
- Compare it with the target character by character
- Set a bit whenever the characters differ
Example:
target = apple
word = blade
Character comparison:
| Position | Target | Word | Different |
|---|---|---|---|
| 0 | a | b | yes |
| 1 | p | l | yes |
| 2 | p | a | yes |
| 3 | l | d | yes |
| 4 | e | e | no |
Difference mask:
11110
This means positions 0,1,2,3 differ.
Step 3: Enumerate Candidate Masks
A candidate mask determines which positions are preserved.
Example:
00100
means:
- Keep only index 2
- Abbreviate everything else
For "apple" this becomes:
2p2
Step 4: Check Whether a Mask Is Unique
For every dictionary difference mask:
candidate_mask & difference_mask
If the result is zero, then every preserved position matches the dictionary word, meaning the abbreviation is ambiguous.
A valid abbreviation must satisfy:
(candidate_mask & diff) != 0
for every dictionary word.
Step 5: Compute Abbreviation Length
The abbreviation length is not the number of bits.
Consecutive omitted positions merge into one number.
For example:
mask = 10101
produces:
a1p1e
Length:
- 3 letters
- 2 numeric segments
Total = 5.
We compute this by scanning the mask and counting:
- Preserved letters
- Transitions into omitted regions
Step 6: Track the Best Mask
While enumerating masks:
- Compute abbreviation length
- Skip masks worse than the current best
- Validate uniqueness
- Store the best valid mask
Step 7: Convert Mask Into Abbreviation
Finally, reconstruct the abbreviation string:
- Kept bits become letters
- Consecutive omitted sections become counts
Example:
target = apple
mask = 00100
Output:
2p2
Why it works
Every abbreviation corresponds exactly to a set of preserved positions. A dictionary word conflicts with the abbreviation only if it matches the target at every preserved position. Therefore, requiring every dictionary difference mask to intersect the candidate mask guarantees uniqueness. Enumerating all masks ensures we eventually examine the optimal solution, and minimizing abbreviation length guarantees the shortest valid abbreviation is returned.
Python Solution
from typing import List
class Solution:
def minAbbreviation(self, target: str, dictionary: List[str]) -> str:
n = len(target)
dictionary = [word for word in dictionary if len(word) == n]
if not dictionary:
return str(n)
diff_masks = []
for word in dictionary:
mask = 0
for i in range(n):
if target[i] != word[i]:
mask |= (1 << i)
diff_masks.append(mask)
def abbreviation_length(mask: int) -> int:
length = 0
i = 0
while i < n:
if mask & (1 << i):
length += 1
i += 1
else:
length += 1
while i < n and not (mask & (1 << i)):
i += 1
return length
def is_unique(mask: int) -> bool:
for diff in diff_masks:
if (mask & diff) == 0:
return False
return True
best_mask = 0
best_length = float("inf")
for mask in range(1 << n):
current_length = abbreviation_length(mask)
if current_length >= best_length:
continue
if is_unique(mask):
best_mask = mask
best_length = current_length
result = []
count = 0
for i in range(n):
if best_mask & (1 << i):
if count > 0:
result.append(str(count))
count = 0
result.append(target[i])
else:
count += 1
if count > 0:
result.append(str(count))
return "".join(result)
The implementation begins by filtering dictionary words to only those with the same length as the target. This is safe because abbreviations preserve total length structure.
Next, the code constructs difference masks. Each bit represents whether the dictionary word differs from the target at that position.
The abbreviation_length helper computes the true abbreviation length. Consecutive omitted positions form one numeric block, so the function counts transitions into omitted regions rather than individual omitted characters.
The is_unique helper checks whether the current mask distinguishes the target from every dictionary word. If any dictionary word has no overlap with the preserved positions, the abbreviation is ambiguous.
The main loop enumerates all masks from 0 to 2^n - 1. Masks that already exceed the current best abbreviation length are skipped early.
Finally, the best mask is converted into the final abbreviation string.
Go Solution
package main
import (
"strconv"
"strings"
)
func minAbbreviation(target string, dictionary []string) string {
n := len(target)
filtered := []string{}
for _, word := range dictionary {
if len(word) == n {
filtered = append(filtered, word)
}
}
if len(filtered) == 0 {
return strconv.Itoa(n)
}
diffMasks := []int{}
for _, word := range filtered {
mask := 0
for i := 0; i < n; i++ {
if target[i] != word[i] {
mask |= (1 << i)
}
}
diffMasks = append(diffMasks, mask)
}
abbreviationLength := func(mask int) int {
length := 0
i := 0
for i < n {
if (mask & (1 << i)) != 0 {
length++
i++
} else {
length++
for i < n && (mask&(1<<i)) == 0 {
i++
}
}
}
return length
}
isUnique := func(mask int) bool {
for _, diff := range diffMasks {
if (mask & diff) == 0 {
return false
}
}
return true
}
bestMask := 0
bestLength := 1 << 30
for mask := 0; mask < (1 << n); mask++ {
currentLength := abbreviationLength(mask)
if currentLength >= bestLength {
continue
}
if isUnique(mask) {
bestMask = mask
bestLength = currentLength
}
}
var builder strings.Builder
count := 0
for i := 0; i < n; i++ {
if (bestMask & (1 << i)) != 0 {
if count > 0 {
builder.WriteString(strconv.Itoa(count))
count = 0
}
builder.WriteByte(target[i])
} else {
count++
}
}
if count > 0 {
builder.WriteString(strconv.Itoa(count))
}
return builder.String()
}
The Go implementation follows the same algorithmic structure as the Python version.
A strings.Builder is used for efficient string construction during abbreviation reconstruction. Bit operations are performed using standard integer masks, which is safe because the maximum target length is only 21 bits.
Go slices are used for storing filtered dictionary words and difference masks. Since Go does not have Python's dynamic closures with automatic variable capture behavior, helper functions are implemented as local anonymous functions.
Worked Examples
Example 1
target = "apple"
dictionary = ["blade"]
Step 1: Build Difference Mask
Compare "apple" and "blade":
| Position | Target | Dictionary | Different |
|---|---|---|---|
| 0 | a | b | yes |
| 1 | p | l | yes |
| 2 | p | a | yes |
| 3 | l | d | yes |
| 4 | e | e | no |
Difference mask:
11110
Step 2: Try Candidate Masks
| Mask | Abbreviation | Length | Unique? |
|---|---|---|---|
| 00000 | 5 | 1 | No |
| 00001 | a4 | 2 | Yes |
| 10000 | 4e | 2 | No |
The first valid shortest abbreviation is:
a4
Example 2
target = "apple"
dictionary = ["blade", "plain", "amber"]
Difference Masks
For "blade":
11110
For "plain":
10111
For "amber":
11011
Candidate Enumeration
| Mask | Abbreviation | Valid? |
|---|---|---|
| 00000 | 5 | No |
| 00001 | a4 | No |
| 10000 | 4e | No |
| 00010 | 1p3 | Yes |
The answer becomes:
1p3
because the preserved 'p' distinguishes the target from every dictionary word.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(2^m * n) | Enumerates all masks and validates against dictionary masks |
| Space | O(n) | Stores difference masks |
The target length is at most 21, which makes exhaustive bitmask enumeration feasible. Each candidate mask is checked against all dictionary difference masks. The abbreviation length computation is O(m), but since m <= 21, it behaves like a small constant factor in practice.
Test Cases
from typing import List
class Solution:
def minAbbreviation(self, target: str, dictionary: List[str]) -> str:
n = len(target)
dictionary = [word for word in dictionary if len(word) == n]
if not dictionary:
return str(n)
diff_masks = []
for word in dictionary:
mask = 0
for i in range(n):
if target[i] != word[i]:
mask |= (1 << i)
diff_masks.append(mask)
def abbreviation_length(mask: int) -> int:
length = 0
i = 0
while i < n:
if mask & (1 << i):
length += 1
i += 1
else:
length += 1
while i < n and not (mask & (1 << i)):
i += 1
return length
def is_unique(mask: int) -> bool:
for diff in diff_masks:
if (mask & diff) == 0:
return False
return True
best_mask = 0
best_length = float("inf")
for mask in range(1 << n):
current_length = abbreviation_length(mask)
if current_length >= best_length:
continue
if is_unique(mask):
best_mask = mask
best_length = current_length
result = []
count = 0
for i in range(n):
if best_mask & (1 << i):
if count > 0:
result.append(str(count))
count = 0
result.append(target[i])
else:
count += 1
if count > 0:
result.append(str(count))
return "".join(result)
solution = Solution()
# Provided example 1
assert solution.minAbbreviation("apple", ["blade"]) == "a4"
# Provided example 2
assert solution.minAbbreviation("apple", ["blade", "plain", "amber"]) in {
"1p3",
"2p2",
"3l1",
}
# Empty dictionary
assert solution.minAbbreviation("apple", []) == "5"
# Single character target
assert solution.minAbbreviation("a", ["b"]) == "1"
# Different length dictionary words ignored
assert solution.minAbbreviation("apple", ["orange", "hi"]) == "5"
# Need multiple preserved characters
result = solution.minAbbreviation("abcdef", ["abxdef", "abcxef"])
assert result != "6"
# All positions differ
assert solution.minAbbreviation("abc", ["xyz"]) == "1"
# Preserve ending character
result = solution.minAbbreviation("apple", ["apply"])
assert result in {"4e", "a4", "3l1"}
# Stress small max length behavior
result = solution.minAbbreviation(
"abcdefghijklmnopqrstu",
["bbcdefghijklmnopqrstu"]
)
assert len(result) <= 2
Test Summary
| Test | Why |
|---|---|
("apple", ["blade"]) |
Validates the first sample case |
("apple", ["blade", "plain", "amber"]) |
Validates multiple dictionary conflicts |
| Empty dictionary | Ensures immediate shortest abbreviation |
| Single character target | Tests smallest valid target |
| Different length words | Confirms filtering logic |
| Multiple preserved characters | Ensures ambiguity handling |
| Completely different word | Tests minimal distinguishing mask |
| Similar ending character | Tests edge-position preservation |
| Maximum length target | Stresses bitmask scaling |
Edge Cases
One important edge case is an empty dictionary. In this scenario, no abbreviation can conflict with any other word, so the shortest possible abbreviation is simply the target length itself. A naive implementation might still attempt unnecessary searches, but this solution immediately returns str(n).
Another tricky case occurs when dictionary words have different lengths from the target. Such words can never share an abbreviation with the target because abbreviations preserve overall character count structure. The implementation safely filters these words out before processing.
A more subtle edge case happens when many abbreviations conflict and multiple preserved characters are required. For example, if every dictionary word differs at different positions, the solution may need to preserve several letters to uniquely identify the target. The bitmask intersection logic correctly handles this because a candidate abbreviation is valid only if it intersects every dictionary difference mask.
A final edge case involves abbreviations that compress the entire string into one number. For example, "5" for "apple". This abbreviation is extremely short, but it conflicts with every dictionary word of length 5. The algorithm properly rejects it whenever any dictionary difference mask fails to intersect the candidate mask.