LeetCode 411 - Minimum Unique Word Abbreviation

This problem asks us to generate the shortest possible abbreviation for a given target word such that the abbreviation cannot also represent any word in the dictionary. An abbreviation replaces one or more non-adjacent substrings with their lengths.

LeetCode Problem 411

Difficulty: 🔴 Hard
Topics: Array, String, Backtracking, Bit Manipulation

Solution

LeetCode 411 - Minimum Unique Word Abbreviation

Problem Understanding

This problem asks us to generate the shortest possible abbreviation for a given target word such that the abbreviation cannot also represent any word in the dictionary.

An abbreviation replaces one or more non-adjacent substrings with their lengths. For example, the word "apple" can become:

  • "5" because the whole word is replaced
  • "a4" because "pple" is replaced
  • "1p3" because the first character is replaced, "p" is kept, and the last three characters are replaced

The important rule is that replaced sections cannot be adjacent. That means abbreviations like "s55n" are invalid because the two numeric parts would represent adjacent omitted regions.

The input consists of:

  • target, the word we want to abbreviate
  • dictionary, a list of words

The output must be:

  • A valid abbreviation of target
  • The shortest possible abbreviation
  • An abbreviation that does not also match any dictionary word

Two words share an abbreviation if the kept characters appear in the same positions and the omitted segments can account for the remaining characters.

The constraints are extremely important here:

  • target.length <= 21
  • log2(n) + m <= 21

The small target length strongly suggests that bitmasking over character positions is feasible. Since 2^21 is about two million, enumerating masks becomes possible with pruning and optimization.

Another key observation is that dictionary words with different lengths can never conflict with abbreviations of the target. Only dictionary words with the same length matter.

Important edge cases include:

  • Empty dictionary, where the shortest abbreviation is simply the length of the target
  • Dictionary words of different lengths, which should be ignored
  • Multiple shortest answers, where returning any valid one is acceptable
  • Target length of 1, where the answer may need to preserve the only character
  • Cases where every highly compressed abbreviation conflicts, forcing us to preserve more letters

Approaches

Brute Force Approach

A brute force solution would generate every possible abbreviation of the target string and test whether each abbreviation conflicts with any dictionary word.

Since each character can either be kept or omitted, there are 2^m possible masks for a target of length m.

For every mask:

  1. Construct the abbreviation
  2. Compare it against every dictionary word
  3. Determine whether the abbreviation could also represent that dictionary word

This works because it exhaustively checks all possibilities and guarantees that the shortest valid abbreviation is found.

However, the implementation becomes very expensive. Checking abbreviation equivalence directly for every generated abbreviation involves repeated parsing and matching operations.

The brute force complexity grows rapidly because:

  • There are 2^m masks
  • Each mask may need checking against up to n dictionary words
  • Each check may take O(m)

This leads to roughly O(2^m * n * m) time.

Key Insight for the Optimal Solution

The crucial insight is that an abbreviation is uniquely determined by the positions we keep.

Suppose we keep characters at certain positions and abbreviate everything else. A dictionary word conflicts with our abbreviation if all kept positions contain the same characters as the target.

This naturally leads to bitmasking.

For each dictionary word of the same length:

  • Compute a difference mask
  • A bit is 1 if the dictionary word differs from the target at that position

Now consider a candidate abbreviation mask:

  • 1 means we keep the character
  • 0 means we abbreviate it

The abbreviation distinguishes the target from a dictionary word if:

candidate_mask & difference_mask != 0

This means at least one kept position exposes a differing character.

So the problem becomes:

Find the mask with minimum abbreviation length such that it intersects every dictionary difference mask.

This transforms the problem into a compact bitmask search problem.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(2^m * n * m) O(m) Generates all abbreviations and explicitly validates them
Optimal O(2^m * n) O(n) Uses bitmask pruning and difference masks

Algorithm Walkthrough

Step 1: Filter Dictionary Words

Only dictionary words with the same length as the target can possibly share an abbreviation.

For example:

target = "apple"
dictionary = ["blade", "orange"]

Only "blade" matters because "orange" has length 6.

This immediately reduces unnecessary work.

Step 2: Build Difference Masks

For every remaining dictionary word:

  • Compare it with the target character by character
  • Set a bit whenever the characters differ

Example:

target = apple
word   = blade

Character comparison:

Position Target Word Different
0 a b yes
1 p l yes
2 p a yes
3 l d yes
4 e e no

Difference mask:

11110

This means positions 0,1,2,3 differ.

Step 3: Enumerate Candidate Masks

A candidate mask determines which positions are preserved.

Example:

00100

means:

  • Keep only index 2
  • Abbreviate everything else

For "apple" this becomes:

2p2

Step 4: Check Whether a Mask Is Unique

For every dictionary difference mask:

candidate_mask & difference_mask

If the result is zero, then every preserved position matches the dictionary word, meaning the abbreviation is ambiguous.

A valid abbreviation must satisfy:

(candidate_mask & diff) != 0

for every dictionary word.

Step 5: Compute Abbreviation Length

The abbreviation length is not the number of bits.

Consecutive omitted positions merge into one number.

For example:

mask = 10101

produces:

a1p1e

Length:

  • 3 letters
  • 2 numeric segments

Total = 5.

We compute this by scanning the mask and counting:

  • Preserved letters
  • Transitions into omitted regions

Step 6: Track the Best Mask

While enumerating masks:

  • Compute abbreviation length
  • Skip masks worse than the current best
  • Validate uniqueness
  • Store the best valid mask

Step 7: Convert Mask Into Abbreviation

Finally, reconstruct the abbreviation string:

  • Kept bits become letters
  • Consecutive omitted sections become counts

Example:

target = apple
mask   = 00100

Output:

2p2

Why it works

Every abbreviation corresponds exactly to a set of preserved positions. A dictionary word conflicts with the abbreviation only if it matches the target at every preserved position. Therefore, requiring every dictionary difference mask to intersect the candidate mask guarantees uniqueness. Enumerating all masks ensures we eventually examine the optimal solution, and minimizing abbreviation length guarantees the shortest valid abbreviation is returned.

Python Solution

from typing import List

class Solution:
    def minAbbreviation(self, target: str, dictionary: List[str]) -> str:
        n = len(target)

        dictionary = [word for word in dictionary if len(word) == n]

        if not dictionary:
            return str(n)

        diff_masks = []

        for word in dictionary:
            mask = 0

            for i in range(n):
                if target[i] != word[i]:
                    mask |= (1 << i)

            diff_masks.append(mask)

        def abbreviation_length(mask: int) -> int:
            length = 0
            i = 0

            while i < n:
                if mask & (1 << i):
                    length += 1
                    i += 1
                else:
                    length += 1
                    while i < n and not (mask & (1 << i)):
                        i += 1

            return length

        def is_unique(mask: int) -> bool:
            for diff in diff_masks:
                if (mask & diff) == 0:
                    return False
            return True

        best_mask = 0
        best_length = float("inf")

        for mask in range(1 << n):
            current_length = abbreviation_length(mask)

            if current_length >= best_length:
                continue

            if is_unique(mask):
                best_mask = mask
                best_length = current_length

        result = []
        count = 0

        for i in range(n):
            if best_mask & (1 << i):
                if count > 0:
                    result.append(str(count))
                    count = 0
                result.append(target[i])
            else:
                count += 1

        if count > 0:
            result.append(str(count))

        return "".join(result)

The implementation begins by filtering dictionary words to only those with the same length as the target. This is safe because abbreviations preserve total length structure.

Next, the code constructs difference masks. Each bit represents whether the dictionary word differs from the target at that position.

The abbreviation_length helper computes the true abbreviation length. Consecutive omitted positions form one numeric block, so the function counts transitions into omitted regions rather than individual omitted characters.

The is_unique helper checks whether the current mask distinguishes the target from every dictionary word. If any dictionary word has no overlap with the preserved positions, the abbreviation is ambiguous.

The main loop enumerates all masks from 0 to 2^n - 1. Masks that already exceed the current best abbreviation length are skipped early.

Finally, the best mask is converted into the final abbreviation string.

Go Solution

package main

import (
	"strconv"
	"strings"
)

func minAbbreviation(target string, dictionary []string) string {
	n := len(target)

	filtered := []string{}

	for _, word := range dictionary {
		if len(word) == n {
			filtered = append(filtered, word)
		}
	}

	if len(filtered) == 0 {
		return strconv.Itoa(n)
	}

	diffMasks := []int{}

	for _, word := range filtered {
		mask := 0

		for i := 0; i < n; i++ {
			if target[i] != word[i] {
				mask |= (1 << i)
			}
		}

		diffMasks = append(diffMasks, mask)
	}

	abbreviationLength := func(mask int) int {
		length := 0
		i := 0

		for i < n {
			if (mask & (1 << i)) != 0 {
				length++
				i++
			} else {
				length++

				for i < n && (mask&(1<<i)) == 0 {
					i++
				}
			}
		}

		return length
	}

	isUnique := func(mask int) bool {
		for _, diff := range diffMasks {
			if (mask & diff) == 0 {
				return false
			}
		}

		return true
	}

	bestMask := 0
	bestLength := 1 << 30

	for mask := 0; mask < (1 << n); mask++ {
		currentLength := abbreviationLength(mask)

		if currentLength >= bestLength {
			continue
		}

		if isUnique(mask) {
			bestMask = mask
			bestLength = currentLength
		}
	}

	var builder strings.Builder
	count := 0

	for i := 0; i < n; i++ {
		if (bestMask & (1 << i)) != 0 {
			if count > 0 {
				builder.WriteString(strconv.Itoa(count))
				count = 0
			}

			builder.WriteByte(target[i])
		} else {
			count++
		}
	}

	if count > 0 {
		builder.WriteString(strconv.Itoa(count))
	}

	return builder.String()
}

The Go implementation follows the same algorithmic structure as the Python version.

A strings.Builder is used for efficient string construction during abbreviation reconstruction. Bit operations are performed using standard integer masks, which is safe because the maximum target length is only 21 bits.

Go slices are used for storing filtered dictionary words and difference masks. Since Go does not have Python's dynamic closures with automatic variable capture behavior, helper functions are implemented as local anonymous functions.

Worked Examples

Example 1

target = "apple"
dictionary = ["blade"]

Step 1: Build Difference Mask

Compare "apple" and "blade":

Position Target Dictionary Different
0 a b yes
1 p l yes
2 p a yes
3 l d yes
4 e e no

Difference mask:

11110

Step 2: Try Candidate Masks

Mask Abbreviation Length Unique?
00000 5 1 No
00001 a4 2 Yes
10000 4e 2 No

The first valid shortest abbreviation is:

a4

Example 2

target = "apple"
dictionary = ["blade", "plain", "amber"]

Difference Masks

For "blade":

11110

For "plain":

10111

For "amber":

11011

Candidate Enumeration

Mask Abbreviation Valid?
00000 5 No
00001 a4 No
10000 4e No
00010 1p3 Yes

The answer becomes:

1p3

because the preserved 'p' distinguishes the target from every dictionary word.

Complexity Analysis

Measure Complexity Explanation
Time O(2^m * n) Enumerates all masks and validates against dictionary masks
Space O(n) Stores difference masks

The target length is at most 21, which makes exhaustive bitmask enumeration feasible. Each candidate mask is checked against all dictionary difference masks. The abbreviation length computation is O(m), but since m <= 21, it behaves like a small constant factor in practice.

Test Cases

from typing import List

class Solution:
    def minAbbreviation(self, target: str, dictionary: List[str]) -> str:
        n = len(target)

        dictionary = [word for word in dictionary if len(word) == n]

        if not dictionary:
            return str(n)

        diff_masks = []

        for word in dictionary:
            mask = 0

            for i in range(n):
                if target[i] != word[i]:
                    mask |= (1 << i)

            diff_masks.append(mask)

        def abbreviation_length(mask: int) -> int:
            length = 0
            i = 0

            while i < n:
                if mask & (1 << i):
                    length += 1
                    i += 1
                else:
                    length += 1
                    while i < n and not (mask & (1 << i)):
                        i += 1

            return length

        def is_unique(mask: int) -> bool:
            for diff in diff_masks:
                if (mask & diff) == 0:
                    return False
            return True

        best_mask = 0
        best_length = float("inf")

        for mask in range(1 << n):
            current_length = abbreviation_length(mask)

            if current_length >= best_length:
                continue

            if is_unique(mask):
                best_mask = mask
                best_length = current_length

        result = []
        count = 0

        for i in range(n):
            if best_mask & (1 << i):
                if count > 0:
                    result.append(str(count))
                    count = 0
                result.append(target[i])
            else:
                count += 1

        if count > 0:
            result.append(str(count))

        return "".join(result)

solution = Solution()

# Provided example 1
assert solution.minAbbreviation("apple", ["blade"]) == "a4"

# Provided example 2
assert solution.minAbbreviation("apple", ["blade", "plain", "amber"]) in {
    "1p3",
    "2p2",
    "3l1",
}

# Empty dictionary
assert solution.minAbbreviation("apple", []) == "5"

# Single character target
assert solution.minAbbreviation("a", ["b"]) == "1"

# Different length dictionary words ignored
assert solution.minAbbreviation("apple", ["orange", "hi"]) == "5"

# Need multiple preserved characters
result = solution.minAbbreviation("abcdef", ["abxdef", "abcxef"])
assert result != "6"

# All positions differ
assert solution.minAbbreviation("abc", ["xyz"]) == "1"

# Preserve ending character
result = solution.minAbbreviation("apple", ["apply"])
assert result in {"4e", "a4", "3l1"}

# Stress small max length behavior
result = solution.minAbbreviation(
    "abcdefghijklmnopqrstu",
    ["bbcdefghijklmnopqrstu"]
)
assert len(result) <= 2

Test Summary

Test Why
("apple", ["blade"]) Validates the first sample case
("apple", ["blade", "plain", "amber"]) Validates multiple dictionary conflicts
Empty dictionary Ensures immediate shortest abbreviation
Single character target Tests smallest valid target
Different length words Confirms filtering logic
Multiple preserved characters Ensures ambiguity handling
Completely different word Tests minimal distinguishing mask
Similar ending character Tests edge-position preservation
Maximum length target Stresses bitmask scaling

Edge Cases

One important edge case is an empty dictionary. In this scenario, no abbreviation can conflict with any other word, so the shortest possible abbreviation is simply the target length itself. A naive implementation might still attempt unnecessary searches, but this solution immediately returns str(n).

Another tricky case occurs when dictionary words have different lengths from the target. Such words can never share an abbreviation with the target because abbreviations preserve overall character count structure. The implementation safely filters these words out before processing.

A more subtle edge case happens when many abbreviations conflict and multiple preserved characters are required. For example, if every dictionary word differs at different positions, the solution may need to preserve several letters to uniquely identify the target. The bitmask intersection logic correctly handles this because a candidate abbreviation is valid only if it intersects every dictionary difference mask.

A final edge case involves abbreviations that compress the entire string into one number. For example, "5" for "apple". This abbreviation is extremely short, but it conflicts with every dictionary word of length 5. The algorithm properly rejects it whenever any dictionary difference mask fails to intersect the candidate mask.