LeetCode 804 - Unique Morse Code Words

The problem asks us to determine how many unique Morse code transformations exist among a list of words. Each lowercase English letter maps to a specific Morse code representation.

LeetCode Problem 804

Difficulty: 🟢 Easy
Topics: Array, Hash Table, String

Solution

Problem Understanding

The problem asks us to determine how many unique Morse code transformations exist among a list of words.

Each lowercase English letter maps to a specific Morse code representation. A word can be transformed into Morse code by replacing each character with its Morse equivalent and concatenating the results together into a single string.

For example, suppose we have the word "cab":

  • 'c' becomes "-.-."
  • 'a' becomes ".-"
  • 'b' becomes "-..."

Concatenating these gives:

"-.-..--..."

The goal is not to count how many words there are, but rather how many distinct Morse transformations exist after converting every word.

For instance, two different words may produce the same Morse sequence. In Example 1:

  • "gin" becomes "--...-."
  • "zen" also becomes "--...-."

Although the words are different, they share the same Morse transformation, so they count as one unique result.

The input consists of an array words, where:

  • words.length ranges from 1 to 100
  • Each word length ranges from 1 to 12
  • Every word contains only lowercase English letters

These constraints are quite small. Even a relatively inefficient solution would likely run comfortably within limits. However, since the problem is categorized as Easy and naturally maps to a hash-based approach, we should aim for a clean and efficient implementation.

An important observation is that words may map to identical Morse strings, so we need a way to eliminate duplicates efficiently. A naive implementation that manually checks whether a transformation has already appeared could become unnecessarily inefficient.

Several edge cases are worth considering upfront. A single-word input should always return 1, because there is exactly one transformation. Multiple identical words should still count as one unique transformation. Different words may also accidentally map to the same Morse representation, which means comparing only original words would be incorrect. The problem guarantees valid lowercase letters only, so we do not need to handle invalid characters, uppercase letters, or empty strings.

Approaches

Brute Force Approach

A brute-force solution would first convert every word into its Morse transformation and store those transformed strings in a list. For each newly generated Morse string, we would scan the existing list to check whether that transformation already exists.

This works because we explicitly avoid duplicate transformations. Every word gets translated, and before adding the transformation, we verify whether it is already present.

However, checking for duplicates inside a list requires a linear scan. If there are n words, and we compare each transformation against previously seen transformations, the total runtime can grow quadratically.

Although the constraints are small enough that this would still pass, it is not the most efficient or elegant approach.

Key Insight

The key insight is that this problem is fundamentally about counting unique values.

Whenever we need uniqueness checking, a hash set is usually the ideal data structure because insertion and membership checks are approximately O(1).

Instead of manually searching through previous transformations, we can:

  1. Convert each word into Morse code.
  2. Insert the transformation into a set.
  3. Return the size of the set.

Since sets automatically remove duplicates, words that generate identical Morse strings naturally collapse into a single entry.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n² × m) O(n × m) Converts words and scans a list for duplicates
Optimal O(n × m) O(n × m) Uses a hash set to track unique transformations efficiently

Here:

  • n = number of words
  • m = average length of a word

Since each word length is at most 12, m is very small, but the optimal solution is still cleaner and more scalable.

Algorithm Walkthrough

  1. Create a Morse code lookup table for all 26 lowercase letters.

We are given the Morse mapping in order from 'a' to 'z'. Since letters are lowercase English characters, we can use character indexing to quickly find each Morse representation. 2. Initialize an empty hash set.

The purpose of the set is to automatically store only distinct Morse transformations. Duplicate transformations will not create additional entries. 3. Iterate through each word in words.

For every word, we need to compute its Morse transformation. 4. Convert the word into Morse code.

For each character:

  • Compute its alphabet index using:
ord(character) - ord('a')
  • Retrieve the Morse code from the lookup table.
  • Append it to a temporary string.
  1. Add the completed Morse transformation to the set.

If another word already produced the same transformation, the set automatically ignores duplicates. 6. After processing all words, return the size of the set.

The number of unique transformations is exactly the number of unique elements stored.

Why it works

The algorithm works because every word is deterministically transformed into exactly one Morse string. The hash set maintains the invariant that it contains every unique transformation encountered so far, without duplicates. Since every word is processed exactly once and every transformation is inserted into the set, the final set size must equal the number of distinct Morse code transformations.

Python Solution

from typing import List

class Solution:
    def uniqueMorseRepresentations(self, words: List[str]) -> int:
        morse_codes = [
            ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....",
            "..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.",
            "--.-", ".-.", "...", "-", "..-", "...-", ".--", "-..-",
            "-.--", "--.."
        ]

        unique_transformations = set()

        for word in words:
            morse_word = "".join(
                morse_codes[ord(char) - ord("a")]
                for char in word
            )
            unique_transformations.add(morse_word)

        return len(unique_transformations)

The implementation begins by storing the Morse representations for all lowercase English letters in a list. Because the list is ordered alphabetically, we can directly map a character to its Morse code using its alphabetical index.

We then initialize a set called unique_transformations. This set is responsible for eliminating duplicates automatically.

For every word in the input, we build its Morse transformation using a generator expression combined with "".join(). Each character is converted into an index with:

ord(char) - ord("a")

This gives values between 0 and 25, corresponding to letters 'a' through 'z'.

After generating the Morse string for a word, we insert it into the set. If another word produces the same Morse sequence, the set keeps only one copy.

Finally, we return the number of unique entries in the set using len().

Go Solution

func uniqueMorseRepresentations(words []string) int {
	morseCodes := []string{
		".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....",
		"..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.",
		"--.-", ".-.", "...", "-", "..-", "...-", ".--", "-..-",
		"-.--", "--..",
	}

	uniqueTransformations := make(map[string]bool)

	for _, word := range words {
		morseWord := ""

		for _, char := range word {
			index := char - 'a'
			morseWord += morseCodes[index]
		}

		uniqueTransformations[morseWord] = true
	}

	return len(uniqueTransformations)
}

The Go implementation follows the same logic as the Python version. Since Go does not have a built-in set type, we simulate one using a map[string]bool.

Each Morse transformation is used as a key in the map. Assigning true simply marks the transformation as present.

One Go-specific detail is character handling. Iterating over a string with range produces runes, so we compute the alphabet index using:

index := char - 'a'

This works safely because the problem guarantees lowercase English letters.

Unlike Python, Go string concatenation inside loops can be less efficient for very large inputs. However, because each word length is capped at 12, this implementation is completely acceptable for the problem constraints.

Worked Examples

Example 1

Input:

words = ["gin","zen","gig","msg"]

We begin with:

unique_transformations = {}
Word Character Conversion Morse Transformation Set State
"gin" g -> "--.", i -> "..", n -> "-." "--...-." {"--...-."}
"zen" z -> "--..", e -> ".", n -> "-." "--...-." {"--...-."}
"gig" g -> "--.", i -> "..", g -> "--." "--...--." {"--...-.", "--...--."}
"msg" m -> "--", s -> "...", g -> "--." "--...--." {"--...-.", "--...--."}

At the end:

len(set) = 2

Output:

2

Example 2

Input:

words = ["a"]

We begin with:

unique_transformations = {}
Word Character Conversion Morse Transformation Set State
"a" a -> ".-" ".-" {".-"}

Final result:

len(set) = 1

Output:

1

Complexity Analysis

Measure Complexity Explanation
Time O(n × m) Each character in each word is processed once
Space O(n × m) The set stores Morse transformations

Let n be the number of words and m be the average word length.

Each word is traversed character by character exactly once, and converting each character to Morse code takes constant time. Therefore, total runtime is proportional to the total number of characters across all words, which is O(n × m).

The space complexity comes from storing unique Morse transformations in the set. In the worst case, every word produces a distinct transformation, requiring O(n × m) space to store all generated strings.

Test Cases

solution = Solution()

assert solution.uniqueMorseRepresentations(
    ["gin", "zen", "gig", "msg"]
) == 2  # Provided example with duplicate transformations

assert solution.uniqueMorseRepresentations(
    ["a"]
) == 1  # Single word input

assert solution.uniqueMorseRepresentations(
    ["a", "a", "a"]
) == 1  # Repeated identical words

assert solution.uniqueMorseRepresentations(
    ["abc", "abc"]
) == 1  # Duplicate words

assert solution.uniqueMorseRepresentations(
    ["gin", "zen"]
) == 1  # Different words, same Morse transformation

assert solution.uniqueMorseRepresentations(
    ["a", "b", "c"]
) == 3  # All unique transformations

assert solution.uniqueMorseRepresentations(
    ["z"]
) == 1  # Boundary character at end of alphabet

assert solution.uniqueMorseRepresentations(
    ["abcdefghijklmnopqrstuvwxyz"]
) == 1  # Long valid lowercase word

assert solution.uniqueMorseRepresentations(
    ["no", "on"]
) == 2  # Similar letters, different transformations
Test Why
["gin","zen","gig","msg"] Validates provided example and duplicate Morse sequences
["a"] Validates minimum input size
["a","a","a"] Ensures duplicate words count once
["abc","abc"] Confirms identical transformations are deduplicated
["gin","zen"] Verifies different words can map to same Morse code
["a","b","c"] Confirms multiple unique transformations are counted
["z"] Tests alphabet boundary indexing
["abcdefghijklmnopqrstuvwxyz"] Tests long valid word handling
["no","on"] Ensures order affects Morse transformation

Edge Cases

One important edge case is a single-word input. Since there is only one transformation possible, the answer must always be 1. A careless implementation might accidentally mishandle initialization or return 0 for empty structures, but our approach correctly inserts the transformation into the set and returns its size.

Another important case involves multiple identical words, such as ["a", "a", "a"]. Since every word transforms identically, the expected answer is still 1. This is exactly where a hash set is useful, because duplicate insertions do not increase the set size.

A more subtle edge case occurs when different words map to the same Morse transformation, such as "gin" and "zen". A naive implementation that counts unique words instead of unique Morse strings would fail here. Our algorithm correctly transforms words first and only tracks uniqueness at the Morse level.

Finally, the problem includes alphabet boundary characters, especially 'a' and 'z'. Since indexing depends on subtracting 'a', off-by-one errors are possible. Our implementation safely computes indices from 0 to 25, which align exactly with the Morse lookup table.