LeetCode 2135 - Count Words Obtained After Adding a Letter

The problem requires us to determine how many strings in targetWords can be formed from strings in startWords through a specific transformation.

LeetCode Problem 2135

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Bit Manipulation, Sorting

Solution

Problem Understanding

The problem requires us to determine how many strings in targetWords can be formed from strings in startWords through a specific transformation. The transformation involves two operations: appending a single letter not already present in the string, and then rearranging all letters arbitrarily. Essentially, we are checking whether, by adding one new character and permuting the result, a target word can be produced from some starting word.

The input consists of two arrays of strings: startWords and targetWords, each containing lowercase English letters only, with no duplicates within a string. The output is a single integer representing the number of target words that can be obtained from any start word.

Constraints are important: there can be up to 50,000 words in each array, and each string has at most 26 letters. This suggests that a brute-force solution that checks every permutation is infeasible due to combinatorial explosion. Also, the unique-letter guarantee simplifies the problem because we do not need to consider repeated letters when performing operations.

Key edge cases include empty strings (although the constraints guarantee length ≥1), single-letter strings, and cases where the transformation is impossible due to overlapping letters.

Approaches

The brute-force approach would attempt to simulate the operation for each startWord by generating all possible strings obtained by adding one new letter and permuting the result, then checking if any match a targetWord. This approach is correct but inefficient, as generating all permutations of a string is factorial in length, which is prohibitively expensive for strings of length up to 26 and arrays of size 50,000.

The optimal approach leverages the fact that each string contains unique letters and uses bit masking to represent sets of letters efficiently. Each string can be encoded as a 26-bit integer where each bit represents the presence of a letter. For startWords, we store all bitmasks in a set. For each targetWord, we generate bitmasks by removing one letter at a time (since the operation involves adding a letter to get the target), and then check if the resulting mask exists in the start set. This is efficient because checking membership in a set is O(1), and generating up to 26 masks per target word is feasible.

Approach Time Complexity Space Complexity Notes
Brute Force O(S * T * 26!) O(S * 26!) Generate all possible permutations after adding a letter
Optimal O(S + 26 * T) O(S) Use bitmask representation for efficient checking

Algorithm Walkthrough

  1. Initialize an empty set start_masks to store bitmask representations of all startWords.
  2. For each word in startWords, convert it into a bitmask where each bit represents the presence of a letter. Add this mask to start_masks.
  3. Initialize a counter count to zero. This will track the number of targetWords that can be obtained.
  4. For each word in targetWords, convert it into a bitmask representation.
  5. For each letter in the targetWord, generate a new mask by removing that letter (i.e., subtract the bit corresponding to that letter from the target mask). This represents a possible startWord that could generate the target.
  6. If any of these generated masks exists in start_masks, increment count and break to the next target word.
  7. After processing all target words, return count.

Why it works: Each target word can only be obtained by adding one letter to a start word. By reversing the operation (removing one letter), we efficiently check all possible start word candidates using bitmasking. Since the mapping is unique and checking is O(1), this guarantees correctness.

Python Solution

from typing import List

class Solution:
    def wordCount(self, startWords: List[str], targetWords: List[str]) -> int:
        def word_to_mask(word: str) -> int:
            mask = 0
            for ch in word:
                mask |= 1 << (ord(ch) - ord('a'))
            return mask

        start_masks = {word_to_mask(word) for word in startWords}
        count = 0

        for word in targetWords:
            target_mask = word_to_mask(word)
            for ch in word:
                possible_start = target_mask & ~(1 << (ord(ch) - ord('a')))
                if possible_start in start_masks:
                    count += 1
                    break

        return count

The function word_to_mask converts a word into a bitmask, efficiently encoding all letters. start_masks stores all starting word masks for quick lookup. For each target word, removing each letter and checking membership ensures we only count valid transformations, avoiding unnecessary computation.

Go Solution

func wordCount(startWords []string, targetWords []string) int {
    wordToMask := func(word string) int {
        mask := 0
        for i := 0; i < len(word); i++ {
            mask |= 1 << (word[i] - 'a')
        }
        return mask
    }

    startMasks := make(map[int]struct{})
    for _, word := range startWords {
        startMasks[wordToMask(word)] = struct{}{}
    }

    count := 0
    for _, word := range targetWords {
        targetMask := wordToMask(word)
        for i := 0; i < len(word); i++ {
            possibleStart := targetMask & ^(1 << (word[i] - 'a'))
            if _, exists := startMasks[possibleStart]; exists {
                count++
                break
            }
        }
    }

    return count
}

In Go, the bitmask conversion and map usage are slightly different syntactically. We use an empty struct as the map value to minimize memory. Logic remains identical to the Python version.

Worked Examples

Example 1: startWords = ["ant","act","tack"], targetWords = ["tack","act","acti"]

  1. Convert startWords to masks: ant -> 0b111001, act -> 0b1011, tack -> 0b11101
  2. Target "tack" mask: 0b11101. Removing letters:
  • Remove 't' -> 0b101 -> not in start
  • Remove 'a' -> 0b1101 -> not in start
  • Remove 'c' -> 0b10101 -> not in start
  • Remove 'k' -> 0b1011 -> matches "act" → count += 1
  1. Target "act" mask: cannot remove a letter to match any start → skip
  2. Target "acti" mask: 0b11011. Removing letters:
  • Remove 'i' -> 0b1011 → matches "act" → count += 1

Result = 2

Complexity Analysis

Measure Complexity Explanation
Time O(S + 26 * T) Convert each start word to mask O(S) and for each target word, generate up to 26 masks and check O(1) membership
Space O(S) Store all start word masks

This complexity is feasible given constraints, as S and T can be up to 50,000.

Test Cases

# Provided examples
assert Solution().wordCount(["ant","act","tack"], ["tack","act","acti"]) == 2
assert Solution().wordCount(["ab","a"], ["abc","abcd"]) == 1
# Edge cases
assert Solution().wordCount(["a"], ["ab"]) == 1  # smallest strings
assert Solution().wordCount(["a"], ["a"]) == 0   # cannot add a letter to itself
assert Solution().wordCount(["xyz"], ["wxyz"]) == 1  # adding a missing letter
assert Solution().wordCount(["abc"], ["def"]) == 0   # impossible conversion
Test Why
["ant","act","tack"], ["tack","act","acti"] Standard example with multiple matches
["ab","a"], ["abc","abcd"] Only first target is achievable
["a"], ["ab"] Smallest strings, single addition
["a"], ["a"] Cannot add a letter that already exists
["xyz"], ["wxyz"] Adding a missing letter to form target
["abc"], ["def"] Impossible transformation

Edge Cases

Single-letter start and target: When startWords or targetWords have only one letter, adding any letter is trivial, but we must ensure we do not attempt to "add" the same letter, which is forbidden. The implementation correctly uses bitmasking and avoids this.

Target identical to start: If a target word is already in startWords, it cannot be counted, as the operation requires adding a letter. The algorithm handles this by generating masks with one letter removed; since removal of a non-existent letter does not create the same mask, the target is correctly skipped.

Maximum size inputs: With 50,000 words and up to 26 letters, brute-force permutation approaches fail. The bitmask approach remains efficient because it reduces string comparison to integer comparison and limits iterations to 26 per target. This ensures scalability.

This solution is robust, efficient, and handles all edge cases dictated by the problem constraints.