LeetCode 1178 - Number of Valid Words for Each Puzzle

The problem asks us to determine, for each given puzzle string, how many words from a given list words are valid for that puzzle. A word is valid for a puzzle if it contains the first character of the puzzle and all characters of the word are also present in the puzzle.

LeetCode Problem 1178

Difficulty: 🔴 Hard
Topics: Array, Hash Table, String, Bit Manipulation, Trie

Solution

Problem Understanding

The problem asks us to determine, for each given puzzle string, how many words from a given list words are valid for that puzzle. A word is valid for a puzzle if it contains the first character of the puzzle and all characters of the word are also present in the puzzle. The input consists of a list of words and a list of puzzles, each exactly 7 letters long with unique characters. The output is an array where each element corresponds to the number of valid words for the respective puzzle.

Constraints indicate potentially large input sizes (words up to 10^5, puzzles up to 10^4), meaning a naive solution that checks each word against each puzzle (O(n*m) where n is number of words and m is number of puzzles) would be too slow. Edge cases include puzzles with no valid words, words that exceed 7 letters, or words sharing letters that are not in the puzzle.

Approaches

Brute Force

The brute-force solution would iterate through each puzzle, then check every word for the validity conditions: first letter must match, and all letters must exist in the puzzle. This approach is correct but inefficient because it requires O(n * m * k) operations where n is the number of words, m is the number of puzzles, and k is the average word length. Given constraints, this is not feasible.

Optimal

The optimal approach leverages bit manipulation and hash maps. Each word and puzzle can be represented as a bitmask of 26 bits, with each bit representing a letter. This allows quick checks for subset inclusion using bitwise operations.

Key insight: For a puzzle, we only need to consider subsets of its 6 letters (excluding the first letter), combine each subset with the first letter, and check how many words in the preprocessed word bitmask map match. This reduces complexity because there are at most 2^6 = 64 subsets per puzzle, which is manageable even with 10^4 puzzles.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m * k) O(n * k) Check every word against every puzzle, too slow
Optimal O(n * k + m * 2^6) O(n) Use bitmask + hashmap for subset checks, very efficient

Algorithm Walkthrough

  1. Convert each word into a 26-bit integer bitmask representing its letters. Store a count of each unique bitmask in a hash map.
  2. For each puzzle, create a bitmask for all its letters.
  3. Generate all 2^6 subsets of the last 6 letters (excluding the first character). For each subset, add the bit of the first letter.
  4. For each subset bitmask, check if it exists in the word bitmask map and sum their counts.
  5. Append the sum to the answer array.

Why it works: Representing words and puzzles as bitmasks allows constant-time subset checks. Generating all subsets ensures all valid word combinations are counted while ensuring the first letter constraint is satisfied.

Python Solution

from typing import List
from collections import Counter
from itertools import combinations

class Solution:
    def findNumOfValidWords(self, words: List[str], puzzles: List[str]) -> List[int]:
        word_count = Counter()
        
        # Convert each word into a bitmask
        for word in words:
            mask = 0
            for ch in set(word):
                mask |= 1 << (ord(ch) - ord('a'))
            word_count[mask] += 1
        
        res = []
        for puzzle in puzzles:
            first = 1 << (ord(puzzle[0]) - ord('a'))
            total = 0
            # Generate all subsets of the last 6 letters
            mask_letters = [1 << (ord(ch) - ord('a')) for ch in puzzle[1:]]
            for k in range(0, 7):
                for combo in combinations(mask_letters, k):
                    mask = first
                    for c in combo:
                        mask |= c
                    total += word_count.get(mask, 0)
            res.append(total)
        return res

Explanation: The solution first precomputes the word bitmasks and stores their frequencies. For each puzzle, it iterates through all subsets of the last six letters and combines them with the first letter, checking the frequency map. The result is the sum of all valid matches for the puzzle.

Go Solution

package main

func findNumOfValidWords(words []string, puzzles []string) []int {
    wordCount := make(map[int]int)
    
    for _, word := range words {
        mask := 0
        seen := make(map[rune]bool)
        for _, ch := range word {
            if !seen[ch] {
                mask |= 1 << (ch - 'a')
                seen[ch] = true
            }
        }
        wordCount[mask]++
    }
    
    res := make([]int, len(puzzles))
    
    for i, puzzle := range puzzles {
        first := 1 << (puzzle[0] - 'a')
        var masks []int
        // generate masks of subsets of last 6 letters
        masks = append(masks, 0)
        for j := 1; j < 7; j++ {
            letterMask := 1 << (puzzle[j] - 'a')
            newMasks := make([]int, len(masks))
            copy(newMasks, masks)
            for _, m := range masks {
                newMasks = append(newMasks, m|letterMask)
            }
            masks = newMasks
        }
        count := 0
        for _, m := range masks {
            count += wordCount[m|first]
        }
        res[i] = count
    }
    
    return res
}

Explanation: Go implementation follows the same bitmask logic. A map tracks word frequencies, and subsets are generated iteratively. The first letter of each puzzle is always included in the mask check to satisfy the validity condition.

Worked Examples

Example: words = ["aaaa","asas","able","ability","actt","actor","access"], puzzles = ["aboveyz"]

  1. Convert words to bitmasks:
  • "aaaa" → 1 (a)
  • "asas" → 3 (a|s)
  • "able" → 15 (a|b|l|e)
  • "ability" → 1027 (a|b|i|l|t|y)
  • "actt" → 21 (a|c|t)
  • "actor" → 23 (a|c|t|o|r)
  • "access" → 23 (a|c|e|s)
  1. Puzzle "aboveyz", first letter bit = 1 (a)
  2. Generate subsets of remaining letters: "boveyz"
  3. Combine first letter with each subset and check word map. Only mask 1 matches → "aaaa"
  4. Result for this puzzle = 1

Complexity Analysis

Measure Complexity Explanation
Time O(n * k + m * 2^6) n is number of words, k is average word length, m is number of puzzles. 2^6 subsets per puzzle
Space O(n) Store frequency of word bitmasks in map

Bitmasking reduces the subset check from O(k) per word to O(1), making it feasible for large inputs.

Test Cases

# Example 1
assert Solution().findNumOfValidWords(
    ["aaaa","asas","able","ability","actt","actor","access"], 
    ["aboveyz","abrodyz","abslute","absoryz","actresz","gaswxyz"]
) == [1,1,3,2,4,0]

# Example 2
assert Solution().findNumOfValidWords(
    ["apple","pleas","please"], 
    ["aelwxyz","aelpxyz","aelpsxy","saelpxy","xaelpsy"]
) == [0,1,3,2,0]

# Edge cases
assert Solution().findNumOfValidWords(["aaaa"], ["aaaaaaa"]) == [1]  # word exactly subset
assert Solution().findNumOfValidWords(["aaaa"], ["bcdefgh"]) == [0]  # first letter missing
assert Solution().findNumOfValidWords([], ["abcdefg"]) == [0]        # no words
Test Why
Provided examples Validates standard behavior
Single word subset Ensures subset and first letter logic
Missing first letter Ensures first letter constraint is enforced
Empty word list Handles no input edge case

Edge Cases

  1. Word longer than puzzle: Words longer than 7 letters cannot be valid because all letters must be in the puzzle. The bitmask naturally handles this as such words include letters outside the puzzle mask.
  2. Puzzle with no valid words: Some puzzles may contain letters that do not appear in any word. Our algorithm still generates all subsets, but the hash map lookup will return 0 for non-existent masks.
  3. Words with duplicate letters: Duplicate letters in words do not affect validity because we convert each word into a set before creating the bitmask. This ensures duplicates do not inflate the count.