LeetCode 916 - Word Subsets
The problem asks us to determine which strings in words1 satisfy all character requirements imposed by every string in words2. A string b is considered a subset of another string a if every character in b appears in a at least as many times as it appears in b.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String
Solution
Problem Understanding
The problem asks us to determine which strings in words1 satisfy all character requirements imposed by every string in words2.
A string b is considered a subset of another string a if every character in b appears in a at least as many times as it appears in b. This includes repeated characters. For example, "eoo" requires one 'e' and two 'o' characters.
For each word in words1, we must check whether it satisfies every constraint from every word in words2. If it does, it is called a universal word and should be included in the result.
The input consists of:
words1, the list of candidate words we want to testwords2, the list of required character patterns
The output is a list containing all universal words from words1.
The constraints are important:
- Both arrays can contain up to
10^4words - Each word length is at most
10 - Only lowercase English letters are used
The short word length is significant because it means character frequency counting is cheap. However, the number of words can be large, so repeatedly comparing every pair of words would become expensive.
A naive implementation could easily become too slow if it checks every word in words1 against every word in words2 independently.
There are several important edge cases:
words2may contain overlapping requirements such as"e"and"ee"- Multiple words in
words2may require different characters - A required word may contain repeated letters such as
"ccc" - A candidate word may contain extra characters, which should not matter
- Very short words, including single-character strings, must still work correctly
The problem guarantees that all strings in words1 are unique, which simplifies result handling because duplicates do not need special treatment.
Approaches
Brute Force Approach
The most direct solution is to test every word in words1 against every word in words2.
For each candidate word:
- Build a frequency count of its characters
- For every word in
words2:
- Build another frequency count
- Verify that the candidate contains each required character with sufficient multiplicity
If the candidate passes every check, add it to the result.
This approach is correct because it explicitly validates every requirement independently. However, it repeats a large amount of work. If several words in words2 require the same characters, those requirements are recalculated over and over again.
With up to 10^4 words in each array, this repeated checking becomes inefficient.
Key Insight for the Optimal Solution
The critical observation is that all words in words2 can be merged into a single combined requirement.
Suppose words2 = ["e", "oo"].
A universal word only needs:
- at least one
'e' - at least two
'o'
There is no need to separately check both words once we know the maximum required count for every character.
Similarly:
"e"requirese:1"ee"requirese:2
The true combined requirement is simply e:2.
Therefore, instead of checking every word in words2 independently, we can compute one global frequency requirement array where each character stores the maximum frequency required across all words in words2.
Then every candidate word in words1 only needs to be checked once against this combined requirement.
This dramatically reduces redundant work.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n × m × 26) | O(26) | Rechecks every constraint repeatedly |
| Optimal | O((n + m) × 26) | O(26) | Merges all requirements into one frequency profile |
Here:
n = total characters across words1m = total characters across words2
Since each word length is at most 10, the complexity is effectively linear in the number of words.
Algorithm Walkthrough
- Create a frequency array representing the maximum required count for each character.
We use an array of size 26 because the input only contains lowercase English letters. Index 0 corresponds to 'a', index 1 to 'b', and so on.
2. Iterate through every word in words2.
For each word, compute its character frequency count. 3. Merge the current word's frequencies into the global requirement array.
For every character:
- keep the maximum count seen so far
- this represents the strongest requirement among all words
For example:
"e"givese:1"ee"givese:2
The merged requirement becomes e:2.
4. Iterate through every word in words1.
For each candidate word:
- compute its character frequency count
- compare it against the global requirement array
- Verify whether the candidate satisfies every requirement.
For every character:
- if the candidate frequency is smaller than the required frequency, reject the word immediately
- If all requirements are satisfied, append the word to the result list.
- Return the final result list.
Why it works
The algorithm works because the merged requirement array captures the strongest requirement for every character across all words in words2.
If a candidate word satisfies these maximum requirements, then it automatically satisfies every individual word in words2.
This is sufficient because subset relationships are determined independently for each character frequency.
Python Solution
from typing import List
class Solution:
def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
def count_chars(word: str) -> List[int]:
counts = [0] * 26
for ch in word:
counts[ord(ch) - ord('a')] += 1
return counts
required = [0] * 26
# Build the maximum requirement from words2
for word in words2:
word_counts = count_chars(word)
for i in range(26):
required[i] = max(required[i], word_counts[i])
result = []
# Check each word in words1
for word in words1:
word_counts = count_chars(word)
is_universal = True
for i in range(26):
if word_counts[i] < required[i]:
is_universal = False
break
if is_universal:
result.append(word)
return result
The implementation begins with a helper function named count_chars, which converts a word into a frequency array of size 26. This representation allows constant-time access to character counts.
The required array stores the merged requirements from all words in words2. For every word in words2, we compute its frequency count and update each character position with the maximum value seen so far.
After preprocessing the requirements, we iterate through each candidate word in words1. Its frequency array is compared against the global requirement array.
If the candidate lacks enough occurrences of any character, it is rejected immediately using an early break. Otherwise, it is added to the result list.
The final result contains all universal words.
Go Solution
func wordSubsets(words1 []string, words2 []string) []string {
countChars := func(word string) [26]int {
var counts [26]int
for _, ch := range word {
counts[ch-'a']++
}
return counts
}
var required [26]int
// Build maximum requirement from words2
for _, word := range words2 {
wordCounts := countChars(word)
for i := 0; i < 26; i++ {
if wordCounts[i] > required[i] {
required[i] = wordCounts[i]
}
}
}
result := []string{}
// Check each word in words1
for _, word := range words1 {
wordCounts := countChars(word)
isUniversal := true
for i := 0; i < 26; i++ {
if wordCounts[i] < required[i] {
isUniversal = false
break
}
}
if isUniversal {
result = append(result, word)
}
}
return result
}
The Go solution closely mirrors the Python implementation. One notable difference is the use of fixed-size arrays [26]int instead of slices for character frequencies. Since the alphabet size is constant, fixed arrays are efficient and avoid additional allocations.
Go strings are iterated using range, which produces runes. Because the input is guaranteed to contain only lowercase English letters, subtracting 'a' safely maps characters into indices from 0 to 25.
The result slice starts empty and grows dynamically with append.
Worked Examples
Example 1
Input:
words1 = ["amazon","apple","facebook","google","leetcode"]
words2 = ["e","o"]
First, build the combined requirement.
| Word | Frequency Requirement |
|---|---|
| "e" | e:1 |
| "o" | o:1 |
Merged requirement:
| Character | Required Count |
|---|---|
| e | 1 |
| o | 1 |
Now evaluate each word.
| Word | Contains e? | Contains o? | Universal? |
|---|---|---|---|
| amazon | No | Yes | No |
| apple | Yes | No | No |
| Yes | Yes | Yes | |
| Yes | Yes | Yes | |
| leetcode | Yes | Yes | Yes |
Final result:
["facebook","google","leetcode"]
Example 2
Input:
words1 = ["amazon","apple","facebook","google","leetcode"]
words2 = ["lc","eo"]
Frequency requirements:
| Word | Requirement |
|---|---|
| "lc" | l:1, c:1 |
| "eo" | e:1, o:1 |
Merged requirement:
| Character | Required Count |
|---|---|
| l | 1 |
| c | 1 |
| e | 1 |
| o | 1 |
Check each candidate.
| Word | Meets Requirement? |
|---|---|
| amazon | Missing l and c |
| apple | Missing c and o |
| Missing l | |
| Missing c | |
| leetcode | Yes |
Final result:
["leetcode"]
Example 3
Input:
words1 = ["acaac","cccbb","aacbb","caacc","bcbbb"]
words2 = ["c","cc","b"]
Requirements:
| Word | Requirement |
|---|---|
| "c" | c:1 |
| "cc" | c:2 |
| "b" | b:1 |
Merged requirement:
| Character | Required Count |
|---|---|
| c | 2 |
| b | 1 |
Check candidates.
| Word | c Count | b Count | Universal? |
|---|---|---|---|
| acaac | 2 | 0 | No |
| cccbb | 3 | 2 | Yes |
| aacbb | 1 | 2 | No |
| caacc | 3 | 0 | No |
| bcbbb | 1 | 4 | No |
Final result:
["cccbb"]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O((N + M) × 26) | Each word is processed once with fixed-size frequency comparisons |
| Space | O(26) | Only constant-sized frequency arrays are used |
Here:
Nis the total number of characters acrosswords1Mis the total number of characters acrosswords2
Because the alphabet size is fixed at 26 lowercase letters, all frequency operations are effectively constant time. The algorithm scales linearly with the total input size.
Test Cases
from typing import List
class Solution:
def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
def count_chars(word: str) -> List[int]:
counts = [0] * 26
for ch in word:
counts[ord(ch) - ord('a')] += 1
return counts
required = [0] * 26
for word in words2:
word_counts = count_chars(word)
for i in range(26):
required[i] = max(required[i], word_counts[i])
result = []
for word in words1:
word_counts = count_chars(word)
valid = True
for i in range(26):
if word_counts[i] < required[i]:
valid = False
break
if valid:
result.append(word)
return result
solution = Solution()
# Example 1
assert solution.wordSubsets(
["amazon","apple","facebook","google","leetcode"],
["e","o"]
) == ["facebook","google","leetcode"] # basic multiple requirements
# Example 2
assert solution.wordSubsets(
["amazon","apple","facebook","google","leetcode"],
["lc","eo"]
) == ["leetcode"] # combined character requirements
# Example 3
assert solution.wordSubsets(
["acaac","cccbb","aacbb","caacc","bcbbb"],
["c","cc","b"]
) == ["cccbb"] # repeated character requirement
# Single character exact match
assert solution.wordSubsets(
["a", "b", "c"],
["a"]
) == ["a"] # smallest valid input
# Multiple repeated letters
assert solution.wordSubsets(
["aabbcc", "abc", "abcc"],
["aabc"]
) == ["aabbcc"] # multiplicity matters
# No valid universal words
assert solution.wordSubsets(
["abc", "def"],
["zzz"]
) == [] # impossible requirement
# All words valid
assert solution.wordSubsets(
["abc", "abcd", "abcde"],
["a"]
) == ["abc", "abcd", "abcde"] # every word satisfies condition
# Requirement merging test
assert solution.wordSubsets(
["eeeeoo", "eeo", "eo"],
["e", "ee", "ooo"]
) == [] # merged max counts eliminate all
# Complex overlap
assert solution.wordSubsets(
["leetcode", "google", "facebook"],
["ec", "oc", "ceo"]
) == ["leetcode"] # overlapping requirements
| Test | Why |
|---|---|
| Basic multiple requirements | Verifies independent character requirements |
| Combined character requirements | Ensures merged constraints work |
| Repeated character requirement | Confirms multiplicity handling |
| Smallest valid input | Tests minimum boundary |
| Multiplicity matters | Ensures repeated letters are counted |
| Impossible requirement | Verifies empty result handling |
| Every word valid | Confirms broad acceptance |
| Requirement merging | Tests maximum-frequency merging logic |
| Complex overlap | Validates overlapping constraints |
Edge Cases
One important edge case occurs when words2 contains repeated requirements for the same character, such as ["e", "ee", "eee"]. A naive implementation might incorrectly sum these counts and require six 'e' characters. The correct behavior is to require only the maximum count, which is three. The implementation handles this correctly by storing the maximum frequency for each character.
Another important case involves repeated letters inside a single word, such as "ccc" or "aabb". A buggy implementation might only check whether a character exists rather than whether it appears enough times. This solution uses full frequency arrays, so multiplicity is handled precisely.
A third edge case occurs when no words satisfy the constraints. For example, if words2 = ["zzz"] and no candidate contains three 'z' characters, the result should be an empty list. The implementation naturally handles this because every candidate fails the frequency comparison and nothing is appended to the result.
A subtle case involves words that contain many extra characters. Extra characters should not invalidate a candidate. For example, "facebook" is still valid for the requirement "eo" even though it contains many unrelated letters. The algorithm only checks lower bounds on required frequencies, so extra characters are ignored correctly.
Finally, single-character inputs are important boundary cases. Arrays containing words like "a" or "b" verify that indexing logic and frequency counting work correctly even at minimal sizes.