LeetCode 3295 - Report Spam Message
This problem asks us to determine whether a given message should be classified as spam. We are given two arrays of strings: - message, which contains the words appearing in the message. - bannedWords, which contains words that are considered banned.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String
Solution
LeetCode 3295 - Report Spam Message
Problem Understanding
This problem asks us to determine whether a given message should be classified as spam.
We are given two arrays of strings:
message, which contains the words appearing in the message.bannedWords, which contains words that are considered banned.
A message is considered spam if at least two words in message exactly match words contained in bannedWords.
The important detail is that we are counting occurrences within the message array. Every word in message that appears in bannedWords contributes to the count. As soon as the count reaches two, the message is spam and we should return true.
If fewer than two words from the message appear in the banned list, we return false.
The constraints are quite large:
message.lengthcan be up to10^5bannedWords.lengthcan be up to10^5- Each word length is at most 15
These constraints immediately suggest that repeatedly scanning one array inside another would be too slow. We need a solution that can efficiently check whether a word belongs to the banned list.
Several edge cases are worth noting:
- The message may contain exactly two banned words, which should return
true. - The message may contain only one banned word, which should return
false. - The same banned word may appear multiple times in the message. Each occurrence counts toward the total.
- The banned list may contain many words that never appear in the message.
- The message may have length 1, making it impossible to reach the required count of two.
Because matching is described as an exact match, partial matches do not count.
Approaches
Brute Force
A straightforward approach is to examine every word in message and check whether it exists in bannedWords by scanning the entire banned list.
For each message word, we compare it against every banned word until we either find a match or exhaust the list.
This approach is correct because it explicitly checks every possible membership relationship. However, it is inefficient. If both arrays contain 10^5 elements, the algorithm may perform up to 10^10 comparisons, which is far too slow.
Optimal Approach, Hash Set
The key observation is that the only operation we need is membership testing:
Is this message word contained in the banned word list?
A hash set provides average-case O(1) lookup time.
We can first insert all banned words into a hash set. Then we iterate through the message words and check membership in constant time. We maintain a counter of matched words. Once the counter reaches two, we immediately return true.
This avoids unnecessary work and scales efficiently to the largest allowed inputs.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n × m) | O(1) | Check every message word against every banned word |
| Optimal | O(n + m) | O(m) | Store banned words in a hash set for constant-time lookup |
Where:
n = len(message)m = len(bannedWords)
Algorithm Walkthrough
- Create a hash set containing all words from
bannedWords. - Initialize a variable
matches = 0. - Iterate through every word in
message. - For each word, check whether it exists in the hash set.
- If it exists, increment
matches. - After incrementing, check whether
matches >= 2. - If the count has reached two, immediately return
truebecause the message satisfies the spam condition. - If the loop finishes without reaching two matches, return
false.
Why it works
The hash set contains exactly the banned words. Therefore, a lookup correctly determines whether a message word is banned. The algorithm counts every banned-word occurrence in the message. A message is spam if and only if at least two such occurrences exist. Since the algorithm returns true precisely when the count reaches two and false otherwise, it always produces the correct result.
Python Solution
from typing import List
class Solution:
def reportSpam(self, message: List[str], bannedWords: List[str]) -> bool:
banned_set = set(bannedWords)
matches = 0
for word in message:
if word in banned_set:
matches += 1
if matches >= 2:
return True
return False
The implementation begins by converting bannedWords into a set. This is the crucial optimization because set membership checks are average-case O(1).
The variable matches keeps track of how many message words have been found in the banned set.
We then iterate through the message array. Whenever a word appears in the banned set, we increment the counter. As soon as the counter reaches two, we can immediately return True because the spam condition has already been satisfied.
If the entire message is processed and fewer than two matches are found, the function returns False.
Go Solution
func reportSpam(message []string, bannedWords []string) bool {
bannedSet := make(map[string]struct{})
for _, word := range bannedWords {
bannedSet[word] = struct{}{}
}
matches := 0
for _, word := range message {
if _, exists := bannedSet[word]; exists {
matches++
if matches >= 2 {
return true
}
}
}
return false
}
The Go solution uses a map[string]struct{} to represent a hash set. The empty struct occupies zero bytes, making it a common and memory-efficient choice for set implementations in Go.
Other than the data structure syntax, the logic is identical to the Python version. No special handling for integer overflow is required because the count can never exceed 10^5.
Worked Examples
Example 1
Input
message = ["hello", "world", "leetcode"]
bannedWords = ["world", "hello"]
First, build the set:
banned_set = {"hello", "world"}
| Current Word | In Set? | Matches |
|---|---|---|
| hello | Yes | 1 |
| world | Yes | 2 |
When matches becomes 2, we immediately return true.
Output
true
Example 2
Input
message = ["hello", "programming", "fun"]
bannedWords = ["world", "programming", "leetcode"]
Build the set:
banned_set = {"world", "programming", "leetcode"}
| Current Word | In Set? | Matches |
|---|---|---|
| hello | No | 0 |
| programming | Yes | 1 |
| fun | No | 1 |
The loop finishes with only one match.
Output
false
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | Building the set takes O(m), scanning the message takes O(n) |
| Space | O(m) | The hash set stores all banned words |
The dominant operations are inserting all banned words into the set and performing one lookup per message word. Since hash set operations are average-case constant time, the total running time is linear in the combined input size.
Test Cases
sol = Solution()
assert sol.reportSpam(
["hello", "world", "leetcode"],
["world", "hello"]
) is True # provided example, exactly two matches
assert sol.reportSpam(
["hello", "programming", "fun"],
["world", "programming", "leetcode"]
) is False # provided example, only one match
assert sol.reportSpam(
["spam", "spam"],
["spam"]
) is True # repeated banned word counts twice
assert sol.reportSpam(
["spam"],
["spam"]
) is False # only one occurrence
assert sol.reportSpam(
["a", "b", "c"],
["x", "y", "z"]
) is False # no matches
assert sol.reportSpam(
["a", "b", "c"],
["a", "b", "x"]
) is True # first two words match
assert sol.reportSpam(
["a", "b", "c", "d"],
["c", "d"]
) is True # matches occur later in message
assert sol.reportSpam(
["a", "a", "a"],
["a"]
) is True # multiple occurrences of same banned word
assert sol.reportSpam(
["a", "b"],
["a"]
) is False # exactly one banned occurrence
assert sol.reportSpam(
["x", "y"],
["x", "y"]
) is True # both words banned
Test Summary
| Test | Why |
|---|---|
| Example 1 | Validates basic positive case |
| Example 2 | Validates basic negative case |
| Repeated banned word twice | Confirms duplicate occurrences count separately |
| Single-element message | Smallest message size |
| No matches | Ensures false is returned correctly |
| First two words banned | Early termination scenario |
| Matches at end | Confirms full scan when needed |
| Multiple identical banned occurrences | Tests repeated counting |
| Exactly one banned occurrence | Verifies threshold behavior |
| Two banned words | Simplest positive case |
Edge Cases
Repeated Occurrences of the Same Banned Word
A common mistake is to count only distinct banned words. For example:
message = ["spam", "spam"]
bannedWords = ["spam"]
There is only one unique banned word, but there are two matching occurrences in the message. The problem asks for at least two words in the message that match banned words, so the correct answer is true. The implementation correctly increments the counter for every matching occurrence.
Message Length Equals One
Consider:
message = ["spam"]
bannedWords = ["spam"]
Even though the word is banned, there is only one matching message word. Since the threshold is two, the answer must be false. The implementation naturally handles this because the counter never reaches two.
Matches Occur Late in the Message
Consider:
message = ["a", "b", "c", "d"]
bannedWords = ["c", "d"]
The first two words are not banned, so a careless implementation that stops too early could fail. The algorithm continues scanning all message words until either two matches are found or the message ends. Therefore it correctly identifies the message as spam.
Large Inputs Near Constraint Limits
Both arrays can contain up to 100,000 elements. A nested-loop solution would be prohibitively slow. By using a hash set, each lookup remains efficient, and the overall complexity stays linear, allowing the solution to comfortably handle the maximum input sizes.