LeetCode 3295 - Report Spam Message

This problem asks us to determine whether a given message should be classified as spam. We are given two arrays of strings: - message, which contains the words appearing in the message. - bannedWords, which contains words that are considered banned.

LeetCode Problem 3295

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String

Solution

LeetCode 3295 - Report Spam Message

Problem Understanding

This problem asks us to determine whether a given message should be classified as spam.

We are given two arrays of strings:

  • message, which contains the words appearing in the message.
  • bannedWords, which contains words that are considered banned.

A message is considered spam if at least two words in message exactly match words contained in bannedWords.

The important detail is that we are counting occurrences within the message array. Every word in message that appears in bannedWords contributes to the count. As soon as the count reaches two, the message is spam and we should return true.

If fewer than two words from the message appear in the banned list, we return false.

The constraints are quite large:

  • message.length can be up to 10^5
  • bannedWords.length can be up to 10^5
  • Each word length is at most 15

These constraints immediately suggest that repeatedly scanning one array inside another would be too slow. We need a solution that can efficiently check whether a word belongs to the banned list.

Several edge cases are worth noting:

  • The message may contain exactly two banned words, which should return true.
  • The message may contain only one banned word, which should return false.
  • The same banned word may appear multiple times in the message. Each occurrence counts toward the total.
  • The banned list may contain many words that never appear in the message.
  • The message may have length 1, making it impossible to reach the required count of two.

Because matching is described as an exact match, partial matches do not count.

Approaches

Brute Force

A straightforward approach is to examine every word in message and check whether it exists in bannedWords by scanning the entire banned list.

For each message word, we compare it against every banned word until we either find a match or exhaust the list.

This approach is correct because it explicitly checks every possible membership relationship. However, it is inefficient. If both arrays contain 10^5 elements, the algorithm may perform up to 10^10 comparisons, which is far too slow.

Optimal Approach, Hash Set

The key observation is that the only operation we need is membership testing:

Is this message word contained in the banned word list?

A hash set provides average-case O(1) lookup time.

We can first insert all banned words into a hash set. Then we iterate through the message words and check membership in constant time. We maintain a counter of matched words. Once the counter reaches two, we immediately return true.

This avoids unnecessary work and scales efficiently to the largest allowed inputs.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n × m) O(1) Check every message word against every banned word
Optimal O(n + m) O(m) Store banned words in a hash set for constant-time lookup

Where:

  • n = len(message)
  • m = len(bannedWords)

Algorithm Walkthrough

  1. Create a hash set containing all words from bannedWords.
  2. Initialize a variable matches = 0.
  3. Iterate through every word in message.
  4. For each word, check whether it exists in the hash set.
  5. If it exists, increment matches.
  6. After incrementing, check whether matches >= 2.
  7. If the count has reached two, immediately return true because the message satisfies the spam condition.
  8. If the loop finishes without reaching two matches, return false.

Why it works

The hash set contains exactly the banned words. Therefore, a lookup correctly determines whether a message word is banned. The algorithm counts every banned-word occurrence in the message. A message is spam if and only if at least two such occurrences exist. Since the algorithm returns true precisely when the count reaches two and false otherwise, it always produces the correct result.

Python Solution

from typing import List

class Solution:
    def reportSpam(self, message: List[str], bannedWords: List[str]) -> bool:
        banned_set = set(bannedWords)
        matches = 0

        for word in message:
            if word in banned_set:
                matches += 1
                if matches >= 2:
                    return True

        return False

The implementation begins by converting bannedWords into a set. This is the crucial optimization because set membership checks are average-case O(1).

The variable matches keeps track of how many message words have been found in the banned set.

We then iterate through the message array. Whenever a word appears in the banned set, we increment the counter. As soon as the counter reaches two, we can immediately return True because the spam condition has already been satisfied.

If the entire message is processed and fewer than two matches are found, the function returns False.

Go Solution

func reportSpam(message []string, bannedWords []string) bool {
    bannedSet := make(map[string]struct{})

    for _, word := range bannedWords {
        bannedSet[word] = struct{}{}
    }

    matches := 0

    for _, word := range message {
        if _, exists := bannedSet[word]; exists {
            matches++
            if matches >= 2 {
                return true
            }
        }
    }

    return false
}

The Go solution uses a map[string]struct{} to represent a hash set. The empty struct occupies zero bytes, making it a common and memory-efficient choice for set implementations in Go.

Other than the data structure syntax, the logic is identical to the Python version. No special handling for integer overflow is required because the count can never exceed 10^5.

Worked Examples

Example 1

Input

message = ["hello", "world", "leetcode"]
bannedWords = ["world", "hello"]

First, build the set:

banned_set = {"hello", "world"}
Current Word In Set? Matches
hello Yes 1
world Yes 2

When matches becomes 2, we immediately return true.

Output

true

Example 2

Input

message = ["hello", "programming", "fun"]
bannedWords = ["world", "programming", "leetcode"]

Build the set:

banned_set = {"world", "programming", "leetcode"}
Current Word In Set? Matches
hello No 0
programming Yes 1
fun No 1

The loop finishes with only one match.

Output

false

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Building the set takes O(m), scanning the message takes O(n)
Space O(m) The hash set stores all banned words

The dominant operations are inserting all banned words into the set and performing one lookup per message word. Since hash set operations are average-case constant time, the total running time is linear in the combined input size.

Test Cases

sol = Solution()

assert sol.reportSpam(
    ["hello", "world", "leetcode"],
    ["world", "hello"]
) is True  # provided example, exactly two matches

assert sol.reportSpam(
    ["hello", "programming", "fun"],
    ["world", "programming", "leetcode"]
) is False  # provided example, only one match

assert sol.reportSpam(
    ["spam", "spam"],
    ["spam"]
) is True  # repeated banned word counts twice

assert sol.reportSpam(
    ["spam"],
    ["spam"]
) is False  # only one occurrence

assert sol.reportSpam(
    ["a", "b", "c"],
    ["x", "y", "z"]
) is False  # no matches

assert sol.reportSpam(
    ["a", "b", "c"],
    ["a", "b", "x"]
) is True  # first two words match

assert sol.reportSpam(
    ["a", "b", "c", "d"],
    ["c", "d"]
) is True  # matches occur later in message

assert sol.reportSpam(
    ["a", "a", "a"],
    ["a"]
) is True  # multiple occurrences of same banned word

assert sol.reportSpam(
    ["a", "b"],
    ["a"]
) is False  # exactly one banned occurrence

assert sol.reportSpam(
    ["x", "y"],
    ["x", "y"]
) is True  # both words banned

Test Summary

Test Why
Example 1 Validates basic positive case
Example 2 Validates basic negative case
Repeated banned word twice Confirms duplicate occurrences count separately
Single-element message Smallest message size
No matches Ensures false is returned correctly
First two words banned Early termination scenario
Matches at end Confirms full scan when needed
Multiple identical banned occurrences Tests repeated counting
Exactly one banned occurrence Verifies threshold behavior
Two banned words Simplest positive case

Edge Cases

Repeated Occurrences of the Same Banned Word

A common mistake is to count only distinct banned words. For example:

message = ["spam", "spam"]
bannedWords = ["spam"]

There is only one unique banned word, but there are two matching occurrences in the message. The problem asks for at least two words in the message that match banned words, so the correct answer is true. The implementation correctly increments the counter for every matching occurrence.

Message Length Equals One

Consider:

message = ["spam"]
bannedWords = ["spam"]

Even though the word is banned, there is only one matching message word. Since the threshold is two, the answer must be false. The implementation naturally handles this because the counter never reaches two.

Matches Occur Late in the Message

Consider:

message = ["a", "b", "c", "d"]
bannedWords = ["c", "d"]

The first two words are not banned, so a careless implementation that stops too early could fail. The algorithm continues scanning all message words until either two matches are found or the message ends. Therefore it correctly identifies the message as spam.

Large Inputs Near Constraint Limits

Both arrays can contain up to 100,000 elements. A nested-loop solution would be prohibitively slow. By using a hash set, each lookup remains efficient, and the overall complexity stays linear, allowing the solution to comfortably handle the maximum input sizes.