LeetCode 2284 - Sender With Largest Word Count

The problem asks us to analyze a chat log containing n messages and identify which sender has written the most words overall. Each message is represented as a string in the messages array, and the corresponding sender of that message is at the same index in the senders array.

LeetCode Problem 2284

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting

Solution

Problem Understanding

The problem asks us to analyze a chat log containing n messages and identify which sender has written the most words overall. Each message is represented as a string in the messages array, and the corresponding sender of that message is at the same index in the senders array. A word is defined as a sequence of letters separated by spaces, and we are guaranteed that messages do not have leading or trailing spaces, and words are separated by a single space.

The goal is to return the name of the sender who has the largest total word count across all their messages. If multiple senders have the same maximum word count, we must return the sender whose name is lexicographically largest, with uppercase letters considered smaller than lowercase letters.

The constraints indicate that n can be up to 10,000, each message is at most 100 characters, and sender names are at most 10 characters. This suggests that a solution using a hash map to track word counts per sender will be efficient enough.

Important edge cases include messages with the same word count among multiple senders, empty strings (not allowed here), and names differing only by case. The problem guarantees that each message has at least one word and every sender name is non-empty.

Approaches

The brute-force approach would be to iterate over every message and for each sender, count the words manually and keep a running tally. At the end, we would compare every sender's total word count to find the largest. While correct, this approach could become inefficient if the counting of words is repeated unnecessarily, but given the constraints, it is still feasible.

The optimal approach uses a hash map (dictionary in Python, map in Go) to store the cumulative word count for each sender. We iterate through the messages and senders arrays once, update the count for each sender, and finally determine the sender with the maximum word count. If there is a tie, we return the lexicographically largest name. This approach leverages the hash map for efficient lookups and updates, ensuring linear time complexity relative to the number of messages.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m) O(n) Count words for each message without storing cumulative counts efficiently
Optimal O(n * m) O(n) Use hash map to store cumulative word counts per sender, iterate once for final selection

Here, n is the number of messages and m is the maximum number of words in a message.

Algorithm Walkthrough

  1. Initialize an empty hash map to track total word counts for each sender.

  2. Iterate over all messages and corresponding senders:

  3. Split the message string into words by spaces.

  4. Count the number of words.

  5. Add the word count to the sender’s total in the hash map, creating an entry if necessary.

  6. Initialize two variables: max_words to track the highest word count found and top_sender to store the corresponding sender name.

  7. Iterate over all items in the hash map:

  8. If a sender’s total word count exceeds max_words, update max_words and top_sender.

  9. If a sender’s total word count equals max_words, compare the sender names lexicographically and update top_sender if the new sender is larger.

  10. Return top_sender as the sender with the largest word count.

Why it works: The hash map maintains a running total of words per sender. By iterating through it at the end, we ensure that the largest total is found efficiently, and using lexicographical comparison handles ties correctly.

Python Solution

from typing import List

class Solution:
    def largestWordCount(self, messages: List[str], senders: List[str]) -> str:
        word_counts = {}
        
        for msg, sender in zip(messages, senders):
            count = len(msg.split())
            if sender in word_counts:
                word_counts[sender] += count
            else:
                word_counts[sender] = count
        
        max_words = -1
        top_sender = ""
        
        for sender, count in word_counts.items():
            if count > max_words or (count == max_words and sender > top_sender):
                max_words = count
                top_sender = sender
        
        return top_sender

The code uses a dictionary word_counts to maintain the cumulative word count for each sender. It iterates through messages and senders in tandem using zip, counts words using split(), and updates the dictionary. Finally, it iterates through the dictionary to find the sender with the maximum word count, handling ties by lexicographical comparison.

Go Solution

import "strings"

func largestWordCount(messages []string, senders []string) string {
    wordCounts := make(map[string]int)
    
    for i, msg := range messages {
        count := len(strings.Fields(msg))
        wordCounts[senders[i]] += count
    }
    
    maxWords := -1
    topSender := ""
    
    for sender, count := range wordCounts {
        if count > maxWords || (count == maxWords && sender > topSender) {
            maxWords = count
            topSender = sender
        }
    }
    
    return topSender
}

In Go, we use strings.Fields to split the message into words. The map wordCounts stores cumulative word counts. The rest of the algorithm mirrors the Python version, including tie-breaking using lexicographical comparison.

Worked Examples

Example 1:

messages = ["Hello userTwooo","Hi userThree","Wonderful day Alice","Nice day userThree"]
senders = ["Alice","userTwo","userThree","Alice"]
Iteration Sender Message Word Count word_counts
0 Alice "Hello userTwooo" 2 {"Alice": 2}
1 userTwo "Hi userThree" 2 {"Alice": 2, "userTwo": 2}
2 userThree "Wonderful day Alice" 3 {"Alice": 2, "userTwo": 2, "userThree": 3}
3 Alice "Nice day userThree" 3 {"Alice": 5, "userTwo": 2, "userThree": 3}

Maximum word count is 5, so the result is "Alice".

Example 2:

messages = ["How is leetcode for everyone","Leetcode is useful for practice"]
senders = ["Bob","Charlie"]
Sender Total Words
Bob 5
Charlie 5

Tie occurs, lexicographically larger is "Charlie".

Complexity Analysis

Measure Complexity Explanation
Time O(n * m) We iterate over all n messages and split each message into m words.
Space O(n) Hash map stores counts for each sender.

The linear time complexity is acceptable since n <= 10^4 and m <= 100. The space complexity is bounded by the number of unique senders.

Test Cases

# Provided examples
assert Solution().largestWordCount(
    ["Hello userTwooo","Hi userThree","Wonderful day Alice","Nice day userThree"],
    ["Alice","userTwo","userThree","Alice"]
) == "Alice"  # Alice has 5 words

assert Solution().largestWordCount(
    ["How is leetcode for everyone","Leetcode is useful for practice"],
    ["Bob","Charlie"]
) == "Charlie"  # Tie, Charlie is lexicographically larger

# Edge cases
assert Solution().largestWordCount(
    ["a"], ["a"]
) == "a"  # Single message, single sender

assert Solution().largestWordCount(
    ["a b","c d"], ["alice","Alice"]
) == "alice"  # Case-sensitive lexicographical comparison

assert Solution().largestWordCount(
    ["word"], ["Z"]
) == "Z"  # Single message, uppercase name
Test Why
Single sender with multiple messages Validates cumulative counting
Tie with lexicographical comparison Checks correct tie-breaking logic
Case-sensitive sender names Ensures lexicographical comparison respects case
Single message Minimal input test
Uppercase vs lowercase names Ensures correct ordering according to ASCII values

Edge Cases

The first edge case is when multiple senders have the same maximum word count. In such scenarios, failing to handle the lexicographical comparison would return an incorrect sender. The implementation compares sender names directly when counts are equal.

The second edge case involves sender names with varying cases, such as "Alice" and "alice". Because uppercase letters have smaller ASCII values than lowercase letters, the implementation ensures the correct lexicographical comparison, respecting this ordering.

The third edge case is when messages contain only a single word. A naive implementation that assumes multiple words per message could fail, but using split() or strings.Fields handles single-word messages correctly and counts them as one word.