LeetCode 2284 - Sender With Largest Word Count
The problem asks us to analyze a chat log containing n messages and identify which sender has written the most words overall. Each message is represented as a string in the messages array, and the corresponding sender of that message is at the same index in the senders array.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting
Solution
Problem Understanding
The problem asks us to analyze a chat log containing n messages and identify which sender has written the most words overall. Each message is represented as a string in the messages array, and the corresponding sender of that message is at the same index in the senders array. A word is defined as a sequence of letters separated by spaces, and we are guaranteed that messages do not have leading or trailing spaces, and words are separated by a single space.
The goal is to return the name of the sender who has the largest total word count across all their messages. If multiple senders have the same maximum word count, we must return the sender whose name is lexicographically largest, with uppercase letters considered smaller than lowercase letters.
The constraints indicate that n can be up to 10,000, each message is at most 100 characters, and sender names are at most 10 characters. This suggests that a solution using a hash map to track word counts per sender will be efficient enough.
Important edge cases include messages with the same word count among multiple senders, empty strings (not allowed here), and names differing only by case. The problem guarantees that each message has at least one word and every sender name is non-empty.
Approaches
The brute-force approach would be to iterate over every message and for each sender, count the words manually and keep a running tally. At the end, we would compare every sender's total word count to find the largest. While correct, this approach could become inefficient if the counting of words is repeated unnecessarily, but given the constraints, it is still feasible.
The optimal approach uses a hash map (dictionary in Python, map in Go) to store the cumulative word count for each sender. We iterate through the messages and senders arrays once, update the count for each sender, and finally determine the sender with the maximum word count. If there is a tie, we return the lexicographically largest name. This approach leverages the hash map for efficient lookups and updates, ensuring linear time complexity relative to the number of messages.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * m) | O(n) | Count words for each message without storing cumulative counts efficiently |
| Optimal | O(n * m) | O(n) | Use hash map to store cumulative word counts per sender, iterate once for final selection |
Here, n is the number of messages and m is the maximum number of words in a message.
Algorithm Walkthrough
-
Initialize an empty hash map to track total word counts for each sender.
-
Iterate over all messages and corresponding senders:
-
Split the message string into words by spaces.
-
Count the number of words.
-
Add the word count to the sender’s total in the hash map, creating an entry if necessary.
-
Initialize two variables:
max_wordsto track the highest word count found andtop_senderto store the corresponding sender name. -
Iterate over all items in the hash map:
-
If a sender’s total word count exceeds
max_words, updatemax_wordsandtop_sender. -
If a sender’s total word count equals
max_words, compare the sender names lexicographically and updatetop_senderif the new sender is larger. -
Return
top_senderas the sender with the largest word count.
Why it works: The hash map maintains a running total of words per sender. By iterating through it at the end, we ensure that the largest total is found efficiently, and using lexicographical comparison handles ties correctly.
Python Solution
from typing import List
class Solution:
def largestWordCount(self, messages: List[str], senders: List[str]) -> str:
word_counts = {}
for msg, sender in zip(messages, senders):
count = len(msg.split())
if sender in word_counts:
word_counts[sender] += count
else:
word_counts[sender] = count
max_words = -1
top_sender = ""
for sender, count in word_counts.items():
if count > max_words or (count == max_words and sender > top_sender):
max_words = count
top_sender = sender
return top_sender
The code uses a dictionary word_counts to maintain the cumulative word count for each sender. It iterates through messages and senders in tandem using zip, counts words using split(), and updates the dictionary. Finally, it iterates through the dictionary to find the sender with the maximum word count, handling ties by lexicographical comparison.
Go Solution
import "strings"
func largestWordCount(messages []string, senders []string) string {
wordCounts := make(map[string]int)
for i, msg := range messages {
count := len(strings.Fields(msg))
wordCounts[senders[i]] += count
}
maxWords := -1
topSender := ""
for sender, count := range wordCounts {
if count > maxWords || (count == maxWords && sender > topSender) {
maxWords = count
topSender = sender
}
}
return topSender
}
In Go, we use strings.Fields to split the message into words. The map wordCounts stores cumulative word counts. The rest of the algorithm mirrors the Python version, including tie-breaking using lexicographical comparison.
Worked Examples
Example 1:
messages = ["Hello userTwooo","Hi userThree","Wonderful day Alice","Nice day userThree"]
senders = ["Alice","userTwo","userThree","Alice"]
| Iteration | Sender | Message | Word Count | word_counts |
|---|---|---|---|---|
| 0 | Alice | "Hello userTwooo" | 2 | {"Alice": 2} |
| 1 | userTwo | "Hi userThree" | 2 | {"Alice": 2, "userTwo": 2} |
| 2 | userThree | "Wonderful day Alice" | 3 | {"Alice": 2, "userTwo": 2, "userThree": 3} |
| 3 | Alice | "Nice day userThree" | 3 | {"Alice": 5, "userTwo": 2, "userThree": 3} |
Maximum word count is 5, so the result is "Alice".
Example 2:
messages = ["How is leetcode for everyone","Leetcode is useful for practice"]
senders = ["Bob","Charlie"]
| Sender | Total Words |
|---|---|
| Bob | 5 |
| Charlie | 5 |
Tie occurs, lexicographically larger is "Charlie".
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | We iterate over all n messages and split each message into m words. |
| Space | O(n) | Hash map stores counts for each sender. |
The linear time complexity is acceptable since n <= 10^4 and m <= 100. The space complexity is bounded by the number of unique senders.
Test Cases
# Provided examples
assert Solution().largestWordCount(
["Hello userTwooo","Hi userThree","Wonderful day Alice","Nice day userThree"],
["Alice","userTwo","userThree","Alice"]
) == "Alice" # Alice has 5 words
assert Solution().largestWordCount(
["How is leetcode for everyone","Leetcode is useful for practice"],
["Bob","Charlie"]
) == "Charlie" # Tie, Charlie is lexicographically larger
# Edge cases
assert Solution().largestWordCount(
["a"], ["a"]
) == "a" # Single message, single sender
assert Solution().largestWordCount(
["a b","c d"], ["alice","Alice"]
) == "alice" # Case-sensitive lexicographical comparison
assert Solution().largestWordCount(
["word"], ["Z"]
) == "Z" # Single message, uppercase name
| Test | Why |
|---|---|
| Single sender with multiple messages | Validates cumulative counting |
| Tie with lexicographical comparison | Checks correct tie-breaking logic |
| Case-sensitive sender names | Ensures lexicographical comparison respects case |
| Single message | Minimal input test |
| Uppercase vs lowercase names | Ensures correct ordering according to ASCII values |
Edge Cases
The first edge case is when multiple senders have the same maximum word count. In such scenarios, failing to handle the lexicographical comparison would return an incorrect sender. The implementation compares sender names directly when counts are equal.
The second edge case involves sender names with varying cases, such as "Alice" and "alice". Because uppercase letters have smaller ASCII values than lowercase letters, the implementation ensures the correct lexicographical comparison, respecting this ordering.
The third edge case is when messages contain only a single word. A naive implementation that assumes multiple words per message could fail, but using split() or strings.Fields handles single-word messages correctly and counts them as one word.