LeetCode 1348 - Tweet Counts Per Frequency
The problem is asking us to design a system that tracks tweets by their timestamp and allows querying the number of tweets in specific time intervals, split according to a frequency (minute, hour, or day).
Difficulty: 🟡 Medium
Topics: Hash Table, String, Binary Search, Design, Sorting, Ordered Set
Solution
Problem Understanding
The problem is asking us to design a system that tracks tweets by their timestamp and allows querying the number of tweets in specific time intervals, split according to a frequency (minute, hour, or day). Specifically, we need to implement a TweetCounts class with two operations: recording a tweet and retrieving tweet counts per frequency.
The input consists of tweetName strings and timestamps in seconds. When querying, we are given a frequency, a tweet name, and a start and end time. The output should be a list of integers, where each integer represents the number of tweets in one "time chunk" of the specified frequency. Time chunks are continuous, non-overlapping intervals that partition the given [startTime, endTime]. The last chunk may be shorter if the total period does not divide evenly by the chunk size.
Constraints tell us the timestamps can be very large (0 <= time <= 10^9) but the queried period difference is relatively small (0 <= endTime - startTime <= 10^4). There will be at most 10^4 calls to the API, which implies efficiency matters, but we do not need to handle extremely frequent operations in the millions.
Edge cases include querying a period with no tweets, recording multiple tweets at the same timestamp, and the last chunk being shorter than the standard frequency.
Approaches
Brute Force Approach: A naive approach would store all tweets in a simple list and, when queried, iterate through the list to count tweets that fall into each time chunk. This is correct but inefficient. For n recorded tweets and k chunks, this approach has O(n * k) time complexity per query, which could be too slow if many tweets exist.
Optimal Approach: A more efficient approach leverages a hash map to store the timestamps for each tweet name, maintaining the timestamps in sorted order. When querying, we can use binary search to quickly locate the range of timestamps within [startTime, endTime]. Then, we iterate through only relevant timestamps and count them per time chunk. This avoids iterating over all tweets and ensures that only timestamps in the relevant period are processed.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * k) per query | O(n) | Iterates over all tweets for each query |
| Optimal | O(log n + m) per query | O(n) | Use hashmap + sorted list + binary search; m is number of tweets in query range |
Algorithm Walkthrough
- Initialize a
TweetCountsobject with a dictionary mapping tweet names to sorted lists of timestamps. - When
recordTweetis called, append the timestamp to the list for that tweet name. Insert in sorted order or sort lazily when querying. - In
getTweetCountsPerFrequency, determine thechunkSizebased on thefreqparameter (minute = 60,hour = 3600,day = 86400). - Use binary search to find the leftmost and rightmost indices in the timestamp list that fall within
[startTime, endTime]. - Initialize an array of zero counts with a length equal to the number of chunks:
(endTime - startTime) // chunkSize + 1. - Iterate over the relevant timestamps and compute the corresponding chunk index as
(timestamp - startTime) // chunkSize. Increment the count for that chunk. - Return the array of counts.
Why it works: Each timestamp is mapped to exactly one chunk using integer division. Binary search ensures we only iterate over timestamps in the queried range, maintaining efficiency. Sorted lists guarantee chunk calculations are correct.
Python Solution
from typing import List
from collections import defaultdict
import bisect
class TweetCounts:
def __init__(self):
self.tweets = defaultdict(list)
def recordTweet(self, tweetName: str, time: int) -> None:
bisect.insort(self.tweets[tweetName], time)
def getTweetCountsPerFrequency(self, freq: str, tweetName: str, startTime: int, endTime: int) -> List[int]:
if freq == "minute":
chunkSize = 60
elif freq == "hour":
chunkSize = 3600
elif freq == "day":
chunkSize = 86400
else:
raise ValueError("Invalid frequency")
timestamps = self.tweets.get(tweetName, [])
left = bisect.bisect_left(timestamps, startTime)
right = bisect.bisect_right(timestamps, endTime)
relevant_times = timestamps[left:right]
num_chunks = (endTime - startTime) // chunkSize + 1
counts = [0] * num_chunks
for time in relevant_times:
index = (time - startTime) // chunkSize
counts[index] += 1
return counts
The Python implementation uses a defaultdict of lists to store tweet timestamps. bisect.insort maintains the list in sorted order, and binary search via bisect_left and bisect_right efficiently narrows down relevant timestamps. Counting per chunk is straightforward using integer division.
Go Solution
package main
import (
"sort"
)
type TweetCounts struct {
tweets map[string][]int
}
func Constructor() TweetCounts {
return TweetCounts{
tweets: make(map[string][]int),
}
}
func (this *TweetCounts) RecordTweet(tweetName string, time int) {
arr := this.tweets[tweetName]
i := sort.SearchInts(arr, time)
arr = append(arr, 0)
copy(arr[i+1:], arr[i:])
arr[i] = time
this.tweets[tweetName] = arr
}
func (this *TweetCounts) GetTweetCountsPerFrequency(freq string, tweetName string, startTime int, endTime int) []int {
var chunkSize int
switch freq {
case "minute":
chunkSize = 60
case "hour":
chunkSize = 3600
case "day":
chunkSize = 86400
default:
panic("invalid frequency")
}
timestamps, ok := this.tweets[tweetName]
if !ok {
return []int{}
}
left := sort.Search(len(timestamps), func(i int) bool { return timestamps[i] >= startTime })
right := sort.Search(len(timestamps), func(i int) bool { return timestamps[i] > endTime })
relevant := timestamps[left:right]
numChunks := (endTime - startTime)/chunkSize + 1
counts := make([]int, numChunks)
for _, t := range relevant {
index := (t - startTime) / chunkSize
counts[index]++
}
return counts
}
In Go, we manually maintain a sorted slice for each tweet. sort.SearchInts is used to insert timestamps and for binary searches. Slice manipulations ensure sorted order is preserved.
Worked Examples
Example 1
Operations: recordTweet("tweet3", 0), recordTweet("tweet3", 60), recordTweet("tweet3", 10), getTweetCountsPerFrequency("minute","tweet3",0,59)
-
After recording:
tweet3timestamps:[0,10,60] -
Query minute chunks
[0,59]: -
chunk 0:
[0,59]→ timestamps0,10→ count 2 -
Output:
[2]
Example 2
Query: getTweetCountsPerFrequency("minute","tweet3",0,60)
-
Chunks:
[0,59]and[60,60] -
Relevant timestamps:
[0,10,60] -
Chunk mapping:
-
0 → 0//60 = 0
-
10 → 10//60 = 0
-
60 → 60//60 = 1
-
Counts:
[2,1]
Example 3
Query: getTweetCountsPerFrequency("hour","tweet3",0,210) after recordTweet("tweet3",120)
- Timestamps:
[0,10,60,120] - Chunk:
[0,210](single chunk) - Count: 4
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time (recordTweet) | O(n) | Insertion in sorted list costs O(n) using bisect/shift |
| Time (getTweetCountsPerFrequency) | O(log n + m) | Binary search O(log n) + iterate m timestamps in range |
| Space | O(n) | Store all timestamps for all tweets |
The efficiency gain comes from restricting iteration only to relevant timestamps per query, and binary search reduces unnecessary scanning.
Test Cases
# Provided examples
obj = TweetCounts()
obj.recordTweet("tweet3", 0)
obj.recordTweet("tweet3", 60)
obj.recordTweet("tweet3", 10)
assert obj.getTweetCountsPerFrequency("minute", "tweet3", 0, 59) == [2] # 0,10
assert obj.getTweetCountsPerFrequency("minute", "tweet3", 0, 60) == [2,1] # 0,10 | 60
obj.recordTweet("tweet3", 120)
assert obj.getTweetCountsPerFrequency("hour", "tweet3", 0, 210) == [4] # 0,10,