LeetCode 1348 - Tweet Counts Per Frequency

The problem is asking us to design a system that tracks tweets by their timestamp and allows querying the number of tweets in specific time intervals, split according to a frequency (minute, hour, or day).

LeetCode Problem 1348

Difficulty: 🟡 Medium
Topics: Hash Table, String, Binary Search, Design, Sorting, Ordered Set

Solution

Problem Understanding

The problem is asking us to design a system that tracks tweets by their timestamp and allows querying the number of tweets in specific time intervals, split according to a frequency (minute, hour, or day). Specifically, we need to implement a TweetCounts class with two operations: recording a tweet and retrieving tweet counts per frequency.

The input consists of tweetName strings and timestamps in seconds. When querying, we are given a frequency, a tweet name, and a start and end time. The output should be a list of integers, where each integer represents the number of tweets in one "time chunk" of the specified frequency. Time chunks are continuous, non-overlapping intervals that partition the given [startTime, endTime]. The last chunk may be shorter if the total period does not divide evenly by the chunk size.

Constraints tell us the timestamps can be very large (0 <= time <= 10^9) but the queried period difference is relatively small (0 <= endTime - startTime <= 10^4). There will be at most 10^4 calls to the API, which implies efficiency matters, but we do not need to handle extremely frequent operations in the millions.

Edge cases include querying a period with no tweets, recording multiple tweets at the same timestamp, and the last chunk being shorter than the standard frequency.

Approaches

Brute Force Approach: A naive approach would store all tweets in a simple list and, when queried, iterate through the list to count tweets that fall into each time chunk. This is correct but inefficient. For n recorded tweets and k chunks, this approach has O(n * k) time complexity per query, which could be too slow if many tweets exist.

Optimal Approach: A more efficient approach leverages a hash map to store the timestamps for each tweet name, maintaining the timestamps in sorted order. When querying, we can use binary search to quickly locate the range of timestamps within [startTime, endTime]. Then, we iterate through only relevant timestamps and count them per time chunk. This avoids iterating over all tweets and ensures that only timestamps in the relevant period are processed.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * k) per query O(n) Iterates over all tweets for each query
Optimal O(log n + m) per query O(n) Use hashmap + sorted list + binary search; m is number of tweets in query range

Algorithm Walkthrough

  1. Initialize a TweetCounts object with a dictionary mapping tweet names to sorted lists of timestamps.
  2. When recordTweet is called, append the timestamp to the list for that tweet name. Insert in sorted order or sort lazily when querying.
  3. In getTweetCountsPerFrequency, determine the chunkSize based on the freq parameter (minute = 60, hour = 3600, day = 86400).
  4. Use binary search to find the leftmost and rightmost indices in the timestamp list that fall within [startTime, endTime].
  5. Initialize an array of zero counts with a length equal to the number of chunks: (endTime - startTime) // chunkSize + 1.
  6. Iterate over the relevant timestamps and compute the corresponding chunk index as (timestamp - startTime) // chunkSize. Increment the count for that chunk.
  7. Return the array of counts.

Why it works: Each timestamp is mapped to exactly one chunk using integer division. Binary search ensures we only iterate over timestamps in the queried range, maintaining efficiency. Sorted lists guarantee chunk calculations are correct.

Python Solution

from typing import List
from collections import defaultdict
import bisect

class TweetCounts:

    def __init__(self):
        self.tweets = defaultdict(list)

    def recordTweet(self, tweetName: str, time: int) -> None:
        bisect.insort(self.tweets[tweetName], time)

    def getTweetCountsPerFrequency(self, freq: str, tweetName: str, startTime: int, endTime: int) -> List[int]:
        if freq == "minute":
            chunkSize = 60
        elif freq == "hour":
            chunkSize = 3600
        elif freq == "day":
            chunkSize = 86400
        else:
            raise ValueError("Invalid frequency")

        timestamps = self.tweets.get(tweetName, [])
        left = bisect.bisect_left(timestamps, startTime)
        right = bisect.bisect_right(timestamps, endTime)
        relevant_times = timestamps[left:right]

        num_chunks = (endTime - startTime) // chunkSize + 1
        counts = [0] * num_chunks

        for time in relevant_times:
            index = (time - startTime) // chunkSize
            counts[index] += 1

        return counts

The Python implementation uses a defaultdict of lists to store tweet timestamps. bisect.insort maintains the list in sorted order, and binary search via bisect_left and bisect_right efficiently narrows down relevant timestamps. Counting per chunk is straightforward using integer division.

Go Solution

package main

import (
	"sort"
)

type TweetCounts struct {
	tweets map[string][]int
}

func Constructor() TweetCounts {
	return TweetCounts{
		tweets: make(map[string][]int),
	}
}

func (this *TweetCounts) RecordTweet(tweetName string, time int) {
	arr := this.tweets[tweetName]
	i := sort.SearchInts(arr, time)
	arr = append(arr, 0)
	copy(arr[i+1:], arr[i:])
	arr[i] = time
	this.tweets[tweetName] = arr
}

func (this *TweetCounts) GetTweetCountsPerFrequency(freq string, tweetName string, startTime int, endTime int) []int {
	var chunkSize int
	switch freq {
	case "minute":
		chunkSize = 60
	case "hour":
		chunkSize = 3600
	case "day":
		chunkSize = 86400
	default:
		panic("invalid frequency")
	}

	timestamps, ok := this.tweets[tweetName]
	if !ok {
		return []int{}
	}

	left := sort.Search(len(timestamps), func(i int) bool { return timestamps[i] >= startTime })
	right := sort.Search(len(timestamps), func(i int) bool { return timestamps[i] > endTime })

	relevant := timestamps[left:right]
	numChunks := (endTime - startTime)/chunkSize + 1
	counts := make([]int, numChunks)

	for _, t := range relevant {
		index := (t - startTime) / chunkSize
		counts[index]++
	}

	return counts
}

In Go, we manually maintain a sorted slice for each tweet. sort.SearchInts is used to insert timestamps and for binary searches. Slice manipulations ensure sorted order is preserved.

Worked Examples

Example 1

Operations: recordTweet("tweet3", 0), recordTweet("tweet3", 60), recordTweet("tweet3", 10), getTweetCountsPerFrequency("minute","tweet3",0,59)

  • After recording: tweet3 timestamps: [0,10,60]

  • Query minute chunks [0,59]:

  • chunk 0: [0,59] → timestamps 0,10 → count 2

  • Output: [2]

Example 2

Query: getTweetCountsPerFrequency("minute","tweet3",0,60)

  • Chunks: [0,59] and [60,60]

  • Relevant timestamps: [0,10,60]

  • Chunk mapping:

  • 0 → 0//60 = 0

  • 10 → 10//60 = 0

  • 60 → 60//60 = 1

  • Counts: [2,1]

Example 3

Query: getTweetCountsPerFrequency("hour","tweet3",0,210) after recordTweet("tweet3",120)

  • Timestamps: [0,10,60,120]
  • Chunk: [0,210] (single chunk)
  • Count: 4

Complexity Analysis

Measure Complexity Explanation
Time (recordTweet) O(n) Insertion in sorted list costs O(n) using bisect/shift
Time (getTweetCountsPerFrequency) O(log n + m) Binary search O(log n) + iterate m timestamps in range
Space O(n) Store all timestamps for all tweets

The efficiency gain comes from restricting iteration only to relevant timestamps per query, and binary search reduces unnecessary scanning.

Test Cases

# Provided examples
obj = TweetCounts()
obj.recordTweet("tweet3", 0)
obj.recordTweet("tweet3", 60)
obj.recordTweet("tweet3", 10)
assert obj.getTweetCountsPerFrequency("minute", "tweet3", 0, 59) == [2] # 0,10
assert obj.getTweetCountsPerFrequency("minute", "tweet3", 0, 60) == [2,1] # 0,10 | 60
obj.recordTweet("tweet3", 120)
assert obj.getTweetCountsPerFrequency("hour", "tweet3", 0, 210) == [4] # 0,10,