LeetCode 3089 - Find Bursty Behavior
The problem asks us to identify users who demonstrate "bursty behavior" in their posting patterns during February 2024.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
The problem asks us to identify users who demonstrate "bursty behavior" in their posting patterns during February 2024. Specifically, a user exhibits bursty behavior if, within any consecutive 7-day period, the number of posts they make is at least twice their average weekly posting frequency in February 2024.
The input is a Posts table with columns post_id, user_id, and post_date. Each row represents a single post made by a user on a specific date. February 2024 is treated as exactly 28 days, divided into 4 weeks. The output is a table of user_id, max_7day_posts (the maximum number of posts within any 7-day window), and avg_weekly_posts (the average number of posts per week across February). Only users who satisfy the bursty behavior condition should be included, and the results are sorted by user_id.
Important constraints include the fixed length of February (28 days), which simplifies the calculation of average weekly posts. Edge cases to watch for include users with zero posts, users with only one post, or users who post consistently but never exceed twice their weekly average within any 7-day window.
Approaches
The brute-force approach would be to examine every possible 7-day window in February for each user, counting posts in each window and comparing it to twice the user's average weekly posts. While this guarantees correctness, it is inefficient because there are 22 possible 7-day windows per user (28 - 7 + 1) and potentially thousands of users, leading to excessive computations.
The optimal approach leverages windowed aggregation. First, compute the total number of posts per user in February and derive the average weekly posts. Next, generate all possible 7-day periods for February and count the posts per user in each period. Using these aggregates, we can efficiently compute the maximum 7-day post count per user and filter only those meeting the bursty behavior criterion. This approach avoids scanning each 7-day window separately for each post by using a combination of date sequences, joins, and group aggregations.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(U * W * P) | O(U * P) | Examine every 7-day window for each user and count posts. U = number of users, W = number of 7-day windows, P = number of posts. Inefficient for large datasets. |
| Optimal | O(U * P + U * W) | O(U * W) | Precompute weekly averages and sliding 7-day windows using SQL aggregates. Efficient for realistic dataset sizes. |
Algorithm Walkthrough
- Filter posts to include only dates in February 2024. This ensures that we are only considering posts relevant to the analysis.
- Compute each user’s total post count in February. Divide by 4 to calculate
avg_weekly_postssince February is treated as exactly 4 weeks. - Generate a list of all possible 7-day windows in February 2024. Each window starts on February 1 through February 22.
- For each user, count the number of posts in each 7-day window. This requires joining the user posts with the generated windows and counting posts that fall within the window.
- Compute the maximum post count (
max_7day_posts) for each user across all 7-day windows. - Filter users where
max_7day_postsis at least twice theiravg_weekly_posts. - Return the
user_id,max_7day_posts, andavg_weekly_posts, ordering byuser_id.
Why it works: By generating all 7-day windows and computing the maximum post count per user, we capture every possible period in which bursty behavior could occur. Comparing this maximum to twice the weekly average ensures the correct detection of bursty behavior.
Python Solution
import pandas as pd
from datetime import timedelta
def find_bursty_behavior(posts: pd.DataFrame) -> pd.DataFrame:
# Filter for February 2024
posts['post_date'] = pd.to_datetime(posts['post_date'])
feb_posts = posts[(posts['post_date'] >= '2024-02-01') & (posts['post_date'] <= '2024-02-28')]
# Compute total posts per user and avg weekly posts
user_totals = feb_posts.groupby('user_id').size().reset_index(name='total_posts')
user_totals['avg_weekly_posts'] = user_totals['total_posts'] / 4
# Generate 7-day windows
windows = pd.DataFrame({'start_date': pd.date_range('2024-02-01', '2024-02-22')})
results = []
for user_id, group in feb_posts.groupby('user_id'):
max_7day = 0
post_dates = group['post_date'].tolist()
for start in windows['start_date']:
end = start + timedelta(days=6)
count = sum(1 for d in post_dates if start <= d <= end)
max_7day = max(max_7day, count)
avg_weekly = user_totals.loc[user_totals['user_id'] == user_id, 'avg_weekly_posts'].values[0]
if max_7day >= 2 * avg_weekly:
results.append({'user_id': user_id, 'max_7day_posts': max_7day, 'avg_weekly_posts': round(avg_weekly, 4)})
return pd.DataFrame(results).sort_values('user_id').reset_index(drop=True)
The Python implementation first filters the posts to February 2024 and computes each user's average weekly posts. It then iterates through all possible 7-day windows, counts posts in each window, and records the maximum. Users who meet the bursty condition are included in the output.
Go Solution
package main
import (
"database/sql"
"fmt"
_ "github.com/go-sql-driver/mysql"
"time"
)
type UserBurst struct {
UserID int
Max7DayPosts int
AvgWeeklyPosts float64
}
func FindBurstyBehavior(db *sql.DB) ([]UserBurst, error) {
query := `
WITH feb_posts AS (
SELECT user_id, post_date
FROM Posts
WHERE post_date BETWEEN '2024-02-01' AND '2024-02-28'
),
user_totals AS (
SELECT user_id, COUNT(*) AS total_posts, COUNT(*) / 4.0 AS avg_weekly_posts
FROM feb_posts
GROUP BY user_id
),
windows AS (
SELECT DATE('2024-02-01') + INTERVAL n DAY AS start_date
FROM (SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11
UNION ALL SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL SELECT 15
UNION ALL SELECT 16 UNION ALL SELECT 17 UNION ALL SELECT 18 UNION ALL SELECT 19
UNION ALL SELECT 20 UNION ALL SELECT 21 UNION ALL SELECT 22) AS nums
),
user_window_counts AS (
SELECT u.user_id, w.start_date, COUNT(f.post_date) AS cnt
FROM user_totals u
CROSS JOIN windows w
LEFT JOIN feb_posts f
ON f.user_id = u.user_id AND f.post_date BETWEEN w.start_date AND w.start_date + INTERVAL 6 DAY
GROUP BY u.user_id, w.start_date
),
user_max AS (
SELECT user_id, MAX(cnt) AS max_7day_posts
FROM user_window_counts
GROUP BY user_id
)
SELECT u.user_id, m.max_7day_posts, u.avg_weekly_posts
FROM user_totals u
JOIN user_max m ON u.user_id = m.user_id
WHERE m.max_7day_posts >= 2 * u.avg_weekly_posts
ORDER BY u.user_id;
`
rows, err := db.Query(query)
if err != nil {
return nil, err
}
defer rows.Close()
var results []UserBurst
for rows.Next() {
var ub UserBurst
if err := rows.Scan(&ub.UserID, &ub.Max7DayPosts, &ub.AvgWeeklyPosts); err != nil {
return nil, err
}
results = append(results, ub)
}
return results, nil
}
The Go solution uses SQL for aggregation and window generation. The logic mirrors the Python approach but leverages SQL CTEs for efficiency. It handles slices for the results and uses strong typing for numeric results.
Worked Examples
Using the example from the problem:
Posts Table:
| post_id | user_id | post_date |
|---------|---------|------------|
| 1 | 1 | 2024-02-27 |
| 2 | 5 | 2024-02-06 |
| 3 | 3 | 2024-02-25 |
| 4 | 3 | 2024-02-14 |
| 5 | 3 | 2024-02-06 |
| 6