LeetCode 1141 - User Activity for the Past 30 Days I
This problem asks us to compute the number of distinct active users for each day within a fixed 30 day window. The window ends on 2019-07-27, inclusive, which means we only consider activity dates from 2019-06-28 through 2019-07-27.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The Activity table records user interactions on a social media platform. Each row represents a single action performed by a user during a session. The table contains four columns:
| Column | Meaning |
|---|---|
| user_id | The user who performed the action |
| session_id | The session in which the action occurred |
| activity_date | The date of the activity |
| activity_type | The type of action performed |
The important detail is that any activity counts as valid activity. Whether the action is opening a session, scrolling, sending a message, or ending a session, the user should be considered active on that day.
The output should contain one row per day that has at least one active user. For each such day, we must return:
| Column | Meaning |
|---|---|
| day | The activity date |
| active_users | The number of distinct users active on that date |
The phrase "distinct users" is critical. A user may generate multiple rows on the same day, but they should only be counted once for that date.
The table may contain duplicate rows, which means a naive counting strategy using COUNT(*) would overcount users. We must instead use COUNT(DISTINCT user_id) grouped by date.
A straightforward filter and group query is sufficient for this problem. The main challenge is correctly filtering the date range and handling duplicate activities from the same user.
Several edge cases are important:
- A user may perform many activities on the same day. They must still count as only one active user.
- Duplicate rows may exist in the table. These duplicates should not inflate the user count.
- Activities outside the 30 day window must be ignored completely.
- Some days inside the window may have zero activity. Those days should not appear in the output.
- Multiple sessions from the same user on the same day still count as one active user.
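The overcounting risk described above can be seen on a tiny data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the rows are hypothetical, and the date column is stored as ISO text, which is an assumption of this sketch rather than part of the problem.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; both support COUNT(DISTINCT ...).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity ("
    "user_id INTEGER, session_id INTEGER, "
    "activity_date TEXT, activity_type TEXT)"
)

# Hypothetical rows: user 1 has three rows (one is an exact duplicate),
# user 2 has one row, all on the same day.
rows = [
    (1, 1, "2019-07-20", "open_session"),
    (1, 1, "2019-07-20", "open_session"),  # exact duplicate row
    (1, 1, "2019-07-20", "scroll_down"),
    (2, 4, "2019-07-20", "open_session"),
]
conn.executemany("INSERT INTO Activity VALUES (?, ?, ?, ?)", rows)

row_count, user_count = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT user_id) "
    "FROM Activity WHERE activity_date = '2019-07-20'"
).fetchone()

print(row_count)   # rows recorded on that day
print(user_count)  # distinct users active on that day
```

Here COUNT(*) reports four rows while COUNT(DISTINCT user_id) reports two users, which is the figure the problem asks for.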
Approaches
Brute Force Approach
A brute force solution would iterate through every date in the 30 day range and, for each date, scan the entire Activity table to determine which users were active on that day.
For each date:
- Traverse all rows in the table.
- Check whether the row belongs to the current date.
- If it does, add the user_id to a set for deduplication.
- Count the size of the set after scanning all rows.
This approach is correct because the set guarantees distinct users are counted only once. However, it repeatedly scans the entire table for every date in the range.
If there are N rows and D = 30 days, the total complexity becomes O(N * D). While acceptable for small datasets, it is inefficient because the same rows are revisited many times.
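The brute force idea can be sketched outside the database as follows. This is an illustration only: the function name active_users_brute_force and the (user_id, activity_date) tuple layout are assumptions made for this sketch, not part of the problem.

```python
from datetime import date, timedelta


def active_users_brute_force(rows, end=date(2019, 7, 27), days=30):
    """Count distinct users per day by rescanning all rows for every date.

    rows is a list of (user_id, activity_date) tuples, where activity_date
    is a datetime.date. Returns (day, count) pairs in ascending date order.
    """
    result = []
    # Walk the window from its first day (end - 29 days) up to the end date.
    for offset in range(days - 1, -1, -1):
        current = end - timedelta(days=offset)
        seen = set()
        for user_id, activity_date in rows:  # full scan for each date
            if activity_date == current:
                seen.add(user_id)
        if seen:  # days without activity are omitted from the output
            result.append((current, len(seen)))
    return result


example = [
    (1, date(2019, 7, 20)),
    (1, date(2019, 7, 20)),
    (2, date(2019, 7, 20)),
    (2, date(2019, 7, 21)),
    (3, date(2019, 7, 21)),
    (4, date(2019, 6, 25)),  # outside the window, never matched
]
print(active_users_brute_force(example))
```

The nested loops make the O(N * D) cost visible: the inner scan runs once for each of the 30 dates.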
Optimal Approach
The key insight is that SQL databases are designed for grouping and aggregation operations. Instead of processing each date separately, we can filter the relevant rows once and group them by activity_date.
For each date, we count distinct users directly using:
COUNT(DISTINCT user_id)
This automatically handles:
- Multiple activities from the same user
- Duplicate rows
- Multiple sessions on the same day
The database filters the rows once and aggregates them per date, rather than rescanning the table for every day in the range.
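For intuition, the filter and group idea can be mimicked in plain Python with one pass and a dictionary of sets. This is a sketch of the concept, not a description of how any particular database engine executes the query; the function name and tuple layout are assumptions.

```python
from collections import defaultdict
from datetime import date


def active_users_single_pass(rows, start=date(2019, 6, 28), end=date(2019, 7, 27)):
    """Group user ids by day in one pass over (user_id, activity_date) tuples."""
    users_by_day = defaultdict(set)
    for user_id, activity_date in rows:
        if start <= activity_date <= end:  # inclusive window, like BETWEEN
            # Adding to a set makes duplicates and repeat activity harmless.
            users_by_day[activity_date].add(user_id)
    # Sorting only affects presentation; there are at most 30 keys.
    return {day: len(users) for day, users in sorted(users_by_day.items())}


example = [
    (1, date(2019, 7, 20)),
    (1, date(2019, 7, 20)),
    (2, date(2019, 7, 20)),
    (2, date(2019, 7, 21)),
    (3, date(2019, 7, 21)),
    (4, date(2019, 6, 25)),
]
print(active_users_single_pass(example))
```

Each row is examined once, and the per-day sets play the role of DISTINCT within each group.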
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(N * D) | O(U) | Repeatedly scans the table for each day |
| Optimal | O(N) | O(U) | Single filtering and grouping pass |
Here:
- N is the number of rows in the table
- D is the number of days in the range, fixed at 30
- U is the number of distinct users for a day
Algorithm Walkthrough
1. Filter the rows to include only dates within the required 30 day window.
   The window is inclusive and ends on 2019-07-27. Therefore, the valid range is:
   2019-06-28 <= activity_date <= 2019-07-27
2. Group the filtered rows by activity_date.
   This allows us to process each day independently and compute the active user count for that specific date.
3. Count distinct users within each group.
   We use COUNT(DISTINCT user_id) because a user may have multiple activity rows on the same day.
4. Return the resulting date and active user count.
   Days with no activity naturally disappear because no group exists for them.
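The start date used in step 1 follows from simple date arithmetic. Because the end date itself counts as one of the 30 days, the window starts 29 days earlier, not 30. A short check with Python's datetime module (variable names are illustrative):

```python
from datetime import date, timedelta

end = date(2019, 7, 27)
window_days = 30

# Inclusive window: the end date is one of the 30 days,
# so step back window_days - 1 days to reach the first day.
start = end - timedelta(days=window_days - 1)
days_covered = (end - start).days + 1  # both endpoints included

print(start)         # first day of the window
print(days_covered)  # number of calendar days in the window
```

Subtracting a full 30 days instead would give 2019-06-27 and a 31 day window, which is the off-by-one error to watch for.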
Why it works
The algorithm works because grouping by activity_date partitions all valid rows into separate daily buckets. Within each bucket, counting distinct user_id values ensures each user contributes exactly once to the count for that day, regardless of duplicates or multiple activities.
Python Solution
# Write your MySQL query statement below
SELECT
activity_date AS day,
COUNT(DISTINCT user_id) AS active_users
FROM Activity
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
GROUP BY activity_date;
The solution begins by filtering rows using the WHERE clause. This removes all activities outside the required 30 day interval.
Next, the query groups rows by activity_date. Each group represents all activities performed on a specific day.
Finally, COUNT(DISTINCT user_id) counts unique active users for each day. The DISTINCT keyword is essential because users may have multiple activity records on the same date.
The alias day matches the required output format.
Go Solution
-- Write your MySQL query statement below
SELECT
activity_date AS day,
COUNT(DISTINCT user_id) AS active_users
FROM Activity
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
GROUP BY activity_date;
Since this is a database problem, the SQL solution is identical regardless of the programming language section. There are no Go specific implementation concerns because the logic is executed entirely within the database engine.
Worked Examples
Example 1
Input table:
| user_id | session_id | activity_date | activity_type |
|---|---|---|---|
| 1 | 1 | 2019-07-20 | open_session |
| 1 | 1 | 2019-07-20 | scroll_down |
| 1 | 1 | 2019-07-20 | end_session |
| 2 | 4 | 2019-07-20 | open_session |
| 2 | 4 | 2019-07-21 | send_message |
| 2 | 4 | 2019-07-21 | end_session |
| 3 | 2 | 2019-07-21 | open_session |
| 3 | 2 | 2019-07-21 | send_message |
| 3 | 2 | 2019-07-21 | end_session |
| 4 | 3 | 2019-06-25 | open_session |
| 4 | 3 | 2019-06-25 | end_session |
Step 1, Filter by Date Range
Valid range:
2019-06-28 to 2019-07-27
Rows from 2019-06-25 are removed.
Remaining rows:
| user_id | activity_date |
|---|---|
| 1 | 2019-07-20 |
| 1 | 2019-07-20 |
| 1 | 2019-07-20 |
| 2 | 2019-07-20 |
| 2 | 2019-07-21 |
| 2 | 2019-07-21 |
| 3 | 2019-07-21 |
| 3 | 2019-07-21 |
| 3 | 2019-07-21 |
Step 2, Group by Date
| Date | Users Seen |
|---|---|
| 2019-07-20 | 1, 1, 1, 2 |
| 2019-07-21 | 2, 2, 3, 3, 3 |
Step 3, Count Distinct Users
| Date | Distinct Users | Count |
|---|---|---|
| 2019-07-20 | {1, 2} | 2 |
| 2019-07-21 | {2, 3} | 2 |
Final Output
| day | active_users |
|---|---|
| 2019-07-20 | 2 |
| 2019-07-21 | 2 |
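The walkthrough above can be reproduced by running the query against Example 1 in an in-memory SQLite database. This is a sketch under two assumptions: SQLite stands in for MySQL, with dates stored as ISO text so that string comparison matches date order, and an ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

QUERY = """
SELECT activity_date AS day, COUNT(DISTINCT user_id) AS active_users
FROM Activity
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
GROUP BY activity_date
ORDER BY activity_date
"""

# Rows copied from the Example 1 input table.
example_rows = [
    (1, 1, "2019-07-20", "open_session"),
    (1, 1, "2019-07-20", "scroll_down"),
    (1, 1, "2019-07-20", "end_session"),
    (2, 4, "2019-07-20", "open_session"),
    (2, 4, "2019-07-21", "send_message"),
    (2, 4, "2019-07-21", "end_session"),
    (3, 2, "2019-07-21", "open_session"),
    (3, 2, "2019-07-21", "send_message"),
    (3, 2, "2019-07-21", "end_session"),
    (4, 3, "2019-06-25", "open_session"),
    (4, 3, "2019-06-25", "end_session"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity ("
    "user_id INTEGER, session_id INTEGER, "
    "activity_date TEXT, activity_type TEXT)"
)
conn.executemany("INSERT INTO Activity VALUES (?, ?, ?, ?)", example_rows)

example_result = conn.execute(QUERY).fetchall()
for day, active_users in example_result:
    print(day, active_users)
```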
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N) | Each relevant row is processed once during filtering and grouping |
| Space | O(U) | The database may maintain distinct user sets during aggregation |
The query scans the table once, applies the date filter, groups rows by date, and counts distinct users within each group. With hash-based aggregation the dominant cost is linear in the number of rows processed; a sort-based plan adds a logarithmic factor.
Test Cases
# Example case
assert solution == [
["2019-07-20", 2],
["2019-07-21", 2]
] # standard example from problem statement
# Single user with multiple activities same day
assert solution == [
["2019-07-20", 1]
] # user counted once despite multiple actions
# Duplicate rows
assert solution == [
["2019-07-20", 1]
] # duplicate activity rows should not increase count
# Activities outside valid range
assert solution == [] # no rows inside the 30 day window
# Multiple users on same day
assert solution == [
["2019-07-25", 5]
] # verifies distinct counting for many users
# Same user across multiple sessions same day
assert solution == [
["2019-07-26", 1]
] # user still counts once
# Multiple valid dates
assert solution == [
["2019-07-20", 2],
["2019-07-21", 3],
["2019-07-22", 1]
] # grouping by day works correctly
# Boundary date inclusion
assert solution == [
["2019-06-28", 1],
["2019-07-27", 1]
] # both endpoints are included
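The assertions above list expected outputs without their inputs. One way to make a few of them executable is a small harness around an in-memory SQLite database, sketched below. The helper run_query and all input rows are hypothetical, SQLite stands in for MySQL with dates stored as ISO text, and an ORDER BY is added so the results compare deterministically.

```python
import sqlite3

QUERY = """
SELECT activity_date AS day, COUNT(DISTINCT user_id) AS active_users
FROM Activity
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
GROUP BY activity_date
ORDER BY activity_date
"""


def run_query(rows):
    """Load rows into a fresh in-memory Activity table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Activity ("
        "user_id INTEGER, session_id INTEGER, "
        "activity_date TEXT, activity_type TEXT)"
    )
    conn.executemany("INSERT INTO Activity VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Duplicate rows: the same row twice still yields one active user.
duplicate = [(1, 1, "2019-07-20", "open_session")] * 2
assert run_query(duplicate) == [("2019-07-20", 1)]

# Activities outside the valid range: one day before and one day after.
outside = [
    (4, 3, "2019-06-27", "open_session"),
    (4, 3, "2019-07-28", "open_session"),
]
assert run_query(outside) == []

# Same user across multiple sessions on the same day.
sessions = [
    (1, 1, "2019-07-26", "open_session"),
    (1, 2, "2019-07-26", "open_session"),
]
assert run_query(sessions) == [("2019-07-26", 1)]

# Boundary dates: both endpoints of the window are included.
boundary = [
    (1, 1, "2019-06-28", "open_session"),
    (2, 2, "2019-07-27", "end_session"),
]
assert run_query(boundary) == [("2019-06-28", 1), ("2019-07-27", 1)]

print("all cases passed")
```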
Test Case Summary
| Test | Why |
|---|---|
| Standard example | Verifies basic grouping and counting |
| Multiple activities same day | Ensures distinct user counting |
| Duplicate rows | Confirms duplicates do not inflate counts |
| Outside range only | Verifies date filtering |
| Many users same day | Tests aggregation correctness |
| Multiple sessions same user | Ensures user counted once |
| Multiple dates | Confirms grouping by date |
| Boundary dates | Verifies inclusive range handling |
Edge Cases
Multiple Activities by the Same User
A user may perform many actions on the same day, such as opening a session, scrolling, and sending messages. A naive query using COUNT(*) would count every row and overestimate the number of active users.
The implementation avoids this bug by using:
COUNT(DISTINCT user_id)
This guarantees each user contributes only once per day.
Duplicate Rows in the Table
The problem explicitly states that duplicate rows may exist. Without deduplication, repeated rows would inflate the active user count.
Because the query counts distinct user IDs instead of rows, duplicates do not affect the result.
Activities Outside the 30 Day Window
Rows outside the required interval must be ignored completely. A common mistake is incorrectly computing the start date or using an exclusive range.
The implementation correctly includes both endpoints using:
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
The BETWEEN operator is inclusive in SQL, which matches the problem statement exactly.
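The inclusive behaviour of BETWEEN can be checked directly. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL and compares BETWEEN with the explicit >= and <= form; the helper name in_window is an assumption, and dates are passed as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def in_window(day):
    """Return (between_result, explicit_result) as 1/0 flags for an ISO date string."""
    return conn.execute(
        "SELECT ? BETWEEN '2019-06-28' AND '2019-07-27', "
        "       ? >= '2019-06-28' AND ? <= '2019-07-27'",
        (day, day, day),
    ).fetchone()


for day in ("2019-06-27", "2019-06-28", "2019-07-27", "2019-07-28"):
    print(day, in_window(day))
```

Both forms agree for every date: the two endpoints fall inside the window, and the days immediately before and after fall outside it.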