LeetCode 1113 - Reported Posts
The problem is asking us to determine the number of posts that were reported yesterday grouped by the reason for the report.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The problem is asking us to determine the number of posts that were reported yesterday grouped by the reason for the report. We are given an Actions table where each row represents a user performing an action on a post, with the action type recorded in the action column and optional extra information in the extra column. In this context, a report action may include a reason in extra such as "spam" or "racism". The problem specifies that "today" is 2019-07-05, so "yesterday" refers to 2019-07-04.
The input table may have multiple rows for the same post, same user, or same action, meaning duplicates exist. We only care about report actions and need to count how many reports were made for each report_reason on 2019-07-04. The output should be a table with two columns: report_reason and report_count, including only reasons with at least one report.
Important points and constraints include: the table may have duplicates, there can be multiple reports of the same post for the same reason, and the extra column may be NULL for actions other than report. Edge cases to consider include posts reported multiple times by the same user for the same reason, days with no reports, and multiple report reasons.
Approaches
The brute-force approach is to scan the entire Actions table, filter rows where action = 'report' and action_date = '2019-07-04', and manually aggregate counts by reason, for example using a Python dictionary or SQL group by. This approach works correctly, but if the table is very large, scanning every row and then processing in application code is less efficient than letting the database handle grouping.
The optimal approach leverages SQL aggregation: filter directly in SQL for report actions on the target date, group by extra (the reason), and count the number of rows per group. This avoids manual aggregation, reduces memory usage, and efficiently uses database indexing on action_date or action. The key insight is that SQL's GROUP BY with COUNT naturally handles duplicates and aggregation correctly.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(r) | Iterate all rows in the table and count manually per reason |
| Optimal | O(n) | O(r) | SQL GROUP BY handles filtering and counting efficiently, where n is the number of rows, r is the number of unique reasons |
Algorithm Walkthrough
- Identify the target date, which is yesterday relative to the fixed date
2019-07-05. This is2019-07-04. - Filter the
Actionstable to select only rows whereaction = 'report'andaction_date = '2019-07-04'. - Group the filtered rows by
extra, which represents the report reason. - For each group, count the number of rows to get
report_count. - Return a table with
report_reasonandreport_count, including only reasons with at least one report.
Why it works: Grouping by extra ensures all reports for the same reason are counted together. Filtering first ensures we only count relevant reports. Using COUNT guarantees we handle multiple reports per user or post correctly.
Python Solution
import pandas as pd
class Solution:
def reportedPosts(self, actions: pd.DataFrame) -> pd.DataFrame:
# Step 1: Define yesterday's date
yesterday = '2019-07-04'
# Step 2: Filter rows where action is 'report' and date is yesterday
reports = actions[(actions['action'] == 'report') & (actions['action_date'] == yesterday)]
# Step 3: Group by report reason (extra) and count
result = reports.groupby('extra').size().reset_index(name='report_count')
# Step 4: Rename column to match expected output
result.rename(columns={'extra': 'report_reason'}, inplace=True)
return result
The Python solution uses Pandas for table manipulation. First, it filters for the target date and report actions. Then, groupby('extra') aggregates counts by report reason. The resulting DataFrame is renamed to match the expected output columns. This directly implements the optimal SQL-like aggregation logic in Python.
Go Solution
package main
import (
"database/sql"
"fmt"
)
func reportedPosts(db *sql.DB) ([]map[string]interface{}, error) {
yesterday := "2019-07-04"
query := `
SELECT extra AS report_reason, COUNT(*) AS report_count
FROM Actions
WHERE action = 'report' AND action_date = ?
GROUP BY extra
`
rows, err := db.Query(query, yesterday)
if err != nil {
return nil, err
}
defer rows.Close()
var results []map[string]interface{}
for rows.Next() {
var reason string
var count int
if err := rows.Scan(&reason, &count); err != nil {
return nil, err
}
results = append(results, map[string]interface{}{
"report_reason": reason,
"report_count": count,
})
}
return results, nil
}
In Go, we execute an SQL query directly against the database. We use parameterized queries to safely filter by yesterday's date. The rows.Scan function maps SQL columns to Go variables, and we store results in a slice of maps for flexibility. Go handles SQL NULLs automatically, and COUNT(*) ensures we count duplicates correctly.
Worked Examples
Example input table:
+---------+---------+-------------+--------+--------+
| user_id | post_id | action_date | action | extra |
+---------+---------+-------------+--------+--------+
| 2 | 4 | 2019-07-04 | report | spam |
| 3 | 4 | 2019-07-04 | report | spam |
| 5 | 2 | 2019-07-04 | report | racism |
| 5 | 5 | 2019-07-04 | report | racism |
+---------+---------+-------------+--------+--------+
Step by step:
- Filter rows for
action='report'andaction_date='2019-07-04':
+---------+---------+-------------+--------+--------+
| 2 | 4 | 2019-07-04 | report | spam |
| 3 | 4 | 2019-07-04 | report | spam |
| 5 | 2 | 2019-07-04 | report | racism |
| 5 | 5 | 2019-07-04 | report | racism |
+---------+---------+-------------+--------+--------+
- Group by
extra:
spam: 2 reportsracism: 2 reports
- Return table:
+---------------+--------------+
| report_reason | report_count |
+---------------+--------------+
| spam | 2 |
| racism | 2 |
+---------------+--------------+
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | We scan the table once to filter reports and then group by reason |
| Space | O(r) | We store counts for each unique report reason, r is number of unique reasons |
Filtering and grouping operations are linear in the number of table rows, and the space required scales with the number of distinct report reasons.
Test Cases
import pandas as pd
actions = pd.DataFrame([
[1, 1, '2019-07-01', 'view', None],
[1, 1, '2019-07-01', 'like', None],
[1, 1, '2019-07-01', 'share', None],
[2, 4, '2019-07-04', 'view', None],
[2, 4, '2019-07-04', 'report', 'spam'],
[3, 4, '2019-07-04', 'view', None],
[3, 4, '2019-07-04', 'report', 'spam'],
[4, 3, '2019-07-02', 'view', None],
[4, 3, '2019-07-02', 'report', 'spam'],
[5, 2, '2019-07-04', 'view', None],
[5, 2, '2019-07-04', 'report', 'racism'],
[5, 5, '2019-07-04', 'view', None],
[5, 5, '2019-07-04', 'report', 'racism'],
], columns=['user_id', 'post_id', 'action_date', 'action', 'extra'])
sol = Solution()
result = sol.reportedPosts(actions)
assert set([tuple(x) for x in result.to_records(index=False)]) == set([('spam',2),('racism',2)]) # standard example
# Edge case: no reports yesterday
empty_actions = pd.DataFrame([