LeetCode 1148 - Article Views I
The problem is asking us to identify all authors who have viewed at least one of their own articles. The input is a database table Views that contains information about articles, their authors, the viewers, and the dates on which the articles were viewed.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The problem is asking us to identify all authors who have viewed at least one of their own articles. The input is a database table Views that contains information about articles, their authors, the viewers, and the dates on which the articles were viewed. Each row indicates a single view event, and duplicate rows are possible, meaning the same viewer could appear multiple times for the same article on the same date.
The output should be a table containing a single column id, representing the author_ids who have viewed their own articles. This output must be sorted in ascending order. The key constraint is that an author viewing their own article is indicated when the author_id matches the viewer_id. Duplicate rows should not affect the output, meaning each author should only appear once.
Important edge cases include situations where no author views their own articles, multiple duplicates exist for the same author-viewer pair, or all authors only view other people's articles. The problem guarantees that author_id and viewer_id are integers and that the Views table may contain duplicate entries.
Approaches
The brute-force approach is to iterate over all rows and for each author, check whether there exists any row where author_id equals viewer_id. We can maintain a set to store authors who meet this condition to avoid duplicates. This approach is correct because it directly checks the condition for each row. However, iterating through all rows manually may not be the most efficient, especially for large datasets, because we are performing comparisons on every single row without leveraging SQL's built-in capabilities for filtering and grouping.
The optimal approach is to use a SQL SELECT DISTINCT query with a WHERE clause that directly filters rows where author_id = viewer_id. Using DISTINCT ensures duplicate entries do not appear in the output, and ORDER BY guarantees ascending order. This approach leverages the database engine's optimized filtering and sorting, making it efficient even for large datasets.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(n) | Iterate over all rows, store authors in a set if they view their own articles |
| Optimal | O(n log n) | O(n) | SQL engine handles filtering, deduplication, and sorting efficiently |
Algorithm Walkthrough
- Start by querying the
Viewstable for rows whereauthor_idequalsviewer_id. This ensures we only consider views where authors viewed their own articles. - Use
DISTINCTto eliminate duplicate rows where the same author might appear multiple times for the same article or date. - Sort the resulting set of
author_ids in ascending order usingORDER BY. - Return the sorted set as a single-column result table with the column named
id.
This works because the condition author_id = viewer_id is exactly what defines an author viewing their own article. Using DISTINCT ensures that duplicate view records do not cause repeated entries in the output. Sorting ensures the output matches the problem's required order.
Python Solution
# There is no Python SQL execution class required for LeetCode,
# this is a direct SQL submission problem.
# If we were using Python to describe the logic, it might look like:
import sqlite3
from typing import List, Tuple
def authors_who_viewed_their_own_articles(views: List[Tuple[int, int, int, str]]) -> List[int]:
result_set = set()
for article_id, author_id, viewer_id, view_date in views:
if author_id == viewer_id:
result_set.add(author_id)
return sorted(result_set)
In this Python implementation, we iterate through each view record, check if the author viewed their own article, and store qualifying authors in a set to avoid duplicates. Finally, we return a sorted list.
Go Solution
package main
import (
"sort"
)
func authorsWhoViewedOwnArticles(views [][]interface{}) []int {
authorSet := make(map[int]struct{})
for _, view := range views {
authorID := view[1].(int)
viewerID := view[2].(int)
if authorID == viewerID {
authorSet[authorID] = struct{}{}
}
}
result := make([]int, 0, len(authorSet))
for id := range authorSet {
result = append(result, id)
}
sort.Ints(result)
return result
}
In Go, we use a map to simulate a set, storing authors who view their own articles. After collecting all relevant authors, we convert the map keys to a slice and sort it before returning. This approach mirrors the Python solution but uses Go-specific constructs.
Worked Examples
For the example input:
Views table:
+------------+-----------+-----------+------------+
| article_id | author_id | viewer_id | view_date |
+------------+-----------+-----------+------------+
| 1 | 3 | 5 | 2019-08-01 |
| 1 | 3 | 6 | 2019-08-02 |
| 2 | 7 | 7 | 2019-08-01 |
| 2 | 7 | 6 | 2019-08-02 |
| 4 | 7 | 1 | 2019-07-22 |
| 3 | 4 | 4 | 2019-07-21 |
| 3 | 4 | 4 | 2019-07-21 |
+------------+-----------+-----------+------------+
Iteration and checks:
- Row 1: author 3 ≠ viewer 5 → skip
- Row 2: author 3 ≠ viewer 6 → skip
- Row 3: author 7 = viewer 7 → add 7
- Row 4: author 7 ≠ viewer 6 → skip
- Row 5: author 7 ≠ viewer 1 → skip
- Row 6: author 4 = viewer 4 → add 4
- Row 7: author 4 = viewer 4 → already in set, skip
Final set: {4, 7}, sorted → [4, 7]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Iterate through n rows and sort the resulting unique authors |
| Space | O(n) | Store at most all authors in the set/map |
The time complexity is dominated by sorting the unique authors, while the iteration through the table is linear. Space is linear with respect to the number of distinct authors.
Test Cases
# test cases
assert authors_who_viewed_their_own_articles([]) == [] # empty input
assert authors_who_viewed_their_own_articles([(1, 1, 1, "2021-01-01")]) == [1] # single row, self-view
assert authors_who_viewed_their_own_articles([(1, 1, 2, "2021-01-01")]) == [] # single row, not self-view
assert authors_who_viewed_their_own_articles([
(1, 3, 5, "2019-08-01"),
(1, 3, 6, "2019-08-02"),
(2, 7, 7, "2019-08-01"),
(2, 7, 6, "2019-08-02"),
(4, 7, 1, "2019-07-22"),
(3, 4, 4, "2019-07-21"),
(3, 4, 4, "2019-07-21")
]) == [4, 7] # example case
assert authors_who_viewed_their_own_articles([
(1, 10, 10, "2022-02-02"),
(2, 10, 10, "2022-02-03"),
(3, 11, 12, "2022-02-03")
]) == [10] # duplicates and mixed views
| Test | Why |
|---|---|
| empty input | validates function handles no data |
| single self-view | verifies smallest positive case |
| single non-self-view | verifies function does not falsely include non-self-view |
| example case | ensures algorithm matches problem statement |
| duplicates | tests handling of repeated author-viewer entries |
Edge Cases
One edge case is an empty Views table. The function must return an empty list, as there are no authors to check.
Another edge case is when all rows contain views by authors who never view their own articles. A naive implementation that does not check author_id = viewer_id could mistakenly include authors.
A third edge case is multiple duplicate rows for the same author-viewer pair. The implementation must avoid returning the same author multiple times. Using a set or SQL DISTINCT ensures this is handled correctly.