LeetCode 1148 - Article Views I

The problem is asking us to identify all authors who have viewed at least one of their own articles. The input is a database table Views that contains information about articles, their authors, the viewers, and the dates on which the articles were viewed.

LeetCode Problem 1148

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem is asking us to identify all authors who have viewed at least one of their own articles. The input is a database table Views that contains information about articles, their authors, the viewers, and the dates on which the articles were viewed. Each row indicates a single view event, and duplicate rows are possible, meaning the same viewer could appear multiple times for the same article on the same date.

The output should be a table containing a single column id, representing the author_ids who have viewed their own articles. This output must be sorted in ascending order. The key constraint is that an author viewing their own article is indicated when the author_id matches the viewer_id. Duplicate rows should not affect the output, meaning each author should only appear once.

Important edge cases include situations where no author views their own articles, multiple duplicates exist for the same author-viewer pair, or all authors only view other people's articles. The problem guarantees that author_id and viewer_id are integers and that the Views table may contain duplicate entries.

Approaches

The brute-force approach is to iterate over all rows and for each author, check whether there exists any row where author_id equals viewer_id. We can maintain a set to store authors who meet this condition to avoid duplicates. This approach is correct because it directly checks the condition for each row. However, iterating through all rows manually may not be the most efficient, especially for large datasets, because we are performing comparisons on every single row without leveraging SQL's built-in capabilities for filtering and grouping.

The optimal approach is to use a SQL SELECT DISTINCT query with a WHERE clause that directly filters rows where author_id = viewer_id. Using DISTINCT ensures duplicate entries do not appear in the output, and ORDER BY guarantees ascending order. This approach leverages the database engine's optimized filtering and sorting, making it efficient even for large datasets.

Approach Time Complexity Space Complexity Notes
Brute Force O(n) O(n) Iterate over all rows, store authors in a set if they view their own articles
Optimal O(n log n) O(n) SQL engine handles filtering, deduplication, and sorting efficiently

Algorithm Walkthrough

  1. Start by querying the Views table for rows where author_id equals viewer_id. This ensures we only consider views where authors viewed their own articles.
  2. Use DISTINCT to eliminate duplicate rows where the same author might appear multiple times for the same article or date.
  3. Sort the resulting set of author_ids in ascending order using ORDER BY.
  4. Return the sorted set as a single-column result table with the column named id.

This works because the condition author_id = viewer_id is exactly what defines an author viewing their own article. Using DISTINCT ensures that duplicate view records do not cause repeated entries in the output. Sorting ensures the output matches the problem's required order.

Python Solution

# There is no Python SQL execution class required for LeetCode, 
# this is a direct SQL submission problem.
# If we were using Python to describe the logic, it might look like:

import sqlite3
from typing import List, Tuple

def authors_who_viewed_their_own_articles(views: List[Tuple[int, int, int, str]]) -> List[int]:
    result_set = set()
    for article_id, author_id, viewer_id, view_date in views:
        if author_id == viewer_id:
            result_set.add(author_id)
    return sorted(result_set)

In this Python implementation, we iterate through each view record, check if the author viewed their own article, and store qualifying authors in a set to avoid duplicates. Finally, we return a sorted list.

Go Solution

package main

import (
    "sort"
)

func authorsWhoViewedOwnArticles(views [][]interface{}) []int {
    authorSet := make(map[int]struct{})
    for _, view := range views {
        authorID := view[1].(int)
        viewerID := view[2].(int)
        if authorID == viewerID {
            authorSet[authorID] = struct{}{}
        }
    }
    result := make([]int, 0, len(authorSet))
    for id := range authorSet {
        result = append(result, id)
    }
    sort.Ints(result)
    return result
}

In Go, we use a map to simulate a set, storing authors who view their own articles. After collecting all relevant authors, we convert the map keys to a slice and sort it before returning. This approach mirrors the Python solution but uses Go-specific constructs.

Worked Examples

For the example input:

Views table:
+------------+-----------+-----------+------------+
| article_id | author_id | viewer_id | view_date  |
+------------+-----------+-----------+------------+
| 1          | 3         | 5         | 2019-08-01 |
| 1          | 3         | 6         | 2019-08-02 |
| 2          | 7         | 7         | 2019-08-01 |
| 2          | 7         | 6         | 2019-08-02 |
| 4          | 7         | 1         | 2019-07-22 |
| 3          | 4         | 4         | 2019-07-21 |
| 3          | 4         | 4         | 2019-07-21 |
+------------+-----------+-----------+------------+

Iteration and checks:

  1. Row 1: author 3 ≠ viewer 5 → skip
  2. Row 2: author 3 ≠ viewer 6 → skip
  3. Row 3: author 7 = viewer 7 → add 7
  4. Row 4: author 7 ≠ viewer 6 → skip
  5. Row 5: author 7 ≠ viewer 1 → skip
  6. Row 6: author 4 = viewer 4 → add 4
  7. Row 7: author 4 = viewer 4 → already in set, skip

Final set: {4, 7}, sorted → [4, 7]

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Iterate through n rows and sort the resulting unique authors
Space O(n) Store at most all authors in the set/map

The time complexity is dominated by sorting the unique authors, while the iteration through the table is linear. Space is linear with respect to the number of distinct authors.

Test Cases

# test cases
assert authors_who_viewed_their_own_articles([]) == []  # empty input
assert authors_who_viewed_their_own_articles([(1, 1, 1, "2021-01-01")]) == [1]  # single row, self-view
assert authors_who_viewed_their_own_articles([(1, 1, 2, "2021-01-01")]) == []  # single row, not self-view
assert authors_who_viewed_their_own_articles([
    (1, 3, 5, "2019-08-01"),
    (1, 3, 6, "2019-08-02"),
    (2, 7, 7, "2019-08-01"),
    (2, 7, 6, "2019-08-02"),
    (4, 7, 1, "2019-07-22"),
    (3, 4, 4, "2019-07-21"),
    (3, 4, 4, "2019-07-21")
]) == [4, 7]  # example case
assert authors_who_viewed_their_own_articles([
    (1, 10, 10, "2022-02-02"),
    (2, 10, 10, "2022-02-03"),
    (3, 11, 12, "2022-02-03")
]) == [10]  # duplicates and mixed views
Test Why
empty input validates function handles no data
single self-view verifies smallest positive case
single non-self-view verifies function does not falsely include non-self-view
example case ensures algorithm matches problem statement
duplicates tests handling of repeated author-viewer entries

Edge Cases

One edge case is an empty Views table. The function must return an empty list, as there are no authors to check.

Another edge case is when all rows contain views by authors who never view their own articles. A naive implementation that does not check author_id = viewer_id could mistakenly include authors.

A third edge case is multiple duplicate rows for the same author-viewer pair. The implementation must avoid returning the same author multiple times. Using a set or SQL DISTINCT ensures this is handled correctly.