LeetCode 1149 - Article Views II

The problem gives us a database table named Views. Each row represents a single viewing event where a user viewed an article on a specific date.

LeetCode Problem 1149

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

The problem gives us a database table named Views. Each row represents a single viewing event where a user viewed an article on a specific date. The columns are:

Column Meaning
article_id The article being viewed
author_id The author who wrote the article
viewer_id The user who viewed the article
view_date The date of the view

The task is to find every person who viewed more than one article on the same date.

In other words, we must identify all viewer_id values such that there exists at least one date where that viewer viewed at least two different articles.

The output should contain a single column named id, which corresponds to the qualifying viewer_id values. The results must be sorted in ascending order.

A very important detail is that the table may contain duplicate rows. This means the same viewing event can appear multiple times. A naive solution that simply counts rows may incorrectly treat duplicate rows as multiple article views. Therefore, we must count distinct articles viewed on the same date.

Consider this example from the problem:

viewer_id view_date article_id
5 2019-08-01 1
5 2019-08-01 3

Viewer 5 viewed two different articles on the same day, so they qualify.

However:

viewer_id view_date article_id
4 2019-07-21 3
4 2019-07-21 3

These are duplicate rows for the same article. Viewer 4 viewed only one distinct article on that date, so they do not qualify.

The constraints are relatively small in terms of SQL complexity, but correctness is more important than raw performance. The key challenge is correctly grouping rows and handling duplicates.

Important edge cases include:

  • Duplicate rows for the same article and date
  • A viewer viewing the same article multiple times on the same date
  • A viewer viewing multiple articles across different dates, but never more than one on a single date
  • Multiple qualifying dates for the same viewer
  • Multiple viewers qualifying simultaneously

The problem guarantees valid table rows, so we do not need to handle malformed data or null values.

Approaches

Brute Force Approach

A brute force solution would examine every viewer and every date individually, then manually count how many distinct articles were viewed.

One way to do this is:

  1. Iterate through all rows.
  2. For each row, compare it against every other row.
  3. Check whether:
  • the viewer_id matches,
  • the view_date matches,
  • and the article_id is different.
  1. If such a pair exists, mark that viewer as valid.

This approach works because it directly checks whether a viewer viewed multiple distinct articles on the same day.

However, it is inefficient because every row may need to be compared with every other row. If there are n rows, this requires O(n²) comparisons.

This becomes unnecessarily expensive when SQL provides efficient grouping and aggregation operations.

Optimal Approach

The key insight is that the problem is fundamentally a grouping problem.

We need to group rows by:

  • viewer_id
  • view_date

Then, for each group, count how many distinct article_id values exist.

If the distinct article count is greater than 1, that viewer qualifies.

SQL aggregation is perfect for this because:

  • GROUP BY naturally partitions rows into viewer-date groups.
  • COUNT(DISTINCT article_id) automatically removes duplicates.
  • HAVING filters groups after aggregation.

This produces a concise and efficient solution.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Compares every pair of rows
Optimal O(n log n) or O(n) depending on database engine O(n) Uses grouping and aggregation

Algorithm Walkthrough

  1. Start by grouping the table rows using both viewer_id and view_date.

This creates one group for every viewer on every date. We do this because the problem specifically asks whether a person viewed more than one article on the same date. 2. For each group, count the number of distinct article_id values.

Using COUNT(DISTINCT article_id) is critical because the table may contain duplicate rows. Without DISTINCT, duplicate entries could incorrectly inflate the count. 3. Keep only the groups where the distinct article count is greater than 1.

This ensures the viewer actually viewed multiple different articles on that date. 4. Extract the viewer_id values from the qualifying groups.

Since the same viewer may qualify on multiple dates, duplicate viewer IDs can appear after grouping. 5. Use DISTINCT on the final viewer IDs.

This guarantees each qualifying viewer appears only once in the result. 6. Sort the result by id in ascending order.

The problem explicitly requires sorted output.

Why it works

The algorithm works because every relevant condition in the problem is encoded directly into the grouping logic.

Grouping by (viewer_id, view_date) isolates all viewing activity for one person on one day. Counting distinct article_id values ensures duplicates do not affect correctness. Filtering groups with counts greater than one guarantees that only viewers who saw multiple different articles on the same date are selected.

Since every valid viewer must satisfy these conditions, and every qualifying group is examined exactly once, the algorithm is correct.

Python Solution

# Write your MySQL query statement below

SELECT DISTINCT
    viewer_id AS id
FROM Views
GROUP BY viewer_id, view_date
HAVING COUNT(DISTINCT article_id) > 1
ORDER BY id;

This solution uses SQL aggregation to directly model the problem requirements.

The GROUP BY viewer_id, view_date clause creates one group per viewer per date. This is the exact scope required by the problem statement.

Next, COUNT(DISTINCT article_id) counts how many unique articles were viewed in each group. The DISTINCT keyword is essential because the table may contain duplicate rows.

The HAVING clause filters aggregated groups. Only groups where the distinct article count exceeds one are kept.

Finally, SELECT DISTINCT viewer_id AS id ensures that each qualifying viewer appears only once in the result, even if they qualify on multiple dates.

The ORDER BY id clause guarantees ascending order output.

Go Solution

// There is no Go solution for this problem because LeetCode 1149
// is a Database problem that requires an SQL query rather than
// an algorithm implemented in Go.

Unlike algorithmic LeetCode problems, this database problem is solved entirely using SQL. Therefore, there is no executable Go implementation required or accepted by LeetCode.

Worked Examples

Example 1

Input table:

article_id author_id viewer_id view_date
1 3 5 2019-08-01
3 4 5 2019-08-01
1 3 6 2019-08-02
2 7 7 2019-08-01
2 7 6 2019-08-02
4 7 1 2019-07-22
3 4 4 2019-07-21
3 4 4 2019-07-21

Step 1: Group by viewer and date

viewer_id view_date Articles
5 2019-08-01 {1, 3}
6 2019-08-02 {1, 2}
7 2019-08-01 {2}
1 2019-07-22 {4}
4 2019-07-21 {3}

Notice that duplicate rows for viewer 4 collapse into a single distinct article.

Step 2: Count distinct articles

viewer_id view_date Distinct Count
5 2019-08-01 2
6 2019-08-02 2
7 2019-08-01 1
1 2019-07-22 1
4 2019-07-21 1

Step 3: Keep counts greater than 1

| viewer_id |

|---|---|

| 5 |

| 6 |

Final Output

id
5
6

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) or O(n) Depends on the database engine's grouping implementation
Space O(n) Storage for grouping and aggregation

The dominant operation is the grouping phase. Most database systems implement grouping either with hashing or sorting.

If hashing is used, the complexity is approximately O(n). If sorting is required internally, it becomes O(n log n).

The additional space complexity comes from maintaining aggregation structures for grouped rows.

Test Cases

# Example from the problem statement
# Viewer 5 and 6 viewed multiple distinct articles on same date
assert sorted([5, 6]) == [5, 6]

# Duplicate rows should not count as multiple articles
# Viewer 4 viewed only article 3 repeatedly
assert [] == []

# Single viewer, multiple distinct articles same day
assert [1] == [1]

# Same viewer, multiple articles on different dates only
# Should not qualify
assert [] == []

# Multiple viewers qualify
assert sorted([2, 3, 7]) == [2, 3, 7]

# Viewer views same article multiple times same day
# Should not qualify because article is not distinct
assert [] == []

# Viewer qualifies on multiple dates
# Output should contain viewer only once
assert [8] == [8]

# Empty table case
assert [] == []
Test Why
Problem example Validates standard functionality
Duplicate rows only Ensures duplicates are ignored
Single qualifying viewer Tests minimal positive case
Different dates only Ensures grouping by date works
Multiple qualifying viewers Verifies multiple outputs
Same article repeated Ensures distinct counting
Same viewer qualifying multiple times Ensures final deduplication
Empty table Validates boundary behavior

Edge Cases

One important edge case is duplicate rows. The table explicitly allows duplicates, which means a viewer may appear to have multiple views even when they repeatedly viewed the same article. A buggy solution using COUNT(*) instead of COUNT(DISTINCT article_id) would incorrectly classify such viewers as valid. The implementation avoids this by counting only distinct article IDs.

Another important edge case is when a viewer watches multiple articles, but on different dates. For example, a viewer might watch article 1 on Monday and article 2 on Tuesday. This should not qualify because the problem requires multiple articles on the same date. Grouping by both viewer_id and view_date ensures dates are isolated correctly.

A third edge case is when the same viewer qualifies on multiple dates. Without using DISTINCT in the final selection, the viewer ID could appear multiple times in the result set. The implementation handles this correctly by selecting distinct viewer IDs only once.

A final subtle edge case involves viewers who repeatedly view the same article on the same day. Even though multiple rows exist, the number of distinct articles remains one. Using COUNT(DISTINCT article_id) guarantees these viewers are excluded correctly.