LeetCode 512 - Game Play Analysis II

The problem provides a database table named Activity. Each row represents one login session for a player on a specific date.

LeetCode Problem 512

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem provides a database table named Activity. Each row represents one login session for a player on a specific date. The table contains four columns:

Column Meaning
player_id The unique identifier for a player
device_id The device used during that login session
event_date The date of the login
games_played Number of games played during that session

The combination (player_id, event_date) is guaranteed to be unique. This means a player can appear multiple times in the table, but never more than once for the same date.

The task is to determine the device used during each player's very first login. In other words, for every unique player_id, we must find the row with the earliest event_date, then return the corresponding device_id.

The output should contain exactly two columns:

Column Meaning
player_id The player
device_id The device used on the player's first login

The order of the output rows does not matter.

The important detail is that we are not looking for the earliest device numerically, nor the most frequently used device. We specifically need the device associated with the minimum login date for each player.

Because (player_id, event_date) is unique, the problem guarantees that each player has exactly one earliest login row. This removes ambiguity and means we do not need tie-breaking logic.

A naive implementation could fail if it incorrectly assumes:

  • the smallest device_id is the answer
  • players appear only once
  • rows are already sorted by date
  • multiple players cannot share the same device

The input may contain players with only one login record, players who switch devices over time, and login dates in arbitrary order. A correct solution must handle all of these cases.

Approaches

Brute Force Approach

A brute-force solution would process each player independently. For every player, we could scan the entire table to locate the earliest event_date. Once that minimum date is found, we could scan again to retrieve the associated device_id.

This approach is correct because it explicitly checks every possible row for every player and therefore cannot miss the earliest login.

However, it is inefficient. If there are n rows and p distinct players, repeatedly scanning the table for each player leads to quadratic behavior in the worst case. This becomes unnecessarily expensive when the table grows large.

Optimal Approach

The key observation is that we only need one row per player, specifically the row with the minimum event_date.

SQL is particularly good at grouped aggregation. We can first compute the earliest login date for each player using MIN(event_date) grouped by player_id. Once we know the earliest date for every player, we can join this result back to the original Activity table to retrieve the matching device_id.

This works efficiently because:

  • aggregation reduces each player to one earliest date
  • the join retrieves the exact corresponding row
  • the primary key guarantee ensures uniqueness

This avoids repeatedly scanning the table and leverages SQL's optimized grouping and join operations.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Repeatedly scans the table for every player
Optimal O(n) O(n) Uses grouping and join to directly locate first login rows

Algorithm Walkthrough

  1. Start by grouping all rows by player_id.

This allows us to analyze each player's activity independently. 2. For each player, compute the minimum event_date.

The minimum date represents that player's first login date. 3. Store the result as a temporary derived table containing:

  • player_id
  • earliest event_date
  1. Join this derived table back with the original Activity table.

The join condition matches:

  • same player_id
  • same event_date

This identifies the exact original row corresponding to the player's first login. 5. Select the player_id and device_id columns from the matched rows.

These are the required output values.

Why it works

The algorithm works because every player's first login is uniquely determined by the minimum event_date. Since (player_id, event_date) is a primary key, there can only be one row matching that earliest date for a given player. Therefore, joining the grouped result back to the original table always retrieves exactly the correct device.

Python Solution

Even though this is a SQL database problem, LeetCode database problems are solved using SQL queries. The following query is the correct LeetCode submission.

# This problem is solved with SQL, not Python.

SELECT a.player_id, a.device_id
FROM Activity a
JOIN (
    SELECT player_id, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
) first_activity
ON a.player_id = first_activity.player_id
AND a.event_date = first_activity.first_login;

The query has two major logical sections.

The inner subquery computes the earliest login date for every player. The GROUP BY player_id clause ensures each player is processed independently, while MIN(event_date) extracts the first login date.

The outer query joins the original Activity table with this grouped result. The join conditions ensure that only rows matching both the same player and the earliest login date are selected.

Finally, the query returns the required columns: player_id and device_id.

Go Solution

LeetCode database problems do not require Go code because the expected submission language is SQL. However, the equivalent SQL solution is shown below inside a Go string for reference.

package main

func solution() string {
	return `
SELECT a.player_id, a.device_id
FROM Activity a
JOIN (
    SELECT player_id, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
) first_activity
ON a.player_id = first_activity.player_id
AND a.event_date = first_activity.first_login;
`
}

Unlike algorithmic problems, there are no Go-specific implementation concerns such as slice management, pointer handling, or integer overflow because the actual execution happens entirely inside the SQL engine.

Worked Examples

Example 1

Input table:

player_id device_id event_date games_played
1 2 2016-03-01 5
1 2 2016-05-02 6
2 3 2017-06-25 1
3 1 2016-03-02 0
3 4 2018-07-03 5

Step 1: Compute earliest login date per player

Using:

SELECT player_id, MIN(event_date)
FROM Activity
GROUP BY player_id;

We get:

player_id first_login
1 2016-03-01
2 2017-06-25
3 2016-03-02

Step 2: Join back to original table

We now match each (player_id, first_login) pair with the original table.

player_id first_login matching device_id
1 2016-03-01 2
2 2017-06-25 3
3 2016-03-02 1

Final Output

player_id device_id
1 2
2 3
3 1

Complexity Analysis

Measure Complexity Explanation
Time O(n) One pass for aggregation and one pass for join
Space O(n) The grouped temporary result may store one row per player

The aggregation step scans all rows once to compute the earliest login date for each player. The join operation also processes rows efficiently using database join mechanisms. Overall, the query is linear relative to the number of table rows.

Test Cases

# Example case from the prompt
activity = [
    (1, 2, "2016-03-01", 5),
    (1, 2, "2016-05-02", 6),
    (2, 3, "2017-06-25", 1),
    (3, 1, "2016-03-02", 0),
    (3, 4, "2018-07-03", 5),
]
# Expected:
# [(1,2), (2,3), (3,1)]

# Single player with one login
activity = [
    (1, 7, "2020-01-01", 3),
]
# Expected:
# [(1,7)]

# Player changes devices over time
activity = [
    (1, 5, "2020-01-01", 1),
    (1, 9, "2021-01-01", 2),
]
# Expected:
# [(1,5)]

# Multiple players sharing same device
activity = [
    (1, 2, "2020-01-01", 1),
    (2, 2, "2020-02-01", 1),
]
# Expected:
# [(1,2), (2,2)]

# Dates not sorted
activity = [
    (1, 3, "2021-05-01", 1),
    (1, 2, "2020-01-01", 1),
]
# Expected:
# [(1,2)]

# Multiple players with multiple records
activity = [
    (1, 1, "2020-01-01", 2),
    (1, 2, "2020-02-01", 3),
    (2, 3, "2019-01-01", 4),
    (2, 4, "2021-01-01", 5),
]
# Expected:
# [(1,1), (2,3)]
Test Why
Provided example Validates standard multi-player behavior
Single login Ensures minimal input works
Device switching Confirms earliest device is selected
Shared devices Ensures grouping is by player, not device
Unsorted dates Confirms algorithm does not rely on ordering
Multiple records per player Verifies aggregation correctness

Edge Cases

One important edge case is when a player appears only once in the table. In this situation, that single row is automatically the earliest login. A buggy implementation might incorrectly expect multiple records per player or fail during grouping. The SQL solution handles this naturally because MIN(event_date) over one value simply returns that value.

Another important edge case occurs when rows are not ordered chronologically. Many naive implementations incorrectly assume the first row encountered for a player is the earliest login. Since SQL tables do not guarantee row order unless explicitly sorted, relying on insertion order is unsafe. Using MIN(event_date) ensures correctness regardless of row arrangement.

A third important edge case involves players switching devices over time. A flawed implementation might accidentally return the smallest device_id or the most recent device instead of the first-login device. The join condition specifically matches the earliest login date, guaranteeing that the returned device_id corresponds exactly to the first login event.