LeetCode 511 - Game Play Analysis I
This problem asks us to determine the first day each player logged into a game based on a table called Activity. Each row of this table contains a playerid, the deviceid used, the eventdate on which the player logged in, and the number of gamesplayed during that session.
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
This problem asks us to determine the first day each player logged into a game based on a table called Activity. Each row of this table contains a player_id, the device_id used, the event_date on which the player logged in, and the number of games_played during that session. The player_id and event_date combination is unique, so no player has multiple records for the same day.
The expected output is a table with two columns: player_id and first_login. The first_login is the earliest event_date for that player across all their activity records. The order of the output rows does not matter.
Constraints are simple: every player has at least one record, dates are valid, and the table may have multiple records per player. Important edge cases include players with only one login, multiple logins on the same day on different devices, and ensuring date comparison is handled correctly.
The problem guarantees that (player_id, event_date) is unique, so we do not need to handle duplicates for the same player on the same day.
Approaches
The most straightforward approach is brute-force: we could iterate through all records and, for each player, keep track of the smallest date we have seen. This works correctly because we compare all dates for each player, but it is not the most efficient if the table is large. In SQL, this would mean a subquery or multiple joins, which can be slow.
The optimal approach leverages SQL aggregation functions. Since we want the earliest date for each player, the MIN function on event_date grouped by player_id naturally gives us the answer efficiently. This approach scans the table only once and uses the database's internal optimizations for grouping and aggregation.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(p) | Iterate over all records, track earliest date per player manually |
| Optimal | O(n) | O(p) | Use GROUP BY player_id with MIN(event_date) to get the first login |
Here, n is the total number of activity records, and p is the number of unique players.
Algorithm Walkthrough
- Select
player_idfrom theActivitytable. This ensures each player is represented once. - Use the
MIN()function on theevent_datecolumn.MIN()automatically finds the earliest date in the group. - Group the results by
player_idusingGROUP BY. This aggregates all records for each player, allowingMIN(event_date)to operate per player. - Return the results with the columns
player_idandfirst_login(alias forMIN(event_date)).
Why it works: The grouping ensures each player only contributes to a single output row, and MIN(event_date) guarantees that we pick the earliest login date among all their activity records. SQL guarantees deterministic aggregation for these operations.
Python Solution
Since this is a database problem, in Python we would typically execute the SQL query using a connection object:
import sqlite3
from typing import List, Tuple
def first_login(conn: sqlite3.Connection) -> List[Tuple[int, str]]:
query = """
SELECT player_id, MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id
"""
cursor = conn.cursor()
cursor.execute(query)
result = cursor.fetchall()
return result
In this implementation, we use sqlite3 for demonstration. We construct the SQL query with MIN(event_date) and GROUP BY player_id. The cursor executes the query, and fetchall() returns the resulting list of tuples, where each tuple contains a player_id and their first_login date.
Go Solution
In Go, we would execute a SQL query using the database/sql package:
package main
import (
"database/sql"
_ "github.com/mattn/go-sqlite3"
)
func FirstLogin(db *sql.DB) ([]struct{ PlayerID int; FirstLogin string }, error) {
rows, err := db.Query(`
SELECT player_id, MIN(event_date) AS first_login
FROM Activity
GROUP BY player_id
`)
if err != nil {
return nil, err
}
defer rows.Close()
var results []struct {
PlayerID int
FirstLogin string
}
for rows.Next() {
var r struct {
PlayerID int
FirstLogin string
}
if err := rows.Scan(&r.PlayerID, &r.FirstLogin); err != nil {
return nil, err
}
results = append(results, r)
}
return results, nil
}
In Go, the differences from Python are mostly syntax-related. We define a struct for each row, iterate through rows.Next(), and scan each row into the struct. We also handle errors explicitly and defer closing the rows to avoid resource leaks.
Worked Examples
Example table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+
Step by step:
- Group by
player_id:
| player_id | event_dates |
|---|---|
| 1 | 2016-03-01, 2016-05-02 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02, 2018-07-03 |
- Apply
MIN(event_date)per player:
| player_id | first_login |
|---|---|
| 1 | 2016-03-01 |
| 2 | 2017-06-25 |
| 3 | 2016-03-02 |
This matches the expected output.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | The database scans all n records once to perform grouping and aggregation |
| Space | O(p) | Space is needed for storing p results, one per player |
The database query optimizer ensures efficient execution; the main cost is reading all records, which is unavoidable. The output is proportional to the number of players.
Test Cases
# test cases
import sqlite3
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Activity (
player_id INT,
device_id INT,
event_date DATE,
games_played INT,
PRIMARY KEY(player_id, event_date)
)
""")
# Insert data
data = [
(1, 2, '2016-03-01', 5),
(1, 2, '2016-05-02', 6),
(2, 3, '2017-06-25', 1),
(3, 1, '2016-03-02', 0),
(3, 4, '2018-07-03', 5)
]
conn.executemany("INSERT INTO Activity VALUES (?, ?, ?, ?)", data)
conn.commit()
result = first_login(conn)
assert (1, '2016-03-01') in result # Player 1 earliest
assert (2, '2017-06-25') in result # Player 2 only login
assert (3, '2016-03-02') in result # Player 3 earliest
# Single player, single login
conn.execute("INSERT INTO Activity VALUES (4, 1, '2020-01-01', 2)")
conn.commit()
result = first_login(conn)
assert (4, '2020-01-01') in result
# Multiple logins same day
conn.execute("INSERT INTO Activity VALUES (5, 1, '2022-12-01', 1)")
conn.execute("INSERT INTO Activity VALUES (5, 2, '2022-12-01', 3)")
conn.commit()
result = first_login(conn)
assert (5, '2022-12-01') in result
| Test | Why |
|---|---|
| Multiple logins, different dates | Validates MIN picks earliest date |
| Single login | Handles minimal data case |
| Multiple logins, same day, different devices | Ensures duplicates on same day do not affect MIN |
Edge Cases
First, a player with only one login. This is the simplest case, and the algorithm correctly returns that single date because MIN of one value is the value itself.
Second, multiple logins on the same day across different devices. Since (player_id, event_date) is unique, the SQL query still correctly identifies the earliest login date, which is the shared date, demonstrating that device differences do not interfere.
Third, a player with very large number of logins over years. The algorithm scales efficiently because GROUP BY with aggregation operates in a single pass over the data, so even if a player has thousands of entries, only the minimum date is retained in