LeetCode 3060 - User Activities within Time Bounds

Difficulty: 🔴 Hard
Topics: Database

Solution

Problem Understanding

The Sessions table stores information about user activity sessions on a platform. Each row represents a single session and contains:

Column	Meaning
`user_id`	The user who performed the session
`session_start`	When the session began
`session_end`	When the session ended
`session_id`	Unique identifier for the session
`session_type`	Either `Viewer` or `Streamer`

The problem asks us to find every user who has at least two sessions of the same type where the gap between the sessions is at most 12 hours.

The important detail is understanding what “gap between sessions” means. If one session ends at time T1 and the next session of the same type starts at time T2, then the gap is:

T2 - T1

We only care about pairs of sessions that:

Belong to the same user
Have the same session_type
Have a time difference between consecutive sessions of at most 12 hours

If at least one such pair exists for a user, that user should appear in the output.

The final result should contain only the user_id column, ordered in ascending order.

A key observation is that we only need to compare sessions of the same type for the same user. Viewer sessions and Streamer sessions are completely independent.

The input size is not explicitly stated in the prompt, but since this is a hard SQL/database problem, we should assume the table may contain many rows. A naive comparison of every pair of sessions would become expensive quickly, especially because comparing all pairs leads to quadratic behavior.

There are several edge cases that can cause mistakes:

A user may have many sessions, but all of different types
Sessions may overlap
Sessions may be far apart except for one valid pair
A user may have exactly two valid sessions
Multiple sessions may occur on the same day
The valid pair may not be adjacent in insertion order, so sorting matters

The problem guarantees that session_id is unique, but that uniqueness is not especially useful for solving the problem. The critical fields are:

user_id
session_type
session_start
session_end

Approaches

Brute Force Approach

The most direct solution is to compare every pair of sessions for each user.

For every user:

Collect all sessions
Compare each session against every other session
Check whether:

The session types match
The second session starts within 12 hours after the first session ends

If any valid pair exists, include the user in the result.

This works because every possible pair is examined, so no valid pair can be missed.

However, this approach is inefficient. If a user has k sessions, then we perform O(k²) comparisons for that user. Across a large dataset, this becomes too slow.

Key Insight

The crucial observation is that once sessions are grouped by:

user_id
session_type

we only need to compare nearby sessions in chronological order.

Suppose we sort sessions by session_start. If two sessions more than 12 hours apart appear consecutively, then sessions even farther away will also fail the condition.

This means we can efficiently check only adjacent sessions after sorting.

In SQL, this pattern is naturally solved using a window function like LEAD() or LAG().

We can:

Partition by user_id and session_type
Order sessions chronologically
Compare each session with the next session
Check whether the time difference is at most 12 hours

This reduces the problem from quadratic comparisons to linear scanning after sorting.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(n)	Compares every pair of sessions
Optimal	O(n log n)	O(n)	Sorts sessions and compares adjacent rows

Algorithm Walkthrough

First, group sessions by both user_id and session_type.

This is necessary because the problem only allows comparisons between sessions of the same type belonging to the same user. 2. Within each group, sort sessions by session_start.

Sorting ensures sessions are processed chronologically. Once sorted, nearby sessions are the only candidates likely to satisfy the 12-hour condition. 3. For each session, compare it with the next chronological session.

We calculate:

next.session_start - current.session_end

If this difference is less than or equal to 12 hours, then the user qualifies. 4. Store qualifying users in a result set.

A set prevents duplicate user IDs if multiple valid pairs exist. 5. Return the result ordered by user_id.

Why it works

After sorting sessions chronologically within each (user_id, session_type) group, any valid close pair must appear among neighboring sessions. If a session does not satisfy the condition with its nearest following session, then sessions farther away will only produce larger gaps. Therefore, checking adjacent sessions is sufficient to detect every valid user.

Python Solution

from typing import List
import pandas as pd

def user_activities(sessions: pd.DataFrame) -> pd.DataFrame:
    # Sort sessions by user, type, and start time
    sessions = sessions.sort_values(
        by=["user_id", "session_type", "session_start"]
    )

    # Get next session start within each user/type group
    sessions["next_start"] = (
        sessions.groupby(["user_id", "session_type"])["session_start"]

		if sessions[i].SessionType != sessions[j].SessionType {
			return sessions[i].SessionType < sessions[j].SessionType
		}

		return sessions[i].SessionStart.Before(sessions[j].SessionStart)
	})

	resultSet := make(map[int]bool)

	for i := 0; i < len(sessions)-1; i++ {
		current := sessions[i]
		next := sessions[i+1]

		if current.UserID != next.UserID {
			continue
		}

		if current.SessionType != next.SessionType {
			continue
		}

		gap := next.SessionStart.Sub(current.SessionEnd)

		if gap.Hours() <= 12 {
			resultSet[current.UserID] = true
		}
	}

	result := make([]int, 0, len(resultSet))

	for userID := range resultSet {
		result = append(result, userID)
	}

	sort.Ints(result)

	return result
}

The Go implementation follows the same logical structure as the Python solution.

The primary difference is that Go does not provide built-in dataframe or window-function style operations. Instead, we manually sort the slice and scan adjacent elements.

Go’s time.Time and Duration APIs make time arithmetic straightforward. The expression:

next.SessionStart.Sub(current.SessionEnd)

returns a Duration, and .Hours() converts it into hours.

A map[int]bool is used to simulate a set for deduplicating qualifying users.

Worked Examples

Example 1

Input:

user_id	start	end	type
101	2023-11-01 08:00	09:00	Viewer
101	2023-11-01 10:00	11:00	Streamer
101	2023-11-02 09:00	10:00	Viewer

User 101, Viewer Sessions

After sorting:

Session	Start	End
1	Nov 1 08:00	Nov 1 09:00
5	Nov 2 09:00	Nov 2 10:00

Gap:

Nov 2 09:00 - Nov 1 09:00 = 24 hours

This exceeds 12 hours, so user 101 does not qualify.

User 102, Viewer Sessions

Session	Start	End
3	Nov 1 13:00	Nov 1 14:00
4	Nov 1 15:00	Nov 1 16:00
8	Nov 2 16:00	Nov 2 17:00

Compare sessions 3 and 4:

15:00 - 14:00 = 1 hour

This satisfies the condition, so user 102 is included.

User 103, Viewer Sessions

Session	Start	End
9	Nov 1 08:00	Nov 1 09:00
10	Nov 2 20:00	Nov 2 23:00
11	Nov 3 09:00	Nov 3 10:00

Compare sessions 10 and 11:

Nov 3 09:00 - Nov 2 23:00 = 10 hours

This satisfies the condition, so user 103 is included.

Final output:

| user_id |

|---|---|

| 102 |

| 103 |

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Sorting dominates the runtime
Space	O(n)	Extra storage for shifted values or result structures

The expensive operation is sorting the sessions chronologically within each user/type grouping. After sorting, only a single linear scan is required to compare adjacent sessions.

The auxiliary space comes from storing intermediate structures such as shifted columns in pandas or hash sets in Go.

Test Cases

from typing import List
import pandas as pd

def run_solution(df):
    return user_activities(df)

# Example case
df1 = pd.DataFrame([
    [101, "2023-11-01 08:00:00", "2023-11-01 09:00:00", 1, "Viewer"],
    [101, "2023-11-01 10:00:00", "2023-11-01 11:00:00", 2, "Streamer"],
    [102, "2023-11-01 13:00:00", "2023-11-01 14:00:00", 3, "Viewer"],
    [102, "2023-11-01 15:00:00", "2023-11-01 16:00:00", 4, "Viewer"],
], columns=[
    "user_id",
    "session_start",
    "session_end",
    "session_id",
    "session_type"
])

df1["session_start"] = pd.to_datetime(df1["session_start"])
df1["session_end"] = pd.to_datetime(df1["session_end"])

result1 = run_solution(df1)
assert result1["user_id"].tolist() == [102]  # basic valid pair

# Exactly 12 hours gap
df2 = pd.DataFrame([
    [1, "2023-01-01 00:00:00", "2023-01-01 01:00:00", 1, "Viewer"],
    [1, "2023-01-01 13:00:00", "2023-01-01 14:00:00", 2, "Viewer"],
], columns=[
    "user_id",
    "session_start",
    "session_end",
    "session_id",
    "session_type"
])

df2["session_start"] = pd.to_datetime(df2["session_start"])
df2["session_end"] = pd.to_datetime(df2["session_end"])

result2 = run_solution(df2)
assert result2["user_id"].tolist() == [1]  # boundary condition

# More than 12 hours
df3 = pd.DataFrame([
    [1, "2023-01-01 00:00:00", "2023-01-01 01:00:00", 1, "Viewer"],
    [1, "2023-01-01 14:00:01", "2023-01-01 15:00:00", 2, "Viewer"],
], columns=[
    "user_id",
    "session_start",
    "session_end",
    "session_id",
    "session_type"
])

df3["session_start"] = pd.to_datetime(df3["session_start"])
df3["session_end"] = pd.to_datetime(df3["session_end"])

result3 = run_solution(df3)
assert result3.empty  # just beyond boundary

# Different session types
df4 = pd.DataFrame([
    [1, "2023-01-01 00:00:00", "2023-01-01 01:00:00", 1, "Viewer"],
    [1, "2023-01-01 02:00:00", "2023-01-01 03:00:00", 2, "Streamer"],
], columns=[
    "user_id",
    "session_start",
    "session_end",
    "session_id",
    "session_type"
])

df4["session_start"] = pd.to_datetime(df4["session_start"])
df4["session_end"] = pd.to_datetime(df4["session_end"])

result4 = run_solution(df4)
assert result4.empty  # same user but different types

# Multiple qualifying pairs
df5 = pd.DataFrame([
    [1, "2023-01-01 00:00:00", "2023-01-01 01:00:00", 1, "Viewer"],
    [1, "2023-01-01 02:00:00", "2023-01-01 03:00:00", 2, "Viewer"],
    [1, "2023-01-01 04:00:00", "2023-01-01 05:00:00", 3, "Viewer"],
], columns=[
    "user_id",
    "session_start",
    "session_end",
    "session_id",
    "session_type"
])

df5["session_start"] = pd.to_datetime(df5["session_start"])
df5["session_end"] = pd.to_datetime(df5["session_end"])

result5 = run_solution(df5)
assert result5["user_id"].tolist() == [1]  # deduplication check

Test Summary

Test	Why
Basic valid pair	Verifies normal successful detection
Exactly 12 hours	Confirms boundary is inclusive
More than 12 hours	Verifies strict rejection beyond limit
Different session types	Ensures types are not mixed
Multiple qualifying pairs	Confirms duplicate users are removed

Edge Cases

Exactly 12 Hours Between Sessions

A very common bug is using a strict < 12 comparison instead of <= 12.

The problem explicitly says the maximum allowed gap is 12 hours, meaning exactly 12 hours must qualify. The implementation correctly uses:

gap_hours <= 12

so boundary cases are handled properly.

Sessions of Different Types

A user may have sessions very close together but with different session types. These must not count.

For example:

Viewer at 10:00
Streamer at 11:00

Even though the time gap is small, the types differ. The algorithm avoids this mistake by partitioning sessions using both:

(user_id, session_type)

This guarantees Viewer and Streamer sessions are processed independently.

Unsorted Input Data

The input rows are not guaranteed to arrive chronologically.

A naive implementation that only checks adjacent rows in the original order could miss valid pairs entirely. The solution explicitly sorts sessions before processing:

sort_values(by=["user_id", "session_type", "session_start"])

This ensures chronological comparisons are always correct.

Multiple Valid Pairs for the Same User

A user may satisfy the condition many times. The output should still contain the user only once.

The implementation handles this using:

drop_duplicates()

in Python and a map-based set in Go. This guarantees each qualifying user appears exactly once in the result.