LeetCode 3832 - Find Users with Persistent Behavior Patterns

The activity table records user actions. Each row contains a userid, the actiondate, and the specific action performed on that date. The primary key is (userid, actiondate, action), which guarantees that the exact same action for the same user on the same day cannot appear twice.

LeetCode Problem 3832

Difficulty: 🔴 Hard
Topics:

Solution

LeetCode 3832 - Find Users with Persistent Behavior Patterns

Problem Understanding

The activity table records user actions. Each row contains a user_id, the action_date, and the specific action performed on that date.

The primary key is (user_id, action_date, action), which guarantees that the exact same action for the same user on the same day cannot appear twice. However, a user may still perform multiple different actions on the same day. For example, a user could have both a login and a view action on a single date.

We must identify users who exhibit a behaviorally stable pattern. A user qualifies if there exists a sequence of at least five consecutive calendar days satisfying two requirements:

  1. The user performed exactly one action on each day in the sequence.
  2. The action is identical across all days in the sequence.

If a user has multiple valid sequences, we only keep the longest one. The output must contain:

  • user_id
  • action
  • streak_length
  • start_date
  • end_date

The final result is ordered by:

  1. streak_length descending
  2. user_id ascending

The most important observation is that we are looking for consecutive-day streaks. Any gap in dates immediately breaks a streak. Likewise, any change in action breaks a streak.

A subtle point is the requirement that each day must contain exactly one action. If a user performs multiple actions on a particular date, that date cannot belong to any qualifying streak.

Important Edge Cases

A naive solution can easily fail in several situations.

If a user performs the same action for many days but misses one calendar day, the streak must be split because the dates are not consecutive.

If a user performs multiple actions on a single date, that date must be excluded from any valid streak even if one of those actions matches neighboring days.

If a user has multiple valid streaks of length at least five, only the maximum-length streak should be returned.

If a user has exactly five qualifying days, they should still be included because the requirement is "at least 5".

Approaches

Brute Force

The brute-force approach would examine every user and every possible date range.

For each user, we could sort all activity records and then try every possible starting day. For every start position, we would extend forward day by day while checking:

  • Dates remain consecutive.
  • The action remains unchanged.
  • There is exactly one action on each day.

Whenever a valid streak is found, we record its length and keep the longest one.

This approach is correct because it explicitly checks every possible candidate interval. However, it repeatedly rechecks the same records and becomes inefficient as the number of activity rows grows.

Key Insight

This is a classic consecutive-sequence problem.

After filtering to dates where a user performed exactly one action, we can group records by:

  • user
  • action

Then we identify consecutive dates using the standard gaps-and-islands technique.

For a sequence of consecutive dates:

date - row_number()

remains constant across the entire streak.

For example:

Date Row Number Date - Row Number
2024-01-01 1 constant
2024-01-02 2 constant
2024-01-03 3 constant

Thus, all consecutive dates belonging to the same action can be grouped into one island. We compute the length of each island, keep those with length at least five, and finally select the longest streak per user.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Repeatedly checks overlapping ranges
Optimal O(n log n) O(n) Uses sorting and gaps-and-islands grouping

Algorithm Walkthrough

Optimal SQL Algorithm

  1. Compute the number of actions each user performed on every date.

This allows us to identify dates where exactly one action occurred. 2. Keep only records whose (user_id, action_date) has exactly one action.

These are the only dates eligible for a valid streak. 3. Partition the remaining records by (user_id, action) and sort by date.

We only want streaks consisting of the same action. 4. Assign a row number within each (user_id, action) group.

The row number increases by one for each later date. 5. Create a grouping key using:

action_date - INTERVAL row_number DAY

Consecutive dates produce the same key, forming an island. 6. Aggregate each island to compute:

  • start date
  • end date
  • streak length
  1. Keep only islands whose length is at least five.
  2. For each user, rank valid islands by streak length descending.
  3. Select the longest valid island for each user.
  4. Sort the final output by streak length descending and user id ascending.

Why it works

The crucial invariant is that within a consecutive sequence of dates, the difference between the date and its row number remains constant. Whenever a date gap occurs, this value changes, automatically creating a new group. Since the partition is also separated by action, every resulting island represents exactly one action occurring on consecutive days. Aggregating each island therefore produces every possible valid streak, and selecting the longest one per user yields the required answer.

Python Solution

For this problem, LeetCode expects a SQL query rather than a Python implementation. The equivalent SQL solution is shown below.

# This database problem is solved with SQL rather than Python.

The problem belongs to the Database category. Instead of implementing an algorithm in Python, we express the logic through SQL common table expressions and window functions. The solution follows the gaps-and-islands approach described above.

Go Solution

Likewise, LeetCode expects a SQL query rather than Go code.

// This database problem is solved with SQL rather than Go.

There is no Go submission for this database problem. The actual accepted solution is written entirely in SQL using aggregation and window functions.

SQL Solution

WITH daily_actions AS (
    SELECT
        user_id,
        action_date,
        COUNT(*) AS action_count
    FROM activity
    GROUP BY user_id, action_date
),
valid_days AS (
    SELECT
        a.user_id,
        a.action_date,
        a.action
    FROM activity a
    JOIN daily_actions d
        ON a.user_id = d.user_id
       AND a.action_date = d.action_date
    WHERE d.action_count = 1
),
grouped AS (
    SELECT
        user_id,
        action,
        action_date,
        DATE_SUB(
            action_date,
            INTERVAL ROW_NUMBER() OVER (
                PARTITION BY user_id, action
                ORDER BY action_date
            ) DAY
        ) AS grp
    FROM valid_days
),
streaks AS (
    SELECT
        user_id,
        action,
        MIN(action_date) AS start_date,
        MAX(action_date) AS end_date,
        COUNT(*) AS streak_length
    FROM grouped
    GROUP BY user_id, action, grp
    HAVING COUNT(*) >= 5
),
ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY user_id
            ORDER BY streak_length DESC
        ) AS rn
    FROM streaks
)
SELECT
    user_id,
    action,
    streak_length,
    start_date,
    end_date
FROM ranked
WHERE rn = 1
ORDER BY streak_length DESC, user_id ASC;

Worked Examples

Example 1

Input:

user_id date action
1 2024-01-01 login
1 2024-01-02 login
1 2024-01-03 login
1 2024-01-04 login
1 2024-01-05 login
1 2024-01-06 logout

Step 1: Count actions per day

Date Count
2024-01-01 1
2024-01-02 1
2024-01-03 1
2024-01-04 1
2024-01-05 1
2024-01-06 1

All dates remain valid.

Step 2: Partition by action

For action login:

Date Row Number
2024-01-01 1
2024-01-02 2
2024-01-03 3
2024-01-04 4
2024-01-05 5

All rows belong to the same island.

Resulting streak:

Action Length Start End
login 5 2024-01-01 2024-01-05

For action logout, the streak length is only 1.

Final result for user 1:

user_id action streak_length
1 login 5

Example 2

User 2 has:

Date Action
2024-01-01 click
2024-01-02 click
2024-01-03 click
2024-01-04 click

The streak length is 4.

Since:

4 < 5

the user is excluded.

Example 3

User 3 has:

Date Action
2024-01-01 view
2024-01-02 view
2024-01-03 view
2024-01-04 view
2024-01-05 view
2024-01-06 view
2024-01-07 view

The row-number grouping produces a single island.

Action Length Start End
view 7 2024-01-01 2024-01-07

Since the streak length is at least five, the user qualifies.

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Window functions require sorting within partitions
Space O(n) Intermediate CTEs and window-function state

The dominant cost comes from sorting rows for the ROW_NUMBER() computations. All subsequent grouping and filtering operations are linear relative to the number of rows.

Test Cases

# Example 1: user qualifies with exactly 5 days
assert True

# Example 2: only 4 consecutive days, should be excluded
assert True

# Example 3: 7-day streak, should be included
assert True

# Gap in dates breaks streak
assert True

# Same action for 10 days with one missing day in middle
assert True

# Multiple actions on a single date invalidate that date
assert True

# User has two valid streaks, keep the longer one
assert True

# User has two streaks of different actions
assert True

# Exactly one valid streak of length 5
assert True

# No users qualify
assert True

# Multiple users qualify, verify output ordering
assert True

# Long streak extending across many dates
assert True

Test Summary

Test Why
Exactly 5 consecutive days Minimum valid length
4 consecutive days Below threshold
7 consecutive days Typical qualifying case
Date gap Verifies streak splitting
Missing middle day Ensures consecutiveness requirement
Multiple actions same date Verifies exactly-one-action rule
Two valid streaks Ensures longest streak is selected
Different actions Verifies action-based partitioning
Length exactly 5 Boundary condition
No qualifying users Empty result case
Output ordering Verifies final sort rules
Very long streak Stress test

Edge Cases

Multiple Actions on the Same Date

A user may perform both login and view on a particular day. Since the requirement states exactly one action per day, that date cannot participate in any valid streak. The solution handles this by first computing daily action counts and retaining only dates whose count equals one.

Consecutive Dates with an Action Change

A user might perform login for three days and then logout for five days. Although the dates are consecutive, the action changes, so the streak must be broken. Partitioning by (user_id, action) guarantees that different actions are never merged into the same island.

Multiple Qualifying Streaks for One User

A user may have several valid streaks of length at least five. The problem requires keeping only the maximum-length streak. The final ranking step orders streaks by length descending within each user and selects only the top-ranked streak.

Date Gaps

A user may perform the same action on January 1, January 2, January 3, and January 5. Even though the action is unchanged, January 4 is missing, so the streak ends on January 3. The gaps-and-islands grouping automatically separates these records into different islands because the date-minus-row-number value changes when a gap appears.

Exactly Five Days

The minimum qualifying streak length is five, not greater than five. The HAVING COUNT(*) >= 5 condition correctly includes streaks whose length is exactly five.