LeetCode 1709 - Biggest Window Between Visits

The UserVisits table stores records of when users visited a retailer. Each row contains a userid and a visitdate. A user may appear multiple times because they can visit on many different days.

LeetCode Problem 1709

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

The UserVisits table stores records of when users visited a retailer. Each row contains a user_id and a visit_date. A user may appear multiple times because they can visit on many different days.

The goal is to compute, for every user, the largest number of days between consecutive visits. For the final visit of a user, the comparison should be made against the fixed date '2021-01-01'.

In other words, for each user:

  1. Sort all visit dates in chronological order.
  2. Compute the gap in days between each visit and the next visit.
  3. For the last visit, compute the gap between that visit and '2021-01-01'.
  4. Return the maximum gap for that user.

The output should contain:

  • user_id
  • biggest_window, which is the maximum number of days between consecutive visits or between the last visit and '2021-01-01'

The result must be ordered by user_id.

The table does not have a primary key, which means duplicate rows may exist. This is important because duplicate visit dates can produce a gap of 0 days, and the solution must still handle them correctly.

The problem is fundamentally about comparing each row with the next chronological row for the same user. This naturally suggests using SQL window functions, especially LEAD, which allows us to look ahead to the next row inside a partition.

Some important edge cases include:

  • A user with only one visit
  • Duplicate visit dates
  • Visits already close to '2021-01-01'
  • Multiple users with interleaved visit records
  • Unsorted input data

A naive solution might fail if it assumes the input is already ordered or if it forgets to compare the last visit against '2021-01-01'.

Approaches

Brute Force Approach

A brute force solution would process each user independently by:

  1. Extracting all visits for that user
  2. Sorting the visits by date
  3. Comparing every visit with the next one
  4. Tracking the maximum gap

This approach is logically correct because sorting guarantees that consecutive dates are compared in chronological order.

However, implementing this manually in SQL becomes cumbersome. One possibility would be using self joins to find the next visit date for every row. For each visit, we would search for the minimum later visit date belonging to the same user.

That introduces expensive repeated scans and joins, especially for large datasets.

Optimal Approach

The key insight is that SQL window functions are designed exactly for this type of row-to-row comparison.

Using LEAD(visit_date) allows us to access the next visit date for the same user without needing complicated joins.

The process becomes:

  1. Partition rows by user_id
  2. Order each partition by visit_date
  3. Use LEAD to fetch the next visit
  4. If no next visit exists, substitute '2021-01-01'
  5. Compute date differences
  6. Take the maximum difference for each user

This approach is both cleaner and more efficient.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Self joins repeatedly search for the next visit
Optimal O(n log n) O(n) Uses sorting plus window functions

Algorithm Walkthrough

  1. First, partition all rows by user_id.

This ensures that each user's visits are processed independently. Visits from different users must never be compared with each other. 2. Within each user partition, sort rows by visit_date.

Chronological ordering is necessary because we only care about gaps between consecutive visits in time order. 3. Use the LEAD() window function to retrieve the next visit date.

For every row, LEAD(visit_date) returns the next visit date belonging to the same user. 4. Handle the final visit for each user.

The last visit has no next row, so LEAD() returns NULL. In that case, replace the missing value with '2021-01-01'. 5. Compute the day difference.

Use DATEDIFF(next_date, visit_date) to calculate the number of days between visits. 6. Group by user_id and compute the maximum gap.

Since the problem asks for the biggest window, we take the maximum difference for each user. 7. Order the final result by user_id.

Why it works

The algorithm works because every visit is compared exactly once with its immediate chronological successor. Since the visits are sorted, every possible valid window is examined. Taking the maximum of those windows guarantees that the largest gap is returned for each user.

Python Solution

Although this is a Database problem and the intended solution is SQL, the following Python implementation demonstrates the same logic programmatically.

from collections import defaultdict
from datetime import datetime
from typing import List, Dict

class Solution:
    def biggestWindow(self, userVisits: List[List[str]]) -> List[List[int]]:
        visits_by_user = defaultdict(list)

        for user_id, visit_date in userVisits:
            visits_by_user[user_id].append(
                datetime.strptime(visit_date, "%Y-%m-%d")
            )

        today = datetime.strptime("2021-01-01", "%Y-%m-%d")

        result = []

        for user_id in sorted(visits_by_user.keys()):
            visits = sorted(visits_by_user[user_id])

            max_gap = 0

            for i in range(len(visits) - 1):
                gap = (visits[i + 1] - visits[i]).days
                max_gap = max(max_gap, gap)

            last_gap = (today - visits[-1]).days
            max_gap = max(max_gap, last_gap)

            result.append([user_id, max_gap])

        return result

The implementation begins by grouping visit dates by user. A defaultdict(list) is used because it allows visits to be appended efficiently without checking whether the key already exists.

Each visit date string is converted into a datetime object so that date subtraction can be performed directly.

After grouping, the algorithm processes users in sorted order to satisfy the required output ordering.

For each user:

  • The visit dates are sorted chronologically.
  • Consecutive visits are compared to compute gaps.
  • The last visit is compared against '2021-01-01'.
  • The maximum gap is stored.

Finally, the result list is returned.

Go Solution

package main

import (
	"sort"
	"time"
)

type Solution struct{}

func (s Solution) biggestWindow(userVisits [][]string) [][]int {
	visitsByUser := make(map[int][]time.Time)

	for _, visit := range userVisits {
		userID := 0
		for _, ch := range visit[0] {
			userID = userID*10 + int(ch-'0')
		}

		date, _ := time.Parse("2006-01-02", visit[1])
		visitsByUser[userID] = append(visitsByUser[userID], date)
	}

	today, _ := time.Parse("2006-01-02", "2021-01-01")

	userIDs := make([]int, 0, len(visitsByUser))
	for userID := range visitsByUser {
		userIDs = append(userIDs, userID)
	}

	sort.Ints(userIDs)

	result := make([][]int, 0)

	for _, userID := range userIDs {
		visits := visitsByUser[userID]

		sort.Slice(visits, func(i, j int) bool {
			return visits[i].Before(visits[j])
		})

		maxGap := 0

		for i := 0; i < len(visits)-1; i++ {
			gap := int(visits[i+1].Sub(visits[i]).Hours() / 24)

			if gap > maxGap {
				maxGap = gap
			}
		}

		lastGap := int(today.Sub(visits[len(visits)-1]).Hours() / 24)

		if lastGap > maxGap {
			maxGap = lastGap
		}

		result = append(result, []int{userID, maxGap})
	}

	return result
}

The Go implementation follows the same algorithmic structure as the Python version.

One important difference is date handling. Go uses the time package, and dates are parsed using the reference format "2006-01-02".

Sorting is performed using sort.Slice, which allows custom comparison logic for time.Time values.

Gap calculations are done by subtracting timestamps and converting the duration from hours into days.

Go does not have Python's dynamic dictionaries or tuple unpacking, so explicit map management and slice operations are required.

SQL Solution

SELECT
    user_id,
    MAX(
        DATEDIFF(
            next_visit,
            visit_date
        )
    ) AS biggest_window
FROM (
    SELECT
        user_id,
        visit_date,
        IFNULL(
            LEAD(visit_date) OVER (
                PARTITION BY user_id
                ORDER BY visit_date
            ),
            '2021-01-01'
        ) AS next_visit
    FROM UserVisits
) AS visits
GROUP BY user_id
ORDER BY user_id;

The inner query uses LEAD() to retrieve the next visit date for each user.

If no next visit exists, IFNULL() substitutes '2021-01-01'.

The outer query computes date differences and selects the maximum value for every user.

Worked Examples

Example 1

Input:

+---------+------------+
| user_id | visit_date |
+---------+------------+
| 1       | 2020-11-28 |
| 1       | 2020-10-20 |
| 1       | 2020-12-03 |
| 2       | 2020-10-05 |
| 2       | 2020-12-09 |
| 3       | 2020-11-11 |
+---------+------------+

User 1

Sorted visits:

Index Visit Date
0 2020-10-20
1 2020-11-28
2 2020-12-03

Gap calculations:

Current Visit Next Visit Gap
2020-10-20 2020-11-28 39
2020-11-28 2020-12-03 5
2020-12-03 2021-01-01 29

Maximum gap = 39

User 2

Sorted visits:

Index Visit Date
0 2020-10-05
1 2020-12-09

Gap calculations:

Current Visit Next Visit Gap
2020-10-05 2020-12-09 65
2020-12-09 2021-01-01 23

Maximum gap = 65

User 3

Sorted visits:

Index Visit Date
0 2020-11-11

Gap calculations:

Current Visit Next Visit Gap
2020-11-11 2021-01-01 51

Maximum gap = 51

Final output:

+---------+---------------+
| user_id | biggest_window|
+---------+---------------+
| 1       | 39            |
| 2       | 65            |
| 3       | 51            |
+---------+---------------+

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Sorting dominates the runtime
Space O(n) Extra storage is used for grouped visits

The algorithm must sort visit dates for each user. Across all users, this results in an overall complexity of O(n log n). Window functions in SQL internally rely on similar sorting operations.

The space complexity is O(n) because all visits may need to be stored or buffered during processing.

Test Cases

from collections import defaultdict
from datetime import datetime

def biggest_window(user_visits):
    visits_by_user = defaultdict(list)

    for user_id, visit_date in user_visits:
        visits_by_user[user_id].append(
            datetime.strptime(visit_date, "%Y-%m-%d")
        )

    today = datetime.strptime("2021-01-01", "%Y-%m-%d")

    result = []

    for user_id in sorted(visits_by_user.keys()):
        visits = sorted(visits_by_user[user_id])

        max_gap = 0

        for i in range(len(visits) - 1):
            max_gap = max(
                max_gap,
                (visits[i + 1] - visits[i]).days
            )

        max_gap = max(
            max_gap,
            (today - visits[-1]).days
        )

        result.append([user_id, max_gap])

    return result

# Example case from problem statement
assert biggest_window([
    [1, "2020-11-28"],
    [1, "2020-10-20"],
    [1, "2020-12-03"],
    [2, "2020-10-05"],
    [2, "2020-12-09"],
    [3, "2020-11-11"]
]) == [[1, 39], [2, 65], [3, 51]]

# Single user with one visit
assert biggest_window([
    [1, "2020-12-31"]
]) == [[1, 1]]

# Duplicate visit dates
assert biggest_window([
    [1, "2020-12-01"],
    [1, "2020-12-01"]
]) == [[1, 31]]

# Visits already sorted
assert biggest_window([
    [1, "2020-01-01"],
    [1, "2020-06-01"],
    [1, "2020-12-01"]
]) == [[1, 152]]

# Multiple users with interleaved visits
assert biggest_window([
    [2, "2020-11-01"],
    [1, "2020-10-01"],
    [2, "2020-12-01"],
    [1, "2020-12-15"]
]) == [[1, 75], [2, 30]]

# Consecutive daily visits
assert biggest_window([
    [1, "2020-12-28"],
    [1, "2020-12-29"],
    [1, "2020-12-30"]
]) == [[1, 2]]
Test Why
Problem example Validates the main expected behavior
Single visit Ensures last visit comparison works
Duplicate dates Ensures zero-length gaps are handled
Already sorted visits Confirms sorting logic does not break ordered input
Interleaved users Verifies user partitioning correctness
Consecutive visits Tests very small gaps

Edge Cases

User With Only One Visit

A user may appear exactly once in the table. In this situation, there are no consecutive visit pairs. The only valid window is between the single visit date and '2021-01-01'.

This case can easily cause bugs if the implementation assumes that every row has a next visit. The solution handles this correctly by replacing missing LEAD() results with '2021-01-01'.

Duplicate Visit Dates

Since the table has no primary key, duplicate rows may exist. Two visits may occur on the exact same date, producing a window size of 0.

A naive implementation might incorrectly remove duplicates or assume strictly increasing dates. The solution keeps duplicates intact, and DATEDIFF() naturally returns 0 when dates are equal.

Unsorted Input

The input rows are not guaranteed to be ordered. If visits are processed in insertion order instead of chronological order, the computed gaps will be incorrect.

The solution explicitly sorts visits by visit_date using ORDER BY inside the window function, guaranteeing that comparisons are always made chronologically.

Last Visit Handling

The final visit for each user has no next visit row. Forgetting to compare against '2021-01-01' would miss potentially large windows.

The solution handles this using IFNULL(LEAD(...), '2021-01-01'), ensuring that every visit participates in exactly one window calculation.