LeetCode 1152 - Analyze User Website Visit Pattern

Here is a comprehensive, detailed technical solution guide for LeetCode 1152 - Analyze User Website Visit Pattern following your requested format exactly. The problem asks us to analyze user website visit sequences to determine the most common pattern of length three.

LeetCode Problem 1152

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Sorting

Solution

Here is a comprehensive, detailed technical solution guide for LeetCode 1152 - Analyze User Website Visit Pattern following your requested format exactly.

Problem Understanding

The problem asks us to analyze user website visit sequences to determine the most common pattern of length three. We are given three arrays of equal length: username, timestamp, and website. Each index i represents a visit by username[i] to website[i] at timestamp[i].

A pattern is any sequence of three websites, and the score of a pattern is the number of unique users who visited these three websites in that order, regardless of whether the visits are contiguous. For example, if a user visited "a", "b", "a", "c", the 3-sequence pattern ("a", "b", "a") counts for that user.

The task is to find the pattern with the largest score. If multiple patterns tie in score, we return the lexicographically smallest one.

Important constraints and observations: the input size is small (username.length <= 50), all tuples are unique, each user visits at least three websites, and usernames and websites are lowercase strings. These constraints allow us to consider all 3-sequence patterns per user without hitting performance issues.

Edge cases include users with repeated visits, patterns with repeated websites, and multiple users sharing the same pattern.

Approaches

Brute-Force Approach:

We could attempt to generate all possible patterns of length three across all visits without considering user sequences. This would be incorrect, because the order and per-user sequences matter. Furthermore, counting occurrences naively could double-count users or fail to respect the order of visits. This approach is conceptually simple but error-prone and unnecessary given the constraints.

Optimal Approach:

The key observation is that we can first sort all visits by timestamp, then group visits by user. For each user, we generate all 3-sequence patterns from their ordered visit history. We then track which users visited each pattern using a set to avoid duplicate counting per user. Finally, we compute the score for each pattern and select the one with the largest score, using lexicographic comparison to break ties.

The sorted timestamp ensures correct order for each user. Using a set per pattern guarantees that a user is only counted once for each pattern.

Approach Time Complexity Space Complexity Notes
Brute Force O(N^3) O(N^3) Generate all possible patterns globally, ignores user sequences, inefficient and incorrect
Optimal O(U * L^3) O(P + U*L) Sort visits by timestamp, generate all 3-sequences per user (L=visits per user), P=number of unique patterns, U=number of users

Algorithm Walkthrough

  1. Combine the username, timestamp, and website arrays into a list of tuples (timestamp, username, website) to facilitate sorting by timestamp.
  2. Sort the list by timestamp to ensure that each user's visits are in chronological order.
  3. Group the websites by user into a dictionary, where the key is username and the value is the ordered list of websites that user visited.
  4. Initialize a dictionary to store patterns as keys and sets of users as values. The set ensures that each user contributes at most once to the score of a pattern.
  5. For each user, generate all 3-sequence combinations from their ordered visits. Use a set to avoid duplicate sequences per user.
  6. Update the pattern dictionary: for each 3-sequence, add the current user to the corresponding set.
  7. Iterate over all patterns and find the one with the largest number of unique users (length of the set). In case of a tie, select the lexicographically smallest pattern.

Why it works:

The algorithm guarantees correct scores because each user is counted only once per pattern and the chronological order is preserved. Generating all 3-sequences per user ensures that all possible patterns are considered. Lexicographic comparison guarantees correct tie-breaking.

Python Solution

from collections import defaultdict
from itertools import combinations
from typing import List

class Solution:
    def mostVisitedPattern(self, username: List[str], timestamp: List[int], website: List[str]) -> List[str]:
        # Step 1: Combine and sort by timestamp
        visits = sorted(zip(timestamp, username, website))
        
        # Step 2: Group websites by user in chronological order
        user_websites = defaultdict(list)
        for _, user, site in visits:
            user_websites[user].append(site)
        
        # Step 3: Count patterns per user
        pattern_users = defaultdict(set)
        for user, sites in user_websites.items():
            # Generate all 3-sequences for the current user
            unique_patterns = set(combinations(sites, 3))
            for pattern in unique_patterns:
                pattern_users[pattern].add(user)
        
        # Step 4: Find the pattern with max users, break ties lexicographically
        max_count = 0
        result = tuple()
        for pattern, users in pattern_users.items():
            if len(users) > max_count or (len(users) == max_count and pattern < result):
                max_count = len(users)
                result = pattern
        
        return list(result)

Implementation Explanation:

We first sort visits by timestamp and group them by user to maintain order. combinations(sites, 3) generates all 3-length sequences while avoiding duplicates via a set. pattern_users maps each pattern to the set of users who visited it. Finally, we iterate over all patterns to find the highest-scoring one, using tuple comparison for lexicographic ordering.

Go Solution

package main

import (
	"sort"
)

func mostVisitedPattern(username []string, timestamp []int, website []string) []string {
	type visit struct {
		time int
		user string
		site string
	}
	n := len(username)
	visits := make([]visit, n)
	for i := 0; i < n; i++ {
		visits[i] = visit{timestamp[i], username[i], website[i]}
	}
	
	// Sort by timestamp
	sort.Slice(visits, func(i, j int) bool {
		return visits[i].time < visits[j].time
	})
	
	// Group websites per user
	userWebsites := make(map[string][]string)
	for _, v := range visits {
		userWebsites[v.user] = append(userWebsites[v.user], v.site)
	}
	
	// Count patterns per user
	patternUsers := make(map[[3]string]map[string]struct{})
	for user, sites := range userWebsites {
		uniquePatterns := make(map[[3]string]struct{})
		for i := 0; i < len(sites)-2; i++ {
			for j := i + 1; j < len(sites)-1; j++ {
				for k := j + 1; k < len(sites); k++ {
					p := [3]string{sites[i], sites[j], sites[k]}
					uniquePatterns[p] = struct{}{}
				}
			}
		}
		for pattern := range uniquePatterns {
			if _, ok := patternUsers[pattern]; !ok {
				patternUsers[pattern] = make(map[string]struct{})
			}
			patternUsers[pattern][user] = struct{}{}
		}
	}
	
	// Find pattern with max users
	var result [3]string
	maxCount := 0
	for pattern, users := range patternUsers {
		count := len(users)
		if count > maxCount {
			maxCount = count
			result = pattern
		} else if count == maxCount {
			for i := 0; i < 3; i++ {
				if pattern[i] < result[i] {
					result = pattern
					break
				} else if pattern[i] > result[i] {
					break
				}
			}
		}
	}
	return []string{result[0], result[1], result[2]}
}

Go-Specific Notes:

Go lacks a built-in combination generator, so we use nested loops for 3-sequence generation. Sets are implemented using maps with empty structs. Lexicographic comparison is done element-wise on the [3]string array.

Worked Examples

Example 1:

username = ["joe","joe","joe","james","james","james","james","mary","mary","mary"]
timestamp = [1,2,3,4,5,6,7,8,9,10]
website = ["home","about","career","home","cart","maps","home","home","about","career"]
  1. Sort by timestamp → same order.
  2. Group by user:

joe → ["home","about","career"]

james → ["home","cart","maps","home"]

mary → ["home","about","career"] 3. Generate 3-sequences per user:

joe → {("home","about","career")}

james → {("home","cart","maps"), ("home","cart","home"), ("home","maps","home")}

mary → {("home","about","career")} 4. Count users per pattern:

("home","about","career") → {joe,mary} → 2 users

others → 1 user 5. Max score pattern → ("home","about","career")