LeetCode 3672 - Sum of Weighted Modes in Subarrays

The problem asks us to calculate the sum of weighted modes for all subarrays of length k in a given array nums. Each subarray is contiguous and has exactly k elements. For a subarray, the mode is the element that occurs most frequently.

LeetCode Problem 3672

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Sliding Window, Counting, Ordered Set

Solution

Problem Understanding

The problem asks us to calculate the sum of weighted modes for all subarrays of length k in a given array nums. Each subarray is contiguous and has exactly k elements. For a subarray, the mode is the element that occurs most frequently. If multiple elements share the same highest frequency, the smallest element is chosen as the mode. The weight of a subarray is defined as the product of the mode and its frequency in that subarray. The goal is to compute the sum of all subarray weights.

The input is an integer array nums of length up to 10^5 and an integer k such that 1 <= k <= len(nums). Each element in nums can be as large as 10^5. These constraints imply that a naive approach that enumerates all subarrays and counts frequencies from scratch will be too slow, because it would require O(n * k) operations.

Important edge cases include subarrays where all elements are unique (every element is a mode with frequency 1), subarrays where multiple elements have the same maximum frequency (smallest element must be chosen), and the case where k = 1 (each element is its own mode).

Approaches

The brute-force approach involves iterating through all subarrays of length k. For each subarray, we count the frequency of each element, find the mode according to the tie-breaking rule, and compute its weight. Finally, we sum the weights. This approach is correct but inefficient because counting frequencies for each subarray individually leads to O(n * k) time complexity, which is too slow for the given constraints.

The optimal approach relies on a sliding window with frequency tracking. As we slide the window by one element at a time, we update the frequencies of the element leaving the window and the element entering the window. To efficiently determine the mode at any time, we maintain a map from frequency to an ordered set of elements with that frequency. This allows quick access to the current maximum frequency and the smallest element among those with that frequency. This reduces the time complexity to roughly O(n log k) if we use a balanced BST or ordered map to maintain the frequency sets, because updates and queries are logarithmic in the number of unique elements in the current window.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * k) O(k) Count frequencies for each subarray individually. Too slow for n = 10^5.
Optimal O(n log k) O(k) Sliding window with frequency map and ordered sets to track the mode efficiently.

Algorithm Walkthrough

  1. Initialize a frequency map freq to count occurrences of each element in the current window.
  2. Initialize another map freq_set that maps each frequency to a set of elements that occur with that frequency. This allows efficient retrieval of the smallest element among those with maximum frequency.
  3. Slide a window of length k across nums:
  • For the incoming element at the right end, increment its count in freq and update freq_set accordingly.
  • For the outgoing element at the left end (once the window exceeds size k), decrement its count in freq and update freq_set accordingly.
  1. At each step after adjusting the window, find the maximum frequency from freq_set and pick the smallest element from the corresponding set. Multiply the mode by its frequency to get the weight.
  2. Accumulate the weights into a running total.
  3. Return the total sum after processing all windows.

Why it works: The sliding window ensures each subarray of length k is considered exactly once. The frequency map and reverse frequency-to-elements map maintain the current mode efficiently, respecting the tie-breaking rule. This guarantees that the correct weight is calculated for each subarray.

Python Solution

from collections import defaultdict
from sortedcontainers import SortedSet
from typing import List

class Solution:
    def modeWeight(self, nums: List[int], k: int) -> int:
        freq = defaultdict(int)
        freq_set = defaultdict(SortedSet)
        total = 0
        
        for i, num in enumerate(nums):
            # Add new element
            old_count = freq[num]
            if old_count > 0:
                freq_set[old_count].discard(num)
            freq[num] += 1
            freq_set[freq[num]].add(num)
            
            # Remove element outside window
            if i >= k:
                out_num = nums[i - k]
                count_out = freq[out_num]
                freq_set[count_out].discard(out_num)
                freq[out_num] -= 1
                if freq[out_num] > 0:
                    freq_set[freq[out_num]].add(out_num)
            
            # Calculate weight when window is ready
            if i >= k - 1:
                max_freq = max(freq_set.keys())
                mode = freq_set[max_freq][0]  # smallest element with max frequency
                total += mode * max_freq
        
        return total

Implementation Walkthrough: We use freq to count occurrences and freq_set (a SortedSet) to quickly retrieve the smallest element with maximum frequency. When the window slides, we update both structures. After the window reaches size k, we compute the mode weight and accumulate it. The SortedSet ensures the tie-breaking rule is respected efficiently.

Go Solution

package main

import (
	"container/heap"
)

func modeWeight(nums []int, k int) int64 {
	type Element struct {
		val, freq int
	}
	
	n := len(nums)
	freq := make(map[int]int)
	freqSet := make(map[int]map[int]struct{})
	var total int64

	add := func(num int) {
		old := freq[num]
		if old > 0 {
			delete(freqSet[old], num)
			if len(freqSet[old]) == 0 {
				delete(freqSet, old)
			}
		}
		freq[num]++
		newF := freq[num]
		if _, ok := freqSet[newF]; !ok {
			freqSet[newF] = make(map[int]struct{})
		}
		freqSet[newF][num] = struct{}{}
	}

	remove := func(num int) {
		old := freq[num]
		delete(freqSet[old], num)
		if len(freqSet[old]) == 0 {
			delete(freqSet, old)
		}
		freq[num]--
		if freq[num] > 0 {
			if _, ok := freqSet[freq[num]]; !ok {
				freqSet[freq[num]] = make(map[int]struct{})
			}
			freqSet[freq[num]][num] = struct{}{}
		}
	}

	minElement := func(m map[int]struct{}) int {
		min := int(1e9)
		for val := range m {
			if val < min {
				min = val
			}
		}
		return min
	}

	for i := 0; i < n; i++ {
		add(nums[i])
		if i >= k {
			remove(nums[i-k])
		}
		if i >= k-1 {
			maxFreq := 0
			for f := range freqSet {
				if f > maxFreq {
					maxFreq = f
				}
			}
			mode := minElement(freqSet[maxFreq])
			total += int64(mode * maxFreq)
		}
	}

	return total
}

Go-specific Notes: Go does not have a native SortedSet. We use a map[int]struct{} for each frequency and manually find the smallest element when needed. Integer overflow is handled by using int64 for the total sum.

Worked Examples

Example 1: nums = [1,2,2,3], k = 3

Window freq map freq_set mode max_freq weight
[1,2,2] {1:1,2:2} {1:{1},2:{2}} 2 2 4
[2,2,3] {2:2,3:1} {2:{2},1:{3}} 2 2 4

Sum = 8

Example 2: nums = [1,2,1,2], k = 2

Window freq map freq_set mode max_freq weight
[1,2] {1:1,2:1} {1:{1,2}} 1 1 1
[2,1] {2:1,1:1} {1:{1,2}} 1 1 1
[1,2] {1:1,2:1} {1:{1,2}} 1 1 1

Sum = 3

Example 3: nums = [4,3,4,3], k = 3

Window freq map freq_set mode max_freq weight
[4,3,4] {4:2,3:1} {2:{4},1:{3}} 4 2 8