LeetCode 1086 - High Five

The problem asks us to compute the top five average score for each student given a list of [ID, score] pairs. Each ID represents a unique student, and score represents a single score they received.

LeetCode Problem 1086

Difficulty: 🟒 Easy
Topics: Array, Hash Table, Sorting, Heap (Priority Queue)

Solution

Problem Understanding

The problem asks us to compute the top five average score for each student given a list of [ID, score] pairs. Each ID represents a unique student, and score represents a single score they received. The goal is to return a list of [ID, topFiveAverage] pairs, where topFiveAverage is calculated by taking the sum of the student’s five highest scores and performing integer division by 5. The final output must be sorted by student ID in ascending order.

The input constraints are fairly manageable: there are at most 1000 score entries and student IDs range between 1 and 1000. Each student has at least five scores, so we do not need to handle cases with fewer than five scores. Key edge cases include students having exactly five scores, students with multiple identical top scores, and scores in arbitrary order.

The problem guarantees that integer division is applied, meaning that fractional averages are truncated. This detail is critical to get the correct output. Another subtlety is that scores might be added in any order, so sorting or maintaining a priority queue of top scores will be necessary.

Approaches

Brute Force Approach

The simplest approach is to group all scores by student ID, sort each group in descending order, pick the top five scores, sum them, and divide by five. This works because sorting ensures we always select the five largest scores. However, this requires sorting for every student, which is unnecessary when we only care about the top five scores. For n total scores, this leads to time complexity O(n log n) per student in the worst case.

Optimal Approach

The key observation is that we do not need to sort all scores for each student. We only need the top five scores. A min-heap (priority queue) of size 5 per student is sufficient. For each score of a student, we push it into the heap. If the heap exceeds size 5, we pop the smallest element. This ensures the heap always contains the top five scores. Once all scores are processed, we sum the heap and perform integer division by five to get the top five average.

This approach avoids unnecessary sorting and keeps the time complexity linear with respect to the number of scores while maintaining a fixed-size heap for each student.

Approach Time Complexity Space Complexity Notes
Brute Force O(n log n) O(n) Group all scores, sort each group, take top five
Optimal O(n log 5) β‰ˆ O(n) O(s) Maintain a min-heap of size 5 per student, s = number of unique students

Algorithm Walkthrough

  1. Initialize a dictionary (hash map) student_scores to store a min-heap for each student ID.

  2. Iterate through each [ID, score] pair in the input items.

  3. For each ID, push the score into student_scores[ID]. If the heap size exceeds 5, pop the smallest score to ensure only the top five scores remain.

  4. After processing all scores, initialize an empty list result.

  5. For each student ID in sorted order:

  6. Take the heap of scores for that student.

  7. Sum all elements in the heap and divide by 5 using integer division.

  8. Append [ID, topFiveAverage] to result.

  9. Return result.

Why it works: By maintaining a min-heap of size 5, we guarantee that only the five largest scores are kept for each student. Sorting the keys at the end ensures the output is in ascending ID order. Integer division is applied consistently to meet the problem specification.

Python Solution

from typing import List
import heapq

class Solution:
    def highFive(self, items: List[List[int]]) -> List[List[int]]:
        student_scores = {}
        
        for ID, score in items:
            if ID not in student_scores:
                student_scores[ID] = []
            heapq.heappush(student_scores[ID], score)
            if len(student_scores[ID]) > 5:
                heapq.heappop(student_scores[ID])
        
        result = []
        for ID in sorted(student_scores):
            top_five_avg = sum(student_scores[ID]) // 5
            result.append([ID, top_five_avg])
        
        return result

The code first creates a dictionary where each student ID maps to a min-heap of scores. For each score, it maintains only the top five. Sorting IDs ensures correct output order. Integer division is applied when computing the average.

Go Solution

package main

import (
    "container/heap"
    "sort"
)

type MinHeap []int

func (h MinHeap) Len() int           { return len(h) }
func (h MinHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h MinHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *MinHeap) Push(x interface{}) {
    *h = append(*h, x.(int))
}

func (h *MinHeap) Pop() interface{} {
    old := *h
    n := len(old)
    x := old[n-1]
    *h = old[0 : n-1]
    return x
}

func highFive(items [][]int) [][]int {
    studentScores := make(map[int]*MinHeap)
    
    for _, item := range items {
        ID, score := item[0], item[1]
        if _, exists := studentScores[ID]; !exists {
            h := &MinHeap{}
            heap.Init(h)
            studentScores[ID] = h
        }
        heap.Push(studentScores[ID], score)
        if studentScores[ID].Len() > 5 {
            heap.Pop(studentScores[ID])
        }
    }
    
    ids := make([]int, 0, len(studentScores))
    for id := range studentScores {
        ids = append(ids, id)
    }
    sort.Ints(ids)
    
    result := make([][]int, 0, len(ids))
    for _, id := range ids {
        sum := 0
        for _, score := range *studentScores[id] {
            sum += score
        }
        result = append(result, []int{id, sum / 5})
    }
    
    return result
}

Go handles heaps using the container/heap package. The approach is similar to Python, using a min-heap per student to track the top five scores and summing them at the end. Sorting is done with sort.Ints.

Worked Examples

Example 1

Input: [[1,91],[1,92],[2,93],[2,97],[1,60],[2,77],[1,65],[1,87],[1,100],[2,100],[2,76]]

Step-by-step heap states:

ID Heap State After Each Insert
1 [91] β†’ [91,92] β†’ [91,92,60] β†’ [91,92,60,65] β†’ [91,92,60,65,87] β†’ [92,87,91,65,100]
2 [93] β†’ [93,97] β†’ [93,97,77] β†’ [93,97,77,100] β†’ [93,97,77,100,76]

Top five sums: 1: 100+92+91+87+65=435, 2: 100+97+93+77+76=443.

Integer division by 5: 1: 87, 2: 88.

Output: [[1,87],[2,88]]

Example 2

Input: [[1,100],[7,100],[1,100],[7,100],[1,100],[7,100],[1,100],[7,100],[1,100],[7,100]]

Each student only has five scores, all 100. Average is 100.

Output: [[1,100],[7,100]]

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each score is pushed/popped in a heap of max size 5, O(log 5) β‰ˆ O(1) per operation. Sorting unique IDs costs O(s log s), s ≀ n
Space O(s) Storing a min-heap of size 5 per student, s = number of unique students

The algorithm is efficient because we never sort more than 5 scores per student and only maintain fixed-size heaps.

Test Cases

# Provided examples
assert Solution().highFive([[1,91],[1,92],[2,93],[2,97],[1,60],[2,77],[1,65],[1,87],[1,100],[2,100],[2,76]]) == [[1,87],[2,88]] # mixed scores
assert Solution().highFive([[1,100],[7,100],[1,100],[7,100],[1,100],[7,100],[1,100],[7,100],[1,100],[7,100]]) == [[1,100],[7,100]] # all same scores

# Boundary cases
assert Solution().highFive([[1,0],[1,0],[1,0],[1,0],[1,0]]) == [[1,0]] # lowest scores
assert Solution().highFive([[1,