LeetCode 373 - Find K Pairs with Smallest Sums

Difficulty: 🟡 Medium
Topics: Array, Heap (Priority Queue)

Solution

Problem Understanding

The problem gives us two sorted integer arrays, nums1 and nums2, and asks us to return the k pairs with the smallest sums.

A pair is formed by taking one element from nums1 and one element from nums2. If nums1 = [1,7] and nums2 = [2,4], then the possible pairs are:

(1,2)
(1,4)
(7,2)
(7,4)

Their sums are:

1 + 2 = 3
1 + 4 = 5
7 + 2 = 9
7 + 4 = 11

The goal is to return the k pairs with the smallest sums, ordered from smallest to larger sums according to the extraction process.

The important detail is that both arrays are already sorted in non-decreasing order. This constraint is the key to designing an efficient solution. Since the arrays are sorted, increasing either index can only increase or maintain the pair sum. That monotonic structure allows us to avoid generating every possible pair.

The constraints are large:

Each array can contain up to 10^5 elements
k can be up to 10^4
Total possible pairs can be as large as 10^10

Because the Cartesian product of the arrays can be enormous, any solution that explicitly generates all pairs is impossible within time and memory limits.

Several edge cases are important:

Arrays may contain duplicate values, so duplicate pairs are allowed and expected.
Values may be negative, meaning the smallest sums are not necessarily formed from the smallest positive numbers.
k may be larger than the size of one array.
One array may contain only one element.
The arrays are guaranteed to be non-empty.

The problem guarantees that k <= nums1.length * nums2.length, so there are always enough pairs to return.

Approaches

Brute Force Approach

The most straightforward solution is to generate every possible pair.

For every element in nums1, we pair it with every element in nums2. This creates m * n pairs, where:

m = len(nums1)
n = len(nums2)

For each pair, we compute its sum and store it. After generating all pairs, we sort them by sum and return the first k.

This approach is correct because it explicitly examines every possible pair, so the smallest k pairs are guaranteed to be present.

However, this solution is far too slow and memory intensive for the given constraints. If both arrays contain 10^5 elements, the number of pairs becomes 10^10, which is completely infeasible.

Optimal Heap-Based Approach

The key insight comes from the fact that both arrays are sorted.

Suppose we fix an element nums1[i]. Then the pair sums:

nums1[i] + nums2[0]
nums1[i] + nums2[1]
nums1[i] + nums2[2]
...

form a sorted sequence because nums2 is sorted.

This means each row of pair sums behaves like a sorted list.

Instead of generating all pairs, we can treat the problem similarly to merging multiple sorted lists. We use a min-heap to always retrieve the smallest available pair.

We initially insert the smallest pair from each row:

(nums1[i], nums2[0])

Then, whenever we pop a pair (i, j) from the heap, we push the next pair in that row:

(i, j + 1)

This works because:

The next candidate in the same row is the next larger sum for that fixed nums1[i]
The heap always gives us the globally smallest remaining pair

Since we only ever process up to k pairs, the algorithm becomes efficient enough for the constraints.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(m * n * log(m * n))	O(m * n)	Generates and sorts every pair
Optimal	O(k log min(m, k))	O(min(m, k))	Uses heap to explore only needed pairs

Algorithm Walkthrough

Optimal Heap-Based Algorithm

Initialize a min-heap.

Each heap entry stores:

the pair sum
index i in nums1
index j in nums2

We use indices instead of storing values directly because indices let us efficiently generate neighboring pairs. 2. Insert the first pair from each row.

For every i from 0 to min(len(nums1), k) - 1, push:

(nums1[i] + nums2[0], i, 0)

We only need at most k rows because we will never return more than k pairs. 3. Repeatedly extract the smallest pair from the heap.

The heap always contains the smallest unprocessed candidate from each active row.

Pop the smallest entry:

(sum, i, j)

Add the pair:

[nums1[i], nums2[j]]

to the answer. 4. Push the next pair from the same row.

If j + 1 < len(nums2), then push:

(nums1[i] + nums2[j + 1], i, j + 1)

This advances horizontally in the row while preserving sorted order. 5. Continue until either:

we collect k pairs, or
the heap becomes empty

Why it works

Each row of pairs:

(nums1[i], nums2[0])
(nums1[i], nums2[1])
(nums1[i], nums2[2])
...

is sorted by increasing sum because nums2 is sorted.

The heap always stores the smallest unprocessed element from each row. Therefore, removing the heap minimum gives the globally smallest remaining pair.

When we pop (i, j), the next candidate from that row is (i, j + 1). Since all earlier pairs in that row were already processed, inserting only the next element is sufficient to maintain correctness.

This is essentially the same principle used in merging sorted lists with a priority queue.

Python Solution

from typing import List
import heapq

class Solution:
    def kSmallestPairs(self, nums1: List[int], nums2: List[int], k: int) -> List[List[int]]:
        if not nums1 or not nums2 or k == 0:
            return []

        min_heap = []
        result = []

        # Initialize heap with first element from each row
        for i in range(min(len(nums1), k)):
            heapq.heappush(
                min_heap,
                (nums1[i] + nums2[0], i, 0)
            )

        while min_heap and len(result) < k:
            current_sum, i, j = heapq.heappop(min_heap)

            result.append([nums1[i], nums2[j]])

            # Push next pair in the same row
            if j + 1 < len(nums2):
                heapq.heappush(
                    min_heap,
                    (
                        nums1[i] + nums2[j + 1],
                        i,
                        j + 1
                    )
                )

        return result

The implementation begins by handling trivial edge cases. If either array is empty or k is zero, no pairs can be formed.

The heap stores tuples in the form:

(sum, i, j)

Python's heapq automatically orders tuples lexicographically, so the pair with the smallest sum is always popped first.

The initialization phase inserts the first pair from each row. Since each row is sorted, the first element is the smallest candidate from that row.

The main loop repeatedly extracts the smallest pair and appends it to the result.

After removing (i, j), the algorithm inserts (i, j + 1) if it exists. This advances within the row while preserving sorted exploration.

The algorithm stops once k pairs are collected.

Go Solution

package main

import "container/heap"

type Pair struct {
	sum int
	i   int
	j   int
}

type MinHeap []Pair

func (h MinHeap) Len() int {
	return len(h)
}

func (h MinHeap) Less(i, j int) bool {
	return h[i].sum < h[j].sum
}

func (h MinHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
}

func (h *MinHeap) Push(x interface{}) {
	*h = append(*h, x.(Pair))
}

func (h *MinHeap) Pop() interface{} {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

func kSmallestPairs(nums1 []int, nums2 []int, k int) [][]int {
	if len(nums1) == 0 || len(nums2) == 0 || k == 0 {
		return [][]int{}
	}

	h := &MinHeap{}
	heap.Init(h)

	limit := min(len(nums1), k)

	for i := 0; i < limit; i++ {
		heap.Push(h, Pair{
			sum: nums1[i] + nums2[0],
			i:   i,
			j:   0,
		})
	}

	result := [][]int{}

	for h.Len() > 0 && len(result) < k {
		current := heap.Pop(h).(Pair)

		result = append(result, []int{
			nums1[current.i],
			nums2[current.j],
		})

		if current.j+1 < len(nums2) {
			heap.Push(h, Pair{
				sum: nums1[current.i] + nums2[current.j+1],
				i:   current.i,
				j:   current.j + 1,
			})
		}
	}

	return result
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}

The Go solution follows the same algorithmic structure as the Python version, but Go requires explicit heap implementation using the container/heap package.

A custom Pair struct stores the sum and indices.

Unlike Python, Go does not provide a built-in min-heap for arbitrary structs, so we implement the required heap interface methods:

Len
Less
Swap
Push
Pop

The rest of the logic remains identical.

Since constraints fit within 32-bit integer ranges but sums may exceed them temporarily, Go's int type is sufficient on LeetCode's environment.

Worked Examples

Example 1

nums1 = [1,7,11]
nums2 = [2,4,6]
k = 3

Initial Heap

We insert:

Pair	Sum	Heap Entry
(1,2)	3	(3,0,0)
(7,2)	9	(9,1,0)
(11,2)	13	(13,2,0)

Heap contains:

[(3,0,0), (9,1,0), (13,2,0)]

Iteration 1

Pop:

(3,0,0)

Result becomes:

[[1,2]]

Push next in same row:

(1,4)

Heap:

Heap Contents
(5,0,1)
(13,2,0)
(9,1,0)

Iteration 2

Pop:

(5,0,1)

Result:

[[1,2],[1,4]]

Push:

(1,6)

Heap:

Heap Contents
(7,0,2)
(13,2,0)
(9,1,0)

Iteration 3

Pop:

(7,0,2)

Result:

[[1,2],[1,4],[1,6]]

We now have k = 3 pairs, so we stop.

Example 2

nums1 = [1,1,2]
nums2 = [1,2,3]
k = 2

Initial Heap

Pair	Sum
(1,1) from row 0	2
(1,1) from row 1	2
(2,1)	3

Heap:

[(2,0,0), (2,1,0), (3,2,0)]

Iteration 1

Pop:

(2,0,0)

Result:

[[1,1]]

Push:

(1,2)

Heap:

[(2,1,0), (3,2,0), (3,0,1)]

Iteration 2

Pop:

(2,1,0)

Result:

[[1,1],[1,1]]

Stop because we collected k = 2 pairs.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(k log min(m, k))	Each heap operation costs logarithmic time
Space	O(min(m, k))	Heap stores at most one active entry per row

Let:

m = len(nums1)
n = len(nums2)

The heap initially contains at most min(m, k) entries.

We perform at most k heap removals and at most k insertions. Each heap operation costs:

O(log min(m, k))

Therefore the total runtime becomes:

O(k log min(m, k))

This is dramatically better than generating all m * n pairs.

Test Cases

from typing import List
import heapq

class Solution:
    def kSmallestPairs(self, nums1: List[int], nums2: List[int], k: int) -> List[List[int]]:
        if not nums1 or not nums2 or k == 0:
            return []

        min_heap = []
        result = []

        for i in range(min(len(nums1), k)):
            heapq.heappush(
                min_heap,
                (nums1[i] + nums2[0], i, 0)
            )

        while min_heap and len(result) < k:
            _, i, j = heapq.heappop(min_heap)

            result.append([nums1[i], nums2[j]])

            if j + 1 < len(nums2):
                heapq.heappush(
                    min_heap,
                    (nums1[i] + nums2[j + 1], i, j + 1)
                )

        return result

solution = Solution()

# Provided example 1
assert solution.kSmallestPairs(
    [1,7,11],
    [2,4,6],
    3
) == [[1,2],[1,4],[1,6]]

# Provided example 2 with duplicates
assert solution.kSmallestPairs(
    [1,1,2],
    [1,2,3],
    2
) == [[1,1],[1,1]]

# Single element arrays
assert solution.kSmallestPairs(
    [1],
    [2],
    1
) == [[1,2]]

# Negative numbers
assert solution.kSmallestPairs(
    [-2,-1],
    [-3],
    2
) == [[-2,-3],[-1,-3]]

# k equals total number of pairs
assert solution.kSmallestPairs(
    [1,2],
    [3,4],
    4
) == [[1,3],[1,4],[2,3],[2,4]]

# Large k but small arrays
assert solution.kSmallestPairs(
    [1],
    [1,2,3],
    3
) == [[1,1],[1,2],[1,3]]

# Duplicate-heavy input
assert solution.kSmallestPairs(
    [1,1,1],
    [1,1],
    4
) == [[1,1],[1,1],[1,1],[1,1]]

# Arrays with zeros
assert solution.kSmallestPairs(
    [0,0],
    [0,0],
    3
) == [[0,0],[0,0],[0,0]]

# Mixed positive and negative values
assert solution.kSmallestPairs(
    [-1,0,1],
    [-2,2],
    4
) == [[-1,-2],[0,-2],[1,-2],[-1,2]]

Test Summary

Test	Why
Standard example	Verifies normal heap behavior
Duplicate values	Ensures duplicates are handled correctly
Single-element arrays	Smallest valid input
Negative numbers	Confirms sorting logic works with negatives
k equals total pairs	Ensures full traversal works
One short array	Tests row expansion logic
Many duplicates	Validates duplicate pair generation
Zero values	Confirms handling of equal sums
Mixed signs	Tests ordering across negative and positive sums

Edge Cases

Duplicate Values in Arrays

Arrays may contain repeated numbers, which means identical pairs can appear multiple times. A common mistake is trying to eliminate duplicates with a set, which would produce incorrect results.

For example:

nums1 = [1,1]
nums2 = [1]

The correct answer for k = 2 is:

[[1,1],[1,1]]

Our implementation correctly treats pairs from different indices as distinct heap entries.

Very Large Arrays

The arrays may each contain up to 10^5 elements. Generating all possible pairs would require enormous memory and runtime.

The heap-based approach avoids this by only exploring pairs that could realistically appear among the smallest k. At any moment, the heap stores at most min(len(nums1), k) entries.

One Array Much Smaller Than the Other

Consider:

nums1 = [1]
nums2 = [1,2,3,4,5]

Only one row of pairs exists.

The algorithm still works naturally because the heap starts with a single entry and repeatedly advances along that row.

Negative Numbers

Negative values can produce smaller sums than positive values, so assuming positivity would break correctness.

For example:

nums1 = [-10, 1]
nums2 = [-5, 2]

The smallest pair is:

(-10, -5)

The algorithm remains correct because it relies only on sorted order, not on positivity.