LeetCode 1570 - Dot Product of Two Sparse Vectors

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Two Pointers, Design

Solution

Problem Understanding

This problem asks us to efficiently compute the dot product between two sparse vectors.

A normal vector is simply an array of numbers. The dot product of two vectors is computed by multiplying corresponding elements and summing the results. For two vectors a and b of the same length:

$$a \cdot b = \sum_{i=0}^{n-1} a[i] \times b[i]$$

The important detail here is that the vectors are sparse. A sparse vector contains mostly zero values. Since multiplying by zero contributes nothing to the final sum, storing every element explicitly wastes both memory and computation time.

The goal is therefore not only to compute the correct dot product, but also to design the data structure efficiently for sparse data.

The class must support two operations:

Constructing a sparse vector from an input array.
Computing the dot product between two sparse vectors.

The constraints tell us that the vector length can be as large as 10^5. A brute-force scan across every index is still technically feasible for a single operation, but the whole point of the problem is to optimize for sparsity. If most entries are zero, we should avoid storing or processing them.

The input guarantees that both vectors have the same length and that all values are non-negative integers.

Several edge cases are important:

Vectors may contain all zeros.
Only a few positions may be non-zero.
The non-zero positions may not overlap at all.
One vector may be much sparser than the other.
Large vector sizes require memory-efficient storage.

A naive implementation that stores all elements wastes space and ignores the key property of sparsity.

Approaches

Brute Force Approach

The simplest solution is to store the entire array and compute the dot product by iterating through every index.

For each position i, multiply nums1[i] * nums2[i] and add it to the answer.

This approach is straightforward and always correct because it directly follows the mathematical definition of the dot product.

However, this method does unnecessary work when most entries are zero. Sparse vectors may contain only a handful of non-zero elements, yet the brute-force solution still processes all n indices.

It also wastes memory because zeros are stored explicitly even though they contribute nothing to the computation.

Optimal Approach

The key observation is that only indices with non-zero values matter.

Instead of storing the entire array, we store only:

the index
the non-zero value at that index

For example:

[0, 0, 5, 0, 2]

can be stored as:

[(2, 5), (4, 2)]

Now, when computing the dot product, we only need to multiply values whose indices match.

There are two common ways to do this:

Hash map lookup
Two pointers over sorted index lists

Since indices naturally appear in sorted order during construction, the two-pointer technique is very efficient.

We store non-zero elements as ordered (index, value) pairs. During the dot product computation:

If both pointers point to the same index, multiply the values and move both pointers.
Otherwise, advance the pointer with the smaller index.

This avoids processing zero entries entirely.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n)	O(n)	Stores all elements and scans entire vector
Optimal	O(k1 + k2)	O(k)	Stores only non-zero values, where `k` is number of non-zero entries

Here:

n is the vector length
k1 and k2 are the number of non-zero entries in each vector

For sparse data, k1 and k2 are usually much smaller than n.

Algorithm Walkthrough

Optimal Algorithm

During vector construction, iterate through the input array.
For every non-zero value, store a pair (index, value) in a list.
Skip all zero values entirely because they do not affect the dot product.
To compute the dot product, initialize two pointers:

i for the first vector
j for the second vector

Compare the indices at both pointers.
If the indices are equal:

Multiply the corresponding values
Add the product to the answer
Move both pointers forward

If the first index is smaller:

Move pointer i forward
This works because indices are sorted, so the smaller index cannot appear later in the second vector

If the second index is smaller:

Move pointer j forward

Continue until one pointer reaches the end of its list.
Return the accumulated sum.

Why it works

The algorithm works because the dot product only depends on matching indices.

By storing only non-zero entries, we eliminate unnecessary operations involving zeros. Since the stored indices are sorted, the two-pointer technique guarantees that every matching index is found exactly once.

No valid contribution is skipped, and no incorrect multiplication is performed.

Python Solution

from typing import List

class SparseVector:
    def __init__(self, nums: List[int]):
        self.values = []

        for index, value in enumerate(nums):
            if value != 0:
                self.values.append((index, value))

    # Return the dotProduct of two sparse vectors
    def dotProduct(self, vec: 'SparseVector') -> int:
        pointer1 = 0
        pointer2 = 0
        result = 0

        while pointer1 < len(self.values) and pointer2 < len(vec.values):
            index1, value1 = self.values[pointer1]
            index2, value2 = vec.values[pointer2]

            if index1 == index2:
                result += value1 * value2
                pointer1 += 1
                pointer2 += 1
            elif index1 < index2:
                pointer1 += 1
            else:
                pointer2 += 1

        return result

# Your SparseVector object will be instantiated and called as such:
# v1 = SparseVector(nums1)
# v2 = SparseVector(nums2)
# ans = v1.dotProduct(v2)

The constructor scans the input array once and stores only non-zero entries. Each stored element is represented as a tuple containing its index and value.

The dotProduct method uses two pointers because both lists are naturally sorted by index. This allows efficient matching without requiring additional hash maps.

Whenever the indices match, the corresponding values contribute to the final dot product. If the indices differ, the smaller index is advanced because it cannot match any future larger index in the other vector.

This implementation avoids unnecessary processing of zero values and is highly efficient for sparse data.

Go Solution

type Pair struct {
	index int
	value int
}

type SparseVector struct {
	values []Pair
}

func Constructor(nums []int) SparseVector {
	vector := SparseVector{
		values: []Pair{},
	}

	for index, value := range nums {
		if value != 0 {
			vector.values = append(vector.values, Pair{
				index: index,
				value: value,
			})
		}
	}

	return vector
}

// Return the dotProduct of two sparse vectors
func (this *SparseVector) dotProduct(vec SparseVector) int {
	pointer1 := 0
	pointer2 := 0
	result := 0

	for pointer1 < len(this.values) && pointer2 < len(vec.values) {
		pair1 := this.values[pointer1]
		pair2 := vec.values[pointer2]

		if pair1.index == pair2.index {
			result += pair1.value * pair2.value
			pointer1++
			pointer2++
		} else if pair1.index < pair2.index {
			pointer1++
		} else {
			pointer2++
		}
	}

	return result
}

/**
 * Your SparseVector object will be instantiated and called as such:
 * v1 := Constructor(nums1)
 * v2 := Constructor(nums2)
 * ans := v1.dotProduct(v2)
 */

The Go implementation follows the same logic as the Python version. Since Go does not have tuples, a custom Pair struct is used to store index-value pairs.

Slices are used to store the non-zero entries efficiently. The dot product computation again relies on the two-pointer technique.

Integer overflow is not an issue under the given constraints because the maximum possible result fits comfortably inside Go's int type.

Worked Examples

Example 1

nums1 = [1,0,0,2,3]
nums2 = [0,3,0,4,0]

After construction:

v1.values = [(0,1), (3,2), (4,3)]
v2.values = [(1,3), (3,4)]

Step	pointer1	pointer2	Pair1	Pair2	Action	Result
1	0	0	(0,1)	(1,3)	0 < 1, move pointer1	0
2	1	0	(3,2)	(1,3)	1 < 3, move pointer2	0
3	1	1	(3,2)	(3,4)	indices match, add 2×4	8

Final answer:

Example 2

nums1 = [0,1,0,0,0]
nums2 = [0,0,0,0,2]

Stored representations:

v1.values = [(1,1)]
v2.values = [(4,2)]

Step	pointer1	pointer2	Pair1	Pair2	Action	Result
1	0	0	(1,1)	(4,2)	1 < 4, move pointer1	0

Loop ends because pointer1 reaches the end.

Final answer:

Example 3

nums1 = [0,1,0,0,2,0,0]
nums2 = [1,0,0,0,3,0,4]

Stored representations:

v1.values = [(1,1), (4,2)]
v2.values = [(0,1), (4,3), (6,4)]

Step	pointer1	pointer2	Pair1	Pair2	Action	Result
1	0	0	(1,1)	(0,1)	0 < 1, move pointer2	0
2	0	1	(1,1)	(4,3)	1 < 4, move pointer1	0
3	1	1	(4,2)	(4,3)	indices match, add 2×3	6

Final answer:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(k1 + k2)	Each non-zero entry is processed at most once
Space	O(k)	Only non-zero values are stored

The constructor runs in O(n) time because it scans the input once. However, the actual sparse representation only stores non-zero values.

The dot product computation is very efficient because each pointer moves forward at most once per stored entry. No unnecessary work is done for zero values.

This makes the solution significantly better than a dense-vector approach when the vectors are sparse.

Test Cases

from typing import List

class SparseVector:
    def __init__(self, nums: List[int]):
        self.values = []

        for index, value in enumerate(nums):
            if value != 0:
                self.values.append((index, value))

    def dotProduct(self, vec: 'SparseVector') -> int:
        pointer1 = 0
        pointer2 = 0
        result = 0

        while pointer1 < len(self.values) and pointer2 < len(vec.values):
            index1, value1 = self.values[pointer1]
            index2, value2 = vec.values[pointer2]

            if index1 == index2:
                result += value1 * value2
                pointer1 += 1
                pointer2 += 1
            elif index1 < index2:
                pointer1 += 1
            else:
                pointer2 += 1

        return result

# Example 1
v1 = SparseVector([1, 0, 0, 2, 3])
v2 = SparseVector([0, 3, 0, 4, 0])
assert v1.dotProduct(v2) == 8  # matching non-zero index at 3

# Example 2
v1 = SparseVector([0, 1, 0, 0, 0])
v2 = SparseVector([0, 0, 0, 0, 2])
assert v1.dotProduct(v2) == 0  # no overlapping non-zero indices

# Example 3
v1 = SparseVector([0, 1, 0, 0, 2, 0, 0])
v2 = SparseVector([1, 0, 0, 0, 3, 0, 4])
assert v1.dotProduct(v2) == 6  # one overlapping non-zero index

# Both vectors all zeros
v1 = SparseVector([0, 0, 0])
v2 = SparseVector([0, 0, 0])
assert v1.dotProduct(v2) == 0  # completely empty sparse representation

# Single element vectors
v1 = SparseVector([5])
v2 = SparseVector([7])
assert v1.dotProduct(v2) == 35  # smallest non-trivial input

# One sparse, one dense
v1 = SparseVector([0, 0, 10, 0, 0])
v2 = SparseVector([1, 2, 3, 4, 5])
assert v1.dotProduct(v2) == 30  # only one matching non-zero index

# Multiple matching indices
v1 = SparseVector([1, 2, 0, 3])
v2 = SparseVector([4, 5, 0, 6])
assert v1.dotProduct(v2) == 32  # 1*4 + 2*5 + 3*6

# Large gaps between non-zero values
v1 = SparseVector([1, 0, 0, 0, 0, 2])
v2 = SparseVector([0, 0, 0, 0, 0, 3])
assert v1.dotProduct(v2) == 6  # matching only at final index

Test	Why
Example 1	Validates normal sparse overlap
Example 2	Validates no overlap case
Example 3	Validates sparse traversal logic
Both vectors all zeros	Ensures empty sparse storage works
Single element vectors	Tests smallest meaningful input
One sparse, one dense	Tests follow-up scenario
Multiple matching indices	Ensures repeated matches accumulate correctly
Large gaps between non-zero values	Verifies pointer movement across distant indices

Edge Cases

Both vectors contain only zeros

A common source of bugs is handling vectors where no non-zero elements exist. In this situation, the sparse representation becomes an empty list.

If the implementation assumes at least one stored entry exists, pointer access errors can occur. This solution handles the case naturally because the while loop immediately terminates when both lists are empty.

No overlapping non-zero indices

Two vectors may each contain non-zero values, but none of the indices match.

For example:

[1,0,0]
[0,2,0]

A buggy implementation might accidentally multiply unrelated values. The two-pointer technique avoids this by only multiplying values when the indices are identical.

One vector is much denser than the other

The follow-up question specifically asks about the scenario where only one vector is sparse.

For example:

[0,0,0,5,0]
[1,2,3,4,5]

The current implementation still performs efficiently because it only iterates through the stored non-zero entries of the sparse vector. Even if the other vector contains many non-zero values, the algorithm remains efficient compared to scanning every index blindly.