LeetCode 1570 - Dot Product of Two Sparse Vectors
This problem asks us to efficiently compute the dot product between two sparse vectors. A normal vector is simply an array of numbers. The dot product of two vectors is computed by multiplying corresponding elements and summing the results.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, Two Pointers, Design
Solution
Problem Understanding
This problem asks us to efficiently compute the dot product between two sparse vectors.
A normal vector is simply an array of numbers. The dot product of two vectors is computed by multiplying corresponding elements and summing the results. For two vectors a and b of the same length:
$$a \cdot b = \sum_{i=0}^{n-1} a[i] \times b[i]$$
The important detail here is that the vectors are sparse. A sparse vector contains mostly zero values. Since multiplying by zero contributes nothing to the final sum, storing every element explicitly wastes both memory and computation time.
The goal is therefore not only to compute the correct dot product, but also to design the data structure efficiently for sparse data.
The class must support two operations:
- Constructing a sparse vector from an input array.
- Computing the dot product between two sparse vectors.
The constraints tell us that the vector length can be as large as 10^5. A brute-force scan across every index is still technically feasible for a single operation, but the whole point of the problem is to optimize for sparsity. If most entries are zero, we should avoid storing or processing them.
The input guarantees that both vectors have the same length and that all values are non-negative integers.
Several edge cases are important:
- Vectors may contain all zeros.
- Only a few positions may be non-zero.
- The non-zero positions may not overlap at all.
- One vector may be much sparser than the other.
- Large vector sizes require memory-efficient storage.
A naive implementation that stores all elements wastes space and ignores the key property of sparsity.
Approaches
Brute Force Approach
The simplest solution is to store the entire array and compute the dot product by iterating through every index.
For each position i, multiply nums1[i] * nums2[i] and add it to the answer.
This approach is straightforward and always correct because it directly follows the mathematical definition of the dot product.
However, this method does unnecessary work when most entries are zero. Sparse vectors may contain only a handful of non-zero elements, yet the brute-force solution still processes all n indices.
It also wastes memory because zeros are stored explicitly even though they contribute nothing to the computation.
Optimal Approach
The key observation is that only indices with non-zero values matter.
Instead of storing the entire array, we store only:
- the index
- the non-zero value at that index
For example:
[0, 0, 5, 0, 2]
can be stored as:
[(2, 5), (4, 2)]
Now, when computing the dot product, we only need to multiply values whose indices match.
There are two common ways to do this:
- Hash map lookup
- Two pointers over sorted index lists
Since indices naturally appear in sorted order during construction, the two-pointer technique is very efficient.
We store non-zero elements as ordered (index, value) pairs. During the dot product computation:
- If both pointers point to the same index, multiply the values and move both pointers.
- Otherwise, advance the pointer with the smaller index.
This avoids processing zero entries entirely.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(n) | Stores all elements and scans entire vector |
| Optimal | O(k1 + k2) | O(k) | Stores only non-zero values, where k is number of non-zero entries |
Here:
nis the vector lengthk1andk2are the number of non-zero entries in each vector
For sparse data, k1 and k2 are usually much smaller than n.
Algorithm Walkthrough
Optimal Algorithm
- During vector construction, iterate through the input array.
- For every non-zero value, store a pair
(index, value)in a list. - Skip all zero values entirely because they do not affect the dot product.
- To compute the dot product, initialize two pointers:
ifor the first vectorjfor the second vector
- Compare the indices at both pointers.
- If the indices are equal:
- Multiply the corresponding values
- Add the product to the answer
- Move both pointers forward
- If the first index is smaller:
- Move pointer
iforward - This works because indices are sorted, so the smaller index cannot appear later in the second vector
- If the second index is smaller:
- Move pointer
jforward
- Continue until one pointer reaches the end of its list.
- Return the accumulated sum.
Why it works
The algorithm works because the dot product only depends on matching indices.
By storing only non-zero entries, we eliminate unnecessary operations involving zeros. Since the stored indices are sorted, the two-pointer technique guarantees that every matching index is found exactly once.
No valid contribution is skipped, and no incorrect multiplication is performed.
Python Solution
from typing import List
class SparseVector:
def __init__(self, nums: List[int]):
self.values = []
for index, value in enumerate(nums):
if value != 0:
self.values.append((index, value))
# Return the dotProduct of two sparse vectors
def dotProduct(self, vec: 'SparseVector') -> int:
pointer1 = 0
pointer2 = 0
result = 0
while pointer1 < len(self.values) and pointer2 < len(vec.values):
index1, value1 = self.values[pointer1]
index2, value2 = vec.values[pointer2]
if index1 == index2:
result += value1 * value2
pointer1 += 1
pointer2 += 1
elif index1 < index2:
pointer1 += 1
else:
pointer2 += 1
return result
# Your SparseVector object will be instantiated and called as such:
# v1 = SparseVector(nums1)
# v2 = SparseVector(nums2)
# ans = v1.dotProduct(v2)
The constructor scans the input array once and stores only non-zero entries. Each stored element is represented as a tuple containing its index and value.
The dotProduct method uses two pointers because both lists are naturally sorted by index. This allows efficient matching without requiring additional hash maps.
Whenever the indices match, the corresponding values contribute to the final dot product. If the indices differ, the smaller index is advanced because it cannot match any future larger index in the other vector.
This implementation avoids unnecessary processing of zero values and is highly efficient for sparse data.
Go Solution
type Pair struct {
index int
value int
}
type SparseVector struct {
values []Pair
}
func Constructor(nums []int) SparseVector {
vector := SparseVector{
values: []Pair{},
}
for index, value := range nums {
if value != 0 {
vector.values = append(vector.values, Pair{
index: index,
value: value,
})
}
}
return vector
}
// Return the dotProduct of two sparse vectors
func (this *SparseVector) dotProduct(vec SparseVector) int {
pointer1 := 0
pointer2 := 0
result := 0
for pointer1 < len(this.values) && pointer2 < len(vec.values) {
pair1 := this.values[pointer1]
pair2 := vec.values[pointer2]
if pair1.index == pair2.index {
result += pair1.value * pair2.value
pointer1++
pointer2++
} else if pair1.index < pair2.index {
pointer1++
} else {
pointer2++
}
}
return result
}
/**
* Your SparseVector object will be instantiated and called as such:
* v1 := Constructor(nums1)
* v2 := Constructor(nums2)
* ans := v1.dotProduct(v2)
*/
The Go implementation follows the same logic as the Python version. Since Go does not have tuples, a custom Pair struct is used to store index-value pairs.
Slices are used to store the non-zero entries efficiently. The dot product computation again relies on the two-pointer technique.
Integer overflow is not an issue under the given constraints because the maximum possible result fits comfortably inside Go's int type.
Worked Examples
Example 1
nums1 = [1,0,0,2,3]
nums2 = [0,3,0,4,0]
After construction:
v1.values = [(0,1), (3,2), (4,3)]
v2.values = [(1,3), (3,4)]
| Step | pointer1 | pointer2 | Pair1 | Pair2 | Action | Result |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | (0,1) | (1,3) | 0 < 1, move pointer1 | 0 |
| 2 | 1 | 0 | (3,2) | (1,3) | 1 < 3, move pointer2 | 0 |
| 3 | 1 | 1 | (3,2) | (3,4) | indices match, add 2×4 | 8 |
Final answer:
8
Example 2
nums1 = [0,1,0,0,0]
nums2 = [0,0,0,0,2]
Stored representations:
v1.values = [(1,1)]
v2.values = [(4,2)]
| Step | pointer1 | pointer2 | Pair1 | Pair2 | Action | Result |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | (1,1) | (4,2) | 1 < 4, move pointer1 | 0 |
Loop ends because pointer1 reaches the end.
Final answer:
0
Example 3
nums1 = [0,1,0,0,2,0,0]
nums2 = [1,0,0,0,3,0,4]
Stored representations:
v1.values = [(1,1), (4,2)]
v2.values = [(0,1), (4,3), (6,4)]
| Step | pointer1 | pointer2 | Pair1 | Pair2 | Action | Result |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | (1,1) | (0,1) | 0 < 1, move pointer2 | 0 |
| 2 | 0 | 1 | (1,1) | (4,3) | 1 < 4, move pointer1 | 0 |
| 3 | 1 | 1 | (4,2) | (4,3) | indices match, add 2×3 | 6 |
Final answer:
6
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(k1 + k2) | Each non-zero entry is processed at most once |
| Space | O(k) | Only non-zero values are stored |
The constructor runs in O(n) time because it scans the input once. However, the actual sparse representation only stores non-zero values.
The dot product computation is very efficient because each pointer moves forward at most once per stored entry. No unnecessary work is done for zero values.
This makes the solution significantly better than a dense-vector approach when the vectors are sparse.
Test Cases
from typing import List
class SparseVector:
def __init__(self, nums: List[int]):
self.values = []
for index, value in enumerate(nums):
if value != 0:
self.values.append((index, value))
def dotProduct(self, vec: 'SparseVector') -> int:
pointer1 = 0
pointer2 = 0
result = 0
while pointer1 < len(self.values) and pointer2 < len(vec.values):
index1, value1 = self.values[pointer1]
index2, value2 = vec.values[pointer2]
if index1 == index2:
result += value1 * value2
pointer1 += 1
pointer2 += 1
elif index1 < index2:
pointer1 += 1
else:
pointer2 += 1
return result
# Example 1
v1 = SparseVector([1, 0, 0, 2, 3])
v2 = SparseVector([0, 3, 0, 4, 0])
assert v1.dotProduct(v2) == 8 # matching non-zero index at 3
# Example 2
v1 = SparseVector([0, 1, 0, 0, 0])
v2 = SparseVector([0, 0, 0, 0, 2])
assert v1.dotProduct(v2) == 0 # no overlapping non-zero indices
# Example 3
v1 = SparseVector([0, 1, 0, 0, 2, 0, 0])
v2 = SparseVector([1, 0, 0, 0, 3, 0, 4])
assert v1.dotProduct(v2) == 6 # one overlapping non-zero index
# Both vectors all zeros
v1 = SparseVector([0, 0, 0])
v2 = SparseVector([0, 0, 0])
assert v1.dotProduct(v2) == 0 # completely empty sparse representation
# Single element vectors
v1 = SparseVector([5])
v2 = SparseVector([7])
assert v1.dotProduct(v2) == 35 # smallest non-trivial input
# One sparse, one dense
v1 = SparseVector([0, 0, 10, 0, 0])
v2 = SparseVector([1, 2, 3, 4, 5])
assert v1.dotProduct(v2) == 30 # only one matching non-zero index
# Multiple matching indices
v1 = SparseVector([1, 2, 0, 3])
v2 = SparseVector([4, 5, 0, 6])
assert v1.dotProduct(v2) == 32 # 1*4 + 2*5 + 3*6
# Large gaps between non-zero values
v1 = SparseVector([1, 0, 0, 0, 0, 2])
v2 = SparseVector([0, 0, 0, 0, 0, 3])
assert v1.dotProduct(v2) == 6 # matching only at final index
| Test | Why |
|---|---|
| Example 1 | Validates normal sparse overlap |
| Example 2 | Validates no overlap case |
| Example 3 | Validates sparse traversal logic |
| Both vectors all zeros | Ensures empty sparse storage works |
| Single element vectors | Tests smallest meaningful input |
| One sparse, one dense | Tests follow-up scenario |
| Multiple matching indices | Ensures repeated matches accumulate correctly |
| Large gaps between non-zero values | Verifies pointer movement across distant indices |
Edge Cases
Both vectors contain only zeros
A common source of bugs is handling vectors where no non-zero elements exist. In this situation, the sparse representation becomes an empty list.
If the implementation assumes at least one stored entry exists, pointer access errors can occur. This solution handles the case naturally because the while loop immediately terminates when both lists are empty.
No overlapping non-zero indices
Two vectors may each contain non-zero values, but none of the indices match.
For example:
[1,0,0]
[0,2,0]
A buggy implementation might accidentally multiply unrelated values. The two-pointer technique avoids this by only multiplying values when the indices are identical.
One vector is much denser than the other
The follow-up question specifically asks about the scenario where only one vector is sparse.
For example:
[0,0,0,5,0]
[1,2,3,4,5]
The current implementation still performs efficiently because it only iterates through the stored non-zero entries of the sparse vector. Even if the other vector contains many non-zero values, the algorithm remains efficient compared to scanning every index blindly.