LeetCode 2615 - Sum of Distances
The problem asks us to compute, for every index i in the array nums, the total distance between i and every other index j where nums[j] == nums[i]. More formally, for each position i, we need to calculate: for all indices j such that: - nums[j] == nums[i] - j !
Difficulty: 🟡 Medium
Topics: Array, Hash Table, Prefix Sum
Solution
Problem Understanding
The problem asks us to compute, for every index i in the array nums, the total distance between i and every other index j where nums[j] == nums[i].
More formally, for each position i, we need to calculate:
$$\sum |i - j|$$
for all indices j such that:
nums[j] == nums[i]j != i
If there are no other matching values, then the answer for that index is simply 0.
In simpler terms, imagine grouping together all indices that contain the same value. For each index in a group, we want to know how far away all the other matching indices are, and sum those distances.
For example, consider:
nums = [1,3,1,1,2]
The value 1 appears at indices [0,2,3].
For index 0, the distance sum is:
|0 - 2| + |0 - 3| = 5
For index 2:
|2 - 0| + |2 - 3| = 3
For index 3:
|3 - 0| + |3 - 2| = 4
Values 3 and 2 appear only once, so their distances are 0.
The constraints are important:
1 <= nums.length <= 10^50 <= nums[i] <= 10^9
The array can contain up to one hundred thousand elements, which immediately rules out expensive quadratic solutions. Since values can be as large as 10^9, we cannot rely on a frequency array indexed by value. Instead, we need a hash map to group identical numbers efficiently.
The main challenge is avoiding repeated distance calculations. A naive implementation would compare every index with every other index, which becomes too slow for large inputs.
Several edge cases are worth identifying early:
- Arrays where every value is unique, every result should be
0. - Arrays where all values are identical, every index must compute distances to all other positions.
- Large repeated groups, where naive repeated summation becomes prohibitively expensive.
- Single-element arrays, where the answer must always be
[0].
Approaches
Brute Force Approach
The most direct way to solve this problem is to iterate through every index i and compare it with every other index j.
For each pair:
- If
nums[i] == nums[j] - And
i != j
we add |i - j| to arr[i].
This approach is correct because it explicitly checks every valid pair and accumulates the required distances.
However, it is extremely inefficient. Since we compare every index with every other index, the time complexity becomes O(n²).
With n = 10^5, this could require up to:
10^10 operations
which is far too slow.
Optimal Approach, Grouping + Prefix Sum Insight
The key observation is that only indices with the same value matter.
Instead of repeatedly scanning the whole array, we first group indices by value.
For example:
nums = [1,3,1,1,2]
groups:
1 -> [0,2,3]
3 -> [1]
2 -> [4]
Now consider a group of indices:
[0,2,3]
For index 2, we want:
(2 - 0) + (3 - 2)
Instead of recomputing every distance separately, we can use prefix sums.
Suppose:
positions = [0,2,3]
total_sum = 5
For position pos at index i:
Left contribution:
i * pos - left_sum
Right contribution:
right_sum - (count_right * pos)
This works because:
- All left positions are smaller
- All right positions are larger
Since positions are naturally sorted by index, absolute values disappear into simple subtraction.
This lets us compute each group's answers in linear time.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Compare every pair of indices |
| Optimal | O(n) | O(n) | Group indices and use prefix sums |
Algorithm Walkthrough
- Create a hash map where each number maps to a list of its indices.
For example:
nums = [1,3,1,1,2]
map:
1 -> [0,2,3]
3 -> [1]
2 -> [4]
This allows us to process only matching elements together. 2. Initialize an answer array filled with zeros.
Since indices that appear only once automatically have distance 0, this gives us the correct default value.
3. Process each group of indices independently.
Suppose we have:
positions = [0,2,3]
- Compute the total sum of positions.
For example:
total_sum = 0 + 2 + 3 = 5
We will use this to efficiently compute right-side contributions.
5. Maintain a running prefix sum called left_sum.
Initially:
left_sum = 0
- Iterate through each position in the group.
For each position pos at index i in the group:
Compute the left-side contribution:
left_distance = i * pos - left_sum
This gives the total distance to all earlier positions. 7. Compute the right-side contribution.
First calculate:
right_sum = total_sum - left_sum - pos
Then:
right_distance =
right_sum - (remaining_count * pos)
- Add both contributions.
answer[pos] =
left_distance + right_distance
- Update the prefix sum.
left_sum += pos
- Repeat for every group.
Why it works
For any group of matching indices, positions are sorted naturally because we encounter them in increasing order during the array scan.
For a given position pos:
- Every index on the left contributes
pos - left_index - Every index on the right contributes
right_index - pos
Using prefix sums lets us aggregate these distances mathematically instead of summing individually.
Because each index is processed exactly once inside its group, and every distance contribution is accounted for exactly once, the algorithm always computes the correct total distance.
Python Solution
from collections import defaultdict
from typing import List
class Solution:
def distance(self, nums: List[int]) -> List[int]:
positions_map = defaultdict(list)
# Group indices by value
for index, value in enumerate(nums):
positions_map[value].append(index)
result = [0] * len(nums)
# Process each group
for positions in positions_map.values():
total_sum = sum(positions)
left_sum = 0
group_size = len(positions)
for i, position in enumerate(positions):
left_distance = i * position - left_sum
right_sum = total_sum - left_sum - position
right_count = group_size - i - 1
right_distance = right_sum - (
right_count * position
)
result[position] = (
left_distance + right_distance
)
left_sum += position
return result
The implementation starts by building a hash map that stores all indices for each value. Since only identical values matter, this grouping step lets us avoid unnecessary comparisons.
We then create the result array filled with zeros. This naturally handles values that appear only once.
For every group of matching positions, we compute the total sum of indices. During iteration, left_sum keeps track of all previously seen positions.
At each position:
left_distancecomputes contribution from earlier indices.right_distancecomputes contribution from later indices.
By combining both, we get the total distance for that index in constant time.
Finally, we update left_sum so future calculations include the current position.
Since every index is processed exactly once after grouping, the solution runs efficiently in linear time.
Go Solution
func distance(nums []int) []int64 {
n := len(nums)
positionsMap := make(map[int][]int)
// Group indices by value
for i, value := range nums {
positionsMap[value] = append(
positionsMap[value],
i,
)
}
result := make([]int64, n)
// Process each group
for _, positions := range positionsMap {
var totalSum int64
for _, pos := range positions {
totalSum += int64(pos)
}
var leftSum int64
groupSize := len(positions)
for i, pos := range positions {
position := int64(pos)
leftDistance :=
int64(i)*position - leftSum
rightSum :=
totalSum - leftSum - position
rightCount :=
int64(groupSize - i - 1)
rightDistance :=
rightSum -
(rightCount * position)
result[pos] =
leftDistance + rightDistance
leftSum += position
}
}
return result
}
The Go implementation follows the exact same algorithm but includes some Go-specific considerations.
The return type is []int64, which is required by the problem signature. Since distance sums can become large, using int64 avoids overflow risks.
Unlike Python integers, Go integers have fixed sizes. Therefore, all arithmetic involving sums and multiplication is explicitly converted to int64.
Slices are used for grouped positions because they dynamically grow and efficiently store indices.
Worked Examples
Example 1
nums = [1,3,1,1,2]
Step 1: Build groups
| Value | Indices |
|---|---|
| 1 | [0,2,3] |
| 3 | [1] |
| 2 | [4] |
Process value 1
positions = [0,2,3]
total_sum = 5
left_sum = 0
| i | position | left_distance | right_sum | right_distance | result[position] |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 5 | 5 | 5 |
| 1 | 2 | 2 | 3 | 1 | 3 |
| 2 | 3 | 4 | 0 | 0 | 4 |
Current result:
[5,0,3,4,0]
Process value 3
positions = [1]
No matching index exists.
Result remains:
[5,0,3,4,0]
Process value 2
positions = [4]
No matching index exists.
Final answer:
[5,0,3,4,0]
Example 2
nums = [0,5,3]
Step 1: Build groups
| Value | Indices |
|---|---|
| 0 | [0] |
| 5 | [1] |
| 3 | [2] |
Each value appears only once.
Therefore:
result = [0,0,0]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each index is added once to a group and processed once |
| Space | O(n) | Hash map stores all indices |
The grouping phase takes O(n) time because every element is inserted once.
The processing phase is also O(n) because every index across all groups is visited exactly once.
Even though there is a nested loop structure, the total work across all groups still sums to n, making the overall complexity linear.
The auxiliary space is O(n) because the hash map stores every index.
Test Cases
solution = Solution()
# Provided examples
assert solution.distance([1, 3, 1, 1, 2]) == [5, 0, 3, 4, 0] # repeated values
assert solution.distance([0, 5, 3]) == [0, 0, 0] # all unique
# Single element
assert solution.distance([7]) == [0] # smallest input
# All identical values
assert solution.distance([2, 2, 2]) == [3, 2, 3] # repeated everywhere
# Two identical elements
assert solution.distance([4, 4]) == [1, 1] # simple pair
# Multiple repeating groups
assert solution.distance([1, 2, 1, 2, 1]) == [6, 2, 4, 2, 6] # interleaved groups
# Large gaps
assert solution.distance([1, 0, 0, 0, 1]) == [4, 2, 2, 2, 4] # distant repeats
# All unique large input
assert solution.distance([1, 2, 3, 4, 5]) == [0, 0, 0, 0, 0] # no duplicates
# Mixed frequencies
assert solution.distance([8, 8, 3, 8, 3]) == [4, 3, 2, 5, 2] # varying counts
| Test | Why |
|---|---|
[1,3,1,1,2] |
Validates the primary example |
[0,5,3] |
Confirms all unique values |
[7] |
Smallest possible input |
[2,2,2] |
Every element repeats |
[4,4] |
Minimal duplicate case |
[1,2,1,2,1] |
Interleaved repeated groups |
[1,0,0,0,1] |
Large spacing between duplicates |
[1,2,3,4,5] |
No repeated values |
[8,8,3,8,3] |
Mixed frequency groups |
Edge Cases
Single Element Array
A single-element array is the smallest valid input:
nums = [7]
There are no other matching indices, so the result must be:
[0]
A buggy implementation might accidentally assume every group contains multiple elements. Our implementation handles this naturally because single-element groups produce zero left and right contributions.
All Elements Are Unique
Consider:
nums = [1,2,3,4]
Every value appears exactly once.
A naive grouping algorithm might still attempt unnecessary computations or incorrectly calculate distances. Since our answer array is initialized with zeros and each single-element group contributes nothing, the correct output is returned automatically.
All Elements Are Identical
Consider:
nums = [5,5,5,5]
Every index must calculate distances to every other index.
This case can easily expose performance problems in quadratic solutions because every element interacts with every other element.
Our prefix sum approach handles this efficiently in linear time, computing all distance sums without repeated pairwise comparisons.
Large Sparse Distances
Consider:
nums = [1,0,0,0,1]
Matching values may be very far apart.
A careless implementation can make mistakes when computing absolute differences across large index gaps. Our mathematical formulation using left and right contributions handles both nearby and distant positions uniformly and correctly.