LeetCode 477 - Total Hamming Distance
The problem asks us to compute the total Hamming distance between every possible pair of integers in the array. The Hamming distance between two integers is defined as the number of bit positions where the two numbers differ.
Difficulty: 🟡 Medium
Topics: Array, Math, Bit Manipulation
Solution
Problem Understanding
The problem asks us to compute the total Hamming distance between every possible pair of integers in the array.
The Hamming distance between two integers is defined as the number of bit positions where the two numbers differ. For example:
4 = 010014 = 1110
Comparing the bits position by position:
| Bit Position | 4 | 14 | Different? |
|---|---|---|---|
| 8s place | 0 | 1 | Yes |
| 4s place | 1 | 1 | No |
| 2s place | 0 | 1 | Yes |
| 1s place | 0 | 0 | No |
The Hamming distance is 2.
The input is an array nums containing non-negative integers. We must consider every unique pair (i, j) where i < j, compute the Hamming distance between nums[i] and nums[j], and return the sum of all those distances.
The constraints are important:
nums.length <= 10^4nums[i] <= 10^9
An array of size 10^4 contains roughly 50 million pairs in the worst case:
$$\frac{10000 \times 9999}{2}$$
This immediately tells us that a naive pairwise comparison solution may become too slow.
Another important observation is that 10^9 fits within 30 bits, so we only need to examine about 30 to 32 bit positions for each number.
Important edge cases include:
- An array with only one element, the answer must be
0because there are no pairs. - Arrays where all numbers are identical, every Hamming distance is
0. - Arrays where numbers differ in many bit positions, which maximizes the answer.
- Arrays containing
0, since leading zero bits still participate in comparisons.
Approaches
Brute Force Approach
The most straightforward solution is to examine every pair of numbers.
For each pair:
- Compute
a XOR b - Count how many bits are set in the XOR result
- Add that count to the total
This works because XOR produces 1 exactly at positions where the bits differ.
For example:
4 = 010014 = 1110
$$0100 \oplus 1110 = 1010$$
The XOR result has two set bits, so the Hamming distance is 2.
This approach is correct because it directly follows the definition of Hamming distance. However, it is too slow for large inputs because there are O(n^2) pairs.
With 10^4 numbers, this becomes roughly 50 million comparisons, which is inefficient.
Optimal Approach
The key observation is that Hamming distance contributions are independent for each bit position.
Instead of comparing pairs directly, we can process the array one bit at a time.
For a particular bit position:
- Count how many numbers have a
1 - Count how many numbers have a
0
Every combination of one 1 and one 0 contributes exactly 1 to the total Hamming distance.
So the contribution of a bit position is:
$$(\text{count of 1s}) \times (\text{count of 0s})$$
We sum this contribution across all bit positions.
This transforms the problem from pairwise comparison into bit frequency counting.
Since there are at most 32 bits and n numbers, the complexity becomes O(32 * n), which is effectively O(n).
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n² × 32) | O(1) | Compare every pair and count differing bits |
| Optimal | O(32 × n) | O(1) | Count 0s and 1s for each bit position |
Algorithm Walkthrough
Optimal Bit Counting Algorithm
- Initialize a variable
total_distanceto0.
This variable stores the accumulated Hamming distance across all pairs.
2. Iterate through each bit position from 0 to 31.
Since the numbers are at most 10^9, 32 bits are sufficient to represent every value.
3. For the current bit position, count how many numbers contain a 1 at that bit.
We can check this using:
(num >> bit) & 1
This shifts the target bit to the least significant position and extracts it.
4. Compute how many numbers contain a 0 at that bit.
If ones is the count of numbers with a 1, then:
zeros = len(nums) - ones
- Compute the contribution of this bit position.
Every number with a 1 forms a differing pair with every number containing a 0.
Therefore:
$$contribution = ones \times zeros$$
6. Add the contribution to total_distance.
7. After processing all 32 bit positions, return total_distance.
Why it works
For any bit position, two numbers contribute 1 to the Hamming distance if and only if their bits differ at that position.
If there are ones numbers containing 1 and zeros numbers containing 0, then every (1,0) combination contributes exactly once. The number of such combinations is:
$$ones \times zeros$$
Summing this independently for every bit position gives the total Hamming distance across all pairs.
Python Solution
from typing import List
class Solution:
def totalHammingDistance(self, nums: List[int]) -> int:
total_distance = 0
n = len(nums)
for bit in range(32):
ones = 0
for num in nums:
ones += (num >> bit) & 1
zeros = n - ones
total_distance += ones * zeros
return total_distance
The implementation follows the bit-counting strategy directly.
The variable total_distance stores the final answer. We loop through all 32 possible bit positions because integers up to 10^9 fit within that range.
For each bit:
(num >> bit) & 1
extracts the value of the current bit. If the result is 1, we increment ones.
Once all numbers are checked for the current bit, we compute:
zeros = n - ones
The number of differing pairs for this bit is:
ones * zeros
because every 1 pairs with every 0.
Finally, we accumulate the contribution into total_distance and return it after all bits are processed.
Go Solution
func totalHammingDistance(nums []int) int {
totalDistance := 0
n := len(nums)
for bit := 0; bit < 32; bit++ {
ones := 0
for _, num := range nums {
ones += (num >> bit) & 1
}
zeros := n - ones
totalDistance += ones * zeros
}
return totalDistance
}
The Go implementation is almost identical to the Python version.
A few Go-specific details are worth noting:
- Go integers are fixed-width machine integers, but 32 iterations are sufficient because the constraints guarantee values up to
10^9. - The range loop:
for _, num := range nums
iterates through the slice efficiently.
- No special handling for empty or nil slices is required because the constraints guarantee at least one element.
- The result safely fits within a 32-bit integer according to the problem statement.
Worked Examples
Example 1
Input:
nums = [4, 14, 2]
Binary representations:
| Number | Binary |
|---|---|
| 4 | 0100 |
| 14 | 1110 |
| 2 | 0010 |
We process each bit independently.
Bit Position 0
| Number | Bit 0 |
|---|---|
| 4 | 0 |
| 14 | 0 |
| 2 | 0 |
| Ones | Zeros | Contribution |
|---|---|---|
| 0 | 3 | 0 |
Running total: 0
Bit Position 1
| Number | Bit 1 |
|---|---|
| 4 | 0 |
| 14 | 1 |
| 2 | 1 |
| Ones | Zeros | Contribution |
|---|---|---|
| 2 | 1 | 2 |
Running total: 2
Bit Position 2
| Number | Bit 2 |
|---|---|
| 4 | 1 |
| 14 | 1 |
| 2 | 0 |
| Ones | Zeros | Contribution |
|---|---|---|
| 2 | 1 | 2 |
Running total: 4
Bit Position 3
| Number | Bit 3 |
|---|---|
| 4 | 0 |
| 14 | 1 |
| 2 | 0 |
| Ones | Zeros | Contribution |
|---|---|---|
| 1 | 2 | 2 |
Running total: 6
All higher bits contribute 0.
Final answer:
6
Example 2
Input:
nums = [4, 14, 4]
Binary representations:
| Number | Binary |
|---|---|
| 4 | 0100 |
| 14 | 1110 |
| 4 | 0100 |
Bit Position Contributions
| Bit | Ones | Zeros | Contribution |
|---|---|---|---|
| 0 | 0 | 3 | 0 |
| 1 | 1 | 2 | 2 |
| 2 | 3 | 0 | 0 |
| 3 | 1 | 2 | 2 |
Total:
0 + 2 + 0 + 2 = 4
Final answer:
4
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(32 × n) | We examine 32 bits for every number |
| Space | O(1) | Only a few counters are used |
The algorithm processes every number once for each of the 32 bit positions. Since 32 is constant, the complexity simplifies to O(n).
The space complexity is constant because no auxiliary data structures proportional to input size are allocated.
Test Cases
from typing import List
class Solution:
def totalHammingDistance(self, nums: List[int]) -> int:
total_distance = 0
n = len(nums)
for bit in range(32):
ones = 0
for num in nums:
ones += (num >> bit) & 1
zeros = n - ones
total_distance += ones * zeros
return total_distance
solution = Solution()
assert solution.totalHammingDistance([4, 14, 2]) == 6
# Example 1 from the problem statement
assert solution.totalHammingDistance([4, 14, 4]) == 4
# Example 2 from the problem statement
assert solution.totalHammingDistance([0]) == 0
# Single element, no pairs exist
assert solution.totalHammingDistance([0, 0, 0]) == 0
# All numbers identical
assert solution.totalHammingDistance([1, 2]) == 2
# 01 vs 10 differ in both bits
assert solution.totalHammingDistance([1, 1, 2, 2]) == 8
# Multiple duplicate values
assert solution.totalHammingDistance([0, 1]) == 1
# Simplest differing pair
assert solution.totalHammingDistance([7, 7, 7, 7]) == 0
# All bits identical across all numbers
assert solution.totalHammingDistance([8, 4, 2, 1]) == 12
# Every number differs significantly
assert solution.totalHammingDistance([0, 1073741823]) == 30
# Maximum 30-bit difference
Test Case Summary
| Test | Why |
|---|---|
[4,14,2] |
Validates the primary example |
[4,14,4] |
Verifies duplicate handling |
[0] |
Tests single-element edge case |
[0,0,0] |
Ensures identical numbers produce zero |
[1,2] |
Simple binary difference case |
[1,1,2,2] |
Tests repeated values |
[0,1] |
Smallest non-zero distance |
[7,7,7,7] |
Confirms no false contributions |
[8,4,2,1] |
Checks multiple differing bit positions |
[0,1073741823] |
Stress test with many differing bits |
Edge Cases
Single Element Array
If the array contains only one number, there are no pairs to compare. A naive implementation might accidentally attempt pair generation or produce incorrect counts.
This implementation handles the case naturally because every bit position produces either:
ones = 0
zeros = 1
or:
ones = 1
zeros = 0
Both produce a contribution of 0.
All Numbers Identical
When every number is the same, the Hamming distance between every pair must be 0.
This can expose bugs in implementations that incorrectly count repeated pairs or mishandle bit extraction.
In the bit-counting solution, every bit position has either:
- all
1s - all
0s`
Therefore:
$$ones \times zeros = 0$$
for every bit position.
Large Bit Differences
Arrays containing numbers with many differing bits can produce large totals.
For example:
[0, 1073741823]
Here:
1073741823 = 111111111111111111111111111111
The Hamming distance is 30.
This verifies that the implementation correctly processes higher bit positions and does not stop prematurely.
Arrays With Duplicates
Duplicate values are important because they contribute zero distance with each other but still contribute normally with other numbers.
For example:
[4, 14, 4]
The algorithm correctly handles duplicates because each bit contribution depends only on counts of zeros and ones, not on uniqueness of values.