LeetCode 477 - Total Hamming Distance

Difficulty: 🟡 Medium
Topics: Array, Math, Bit Manipulation

Solution

Problem Understanding

The problem asks us to compute the total Hamming distance between every possible pair of integers in the array.

The Hamming distance between two integers is defined as the number of bit positions where the two numbers differ. For example:

4 = 0100
14 = 1110

Comparing the bits position by position:

Bit Position	4	14	Different?
8s place	0	1	Yes
4s place	1	1	No
2s place	0	1	Yes
1s place	0	0	No

The Hamming distance is 2.

The input is an array nums containing non-negative integers. We must consider every unique pair (i, j) where i < j, compute the Hamming distance between nums[i] and nums[j], and return the sum of all those distances.

The constraints are important:

nums.length <= 10^4
nums[i] <= 10^9

An array of size 10^4 contains roughly 50 million pairs in the worst case:

$$\frac{10000 \times 9999}{2}$$

This immediately tells us that a naive pairwise comparison solution may become too slow.

Another important observation is that 10^9 fits within 30 bits, so we only need to examine about 30 to 32 bit positions for each number.

Important edge cases include:

An array with only one element, the answer must be 0 because there are no pairs.
Arrays where all numbers are identical, every Hamming distance is 0.
Arrays where numbers differ in many bit positions, which maximizes the answer.
Arrays containing 0, since leading zero bits still participate in comparisons.

Approaches

Brute Force Approach

The most straightforward solution is to examine every pair of numbers.

For each pair:

Compute a XOR b
Count how many bits are set in the XOR result
Add that count to the total

This works because XOR produces 1 exactly at positions where the bits differ.

For example:

4 = 0100
14 = 1110

$$0100 \oplus 1110 = 1010$$

The XOR result has two set bits, so the Hamming distance is 2.

This approach is correct because it directly follows the definition of Hamming distance. However, it is too slow for large inputs because there are O(n^2) pairs.

With 10^4 numbers, this becomes roughly 50 million comparisons, which is inefficient.

Optimal Approach

The key observation is that Hamming distance contributions are independent for each bit position.

Instead of comparing pairs directly, we can process the array one bit at a time.

For a particular bit position:

Count how many numbers have a 1
Count how many numbers have a 0

Every combination of one 1 and one 0 contributes exactly 1 to the total Hamming distance.

So the contribution of a bit position is:

$$(\text{count of 1s}) \times (\text{count of 0s})$$

We sum this contribution across all bit positions.

This transforms the problem from pairwise comparison into bit frequency counting.

Since there are at most 32 bits and n numbers, the complexity becomes O(32 * n), which is effectively O(n).

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n² × 32)	O(1)	Compare every pair and count differing bits
Optimal	O(32 × n)	O(1)	Count 0s and 1s for each bit position

Algorithm Walkthrough

Optimal Bit Counting Algorithm

Initialize a variable total_distance to 0.

This variable stores the accumulated Hamming distance across all pairs. 2. Iterate through each bit position from 0 to 31.

Since the numbers are at most 10^9, 32 bits are sufficient to represent every value. 3. For the current bit position, count how many numbers contain a 1 at that bit.

We can check this using:

(num >> bit) & 1

This shifts the target bit to the least significant position and extracts it. 4. Compute how many numbers contain a 0 at that bit.

If ones is the count of numbers with a 1, then:

zeros = len(nums) - ones

Compute the contribution of this bit position.

Every number with a 1 forms a differing pair with every number containing a 0.

Therefore:

$$contribution = ones \times zeros$$ 6. Add the contribution to total_distance. 7. After processing all 32 bit positions, return total_distance.

Why it works

For any bit position, two numbers contribute 1 to the Hamming distance if and only if their bits differ at that position.

If there are ones numbers containing 1 and zeros numbers containing 0, then every (1,0) combination contributes exactly once. The number of such combinations is:

$$ones \times zeros$$

Summing this independently for every bit position gives the total Hamming distance across all pairs.

Python Solution

from typing import List

class Solution:
    def totalHammingDistance(self, nums: List[int]) -> int:
        total_distance = 0
        n = len(nums)

        for bit in range(32):
            ones = 0

            for num in nums:
                ones += (num >> bit) & 1

            zeros = n - ones
            total_distance += ones * zeros

        return total_distance

The implementation follows the bit-counting strategy directly.

The variable total_distance stores the final answer. We loop through all 32 possible bit positions because integers up to 10^9 fit within that range.

For each bit:

(num >> bit) & 1

extracts the value of the current bit. If the result is 1, we increment ones.

Once all numbers are checked for the current bit, we compute:

zeros = n - ones

The number of differing pairs for this bit is:

ones * zeros

because every 1 pairs with every 0.

Finally, we accumulate the contribution into total_distance and return it after all bits are processed.

Go Solution

func totalHammingDistance(nums []int) int {
    totalDistance := 0
    n := len(nums)

    for bit := 0; bit < 32; bit++ {
        ones := 0

        for _, num := range nums {
            ones += (num >> bit) & 1
        }

        zeros := n - ones
        totalDistance += ones * zeros
    }

    return totalDistance
}

The Go implementation is almost identical to the Python version.

A few Go-specific details are worth noting:

Go integers are fixed-width machine integers, but 32 iterations are sufficient because the constraints guarantee values up to 10^9.
The range loop:

for _, num := range nums

iterates through the slice efficiently.

No special handling for empty or nil slices is required because the constraints guarantee at least one element.
The result safely fits within a 32-bit integer according to the problem statement.

Worked Examples

Example 1

Input:

nums = [4, 14, 2]

Binary representations:

Number	Binary
4	0100
14	1110
2	0010

We process each bit independently.

Bit Position 0

Number	Bit 0
4	0
14	0
2	0

Ones	Zeros	Contribution
0	3	0

Running total: 0

Bit Position 1

Number	Bit 1
4	0
14	1
2	1

Ones	Zeros	Contribution
2	1	2

Running total: 2

Bit Position 2

Number	Bit 2
4	1
14	1
2	0

Ones	Zeros	Contribution
2	1	2

Running total: 4

Bit Position 3

Number	Bit 3
4	0
14	1
2	0

Ones	Zeros	Contribution
1	2	2

Running total: 6

All higher bits contribute 0.

Final answer:

Example 2

Input:

nums = [4, 14, 4]

Binary representations:

Number	Binary
4	0100
14	1110
4	0100

Bit Position Contributions

Bit	Ones	Zeros	Contribution
0	0	3	0
1	1	2	2
2	3	0	0
3	1	2	2

Total:

0 + 2 + 0 + 2 = 4

Final answer:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(32 × n)	We examine 32 bits for every number
Space	O(1)	Only a few counters are used

The algorithm processes every number once for each of the 32 bit positions. Since 32 is constant, the complexity simplifies to O(n).

The space complexity is constant because no auxiliary data structures proportional to input size are allocated.

Test Cases

from typing import List

class Solution:
    def totalHammingDistance(self, nums: List[int]) -> int:
        total_distance = 0
        n = len(nums)

        for bit in range(32):
            ones = 0

            for num in nums:
                ones += (num >> bit) & 1

            zeros = n - ones
            total_distance += ones * zeros

        return total_distance

solution = Solution()

assert solution.totalHammingDistance([4, 14, 2]) == 6
# Example 1 from the problem statement

assert solution.totalHammingDistance([4, 14, 4]) == 4
# Example 2 from the problem statement

assert solution.totalHammingDistance([0]) == 0
# Single element, no pairs exist

assert solution.totalHammingDistance([0, 0, 0]) == 0
# All numbers identical

assert solution.totalHammingDistance([1, 2]) == 2
# 01 vs 10 differ in both bits

assert solution.totalHammingDistance([1, 1, 2, 2]) == 8
# Multiple duplicate values

assert solution.totalHammingDistance([0, 1]) == 1
# Simplest differing pair

assert solution.totalHammingDistance([7, 7, 7, 7]) == 0
# All bits identical across all numbers

assert solution.totalHammingDistance([8, 4, 2, 1]) == 12
# Every number differs significantly

assert solution.totalHammingDistance([0, 1073741823]) == 30
# Maximum 30-bit difference

Test Case Summary

Test	Why
`[4,14,2]`	Validates the primary example
`[4,14,4]`	Verifies duplicate handling
`[0]`	Tests single-element edge case
`[0,0,0]`	Ensures identical numbers produce zero
`[1,2]`	Simple binary difference case
`[1,1,2,2]`	Tests repeated values
`[0,1]`	Smallest non-zero distance
`[7,7,7,7]`	Confirms no false contributions
`[8,4,2,1]`	Checks multiple differing bit positions
`[0,1073741823]`	Stress test with many differing bits

Edge Cases

Single Element Array

If the array contains only one number, there are no pairs to compare. A naive implementation might accidentally attempt pair generation or produce incorrect counts.

This implementation handles the case naturally because every bit position produces either:

ones = 0
zeros = 1

or:

ones = 1
zeros = 0

Both produce a contribution of 0.

All Numbers Identical

When every number is the same, the Hamming distance between every pair must be 0.

This can expose bugs in implementations that incorrectly count repeated pairs or mishandle bit extraction.

In the bit-counting solution, every bit position has either:

all 1s
all 0s`

Therefore:

$$ones \times zeros = 0$$

for every bit position.

Large Bit Differences

Arrays containing numbers with many differing bits can produce large totals.

For example:

[0, 1073741823]

Here:

1073741823 = 111111111111111111111111111111

The Hamming distance is 30.

This verifies that the implementation correctly processes higher bit positions and does not stop prematurely.

Arrays With Duplicates

Duplicate values are important because they contribute zero distance with each other but still contribute normally with other numbers.

For example:

[4, 14, 4]

The algorithm correctly handles duplicates because each bit contribution depends only on counts of zeros and ones, not on uniqueness of values.