LeetCode 1722 - Minimize Hamming Distance After Swap Operations
The problem asks us to find the minimum Hamming distance between two arrays, source and target, after performing any number of swaps on source at positions allowed by allowedSwaps. The Hamming distance is defined as the number of indices i for which source[i] != target[i].
Difficulty: 🟡 Medium
Topics: Array, Depth-First Search, Union-Find
Solution
Problem Understanding
The problem asks us to find the minimum Hamming distance between two arrays, source and target, after performing any number of swaps on source at positions allowed by allowedSwaps. The Hamming distance is defined as the number of indices i for which source[i] != target[i].
The input consists of two integer arrays of length n and a list of allowed swaps. Each allowed swap is a pair of indices [ai, bi], meaning elements at these positions can be swapped any number of times and in any order. The goal is not to output the final array but to compute the minimum possible Hamming distance achievable by exploiting these swaps.
The constraints indicate n and the length of allowedSwaps can be up to 10^5, so naive brute-force approaches that try all possible swap permutations are infeasible. We need a solution that scales linearly or near-linearly. Important edge cases include when allowedSwaps is empty (no swaps allowed), when all elements are already matching (minimum distance is zero), or when source and target have duplicates that can be rearranged.
The key observation is that swaps define connected components in the array. Within each connected component, the elements can be permuted freely. Therefore, the minimum Hamming distance is determined by comparing the multiset of values in each component of source with the corresponding values in target.
Approaches
The brute-force approach would attempt to simulate all possible sequences of swaps, generating every permutation of source allowed by allowedSwaps, then computing Hamming distances for each. This approach is correct but impractical because the number of permutations grows exponentially with the size of components.
The optimal approach uses Union-Find (Disjoint Set Union) to group indices into connected components according to allowedSwaps. Within each component, we can freely rearrange elements of source. For each component, we count the occurrences of elements in source and target, then subtract the matching counts to compute the minimum mismatches. This avoids enumerating permutations and works efficiently for large inputs.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O((n!)^k) | O(n) | Generates all permutations of connected components; infeasible |
| Optimal | O(n + m) | O(n) | Uses Union-Find to group indices, counts mismatches per component, scalable |
Here, m is the number of allowed swaps.
Algorithm Walkthrough
- Initialize Union-Find for all indices in
source. Each index starts as its own parent. - Process allowed swaps. For each
[ai, bi], union the sets ofaiandbi. This groups indices into connected components where swaps are allowed. - Group indices by component. After all unions, iterate through each index and map the root parent to a list of indices in that component.
- Count element frequencies per component. For each component, count the number of occurrences of each value in
sourceandtarget. - Compute Hamming distance contribution per component. For each value, calculate the difference between the counts in
sourceandtarget. Sum the excess occurrences to get mismatches in that component. - Sum mismatches across components. The sum is the minimum Hamming distance after any allowed swaps.
Why it works: Within each connected component, any permutation of source values is achievable. Comparing the counts of values in source and target within the component guarantees that we account for all possible swaps efficiently. Any remaining mismatch after aligning counts corresponds to positions that cannot be corrected through swaps.
Python Solution
from collections import defaultdict, Counter
from typing import List
class Solution:
def minimumHammingDistance(self, source: List[int], target: List[int], allowedSwaps: List[List[int]]) -> int:
n = len(source)
# Union-Find setup
parent = list(range(n))
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(x, y):
px, py = find(x), find(y)
if px != py:
parent[py] = px
# Union allowed swaps
for a, b in allowedSwaps:
union(a, b)
# Group indices by component
components = defaultdict(list)
for i in range(n):
root = find(i)
components[root].append(i)
# Count mismatches per component
result = 0
for indices in components.values():
src_count = Counter(source[i] for i in indices)
tgt_count = Counter(target[i] for i in indices)
for val in src_count:
if val in tgt_count:
matched = min(src_count[val], tgt_count[val])
src_count[val] -= matched
tgt_count[val] -= matched
# Remaining counts in src_count contribute to Hamming distance
result += sum(src_count.values())
return result
Implementation Notes: We use path compression in find to optimize Union-Find. Components are grouped via a dictionary mapping root to index lists. Counting mismatches with Counter ensures correct handling of duplicates.
Go Solution
package main
func minimumHammingDistance(source []int, target []int, allowedSwaps [][]int) int {
n := len(source)
parent := make([]int, n)
for i := range parent {
parent[i] = i
}
var find func(int) int
find = func(x int) int {
if parent[x] != x {
parent[x] = find(parent[x])
}
return parent[x]
}
union := func(x, y int) {
px, py := find(x), find(y)
if px != py {
parent[py] = px
}
}
for _, swap := range allowedSwaps {
union(swap[0], swap[1])
}
components := make(map[int][]int)
for i := 0; i < n; i++ {
root := find(i)
components[root] = append(components[root], i)
}
result := 0
for _, indices := range components {
srcCount := make(map[int]int)
tgtCount := make(map[int]int)
for _, idx := range indices {
srcCount[source[idx]]++
tgtCount[target[idx]]++
}
for val, count := range srcCount {
if tgtCount[val] > 0 {
matched := min(count, tgtCount[val])
srcCount[val] -= matched
}
}
for _, count := range srcCount {
result += count
}
}
return result
}
func min(a, b int) int {
if a < b {
return a
}
return b
}
Go-Specific Notes: Go lacks a built-in Counter like Python, so we use maps. Union-Find is implemented with recursion and path compression. We handle slices and maps carefully, avoiding nil vs empty differences.
Worked Examples
Example 1
source = [1,2,3,4], target = [2,1,4,5], allowedSwaps = [[0,1],[2,3]]
Union-Find components:
| Index | Root |
|---|---|
| 0 | 0 |
| 1 | 0 |
| 2 | 2 |
| 3 | 2 |
Components: {0: [0,1], 2: [2,3]}
Counts:
- Component [0,1]: source {1:1,2:1}, target {2:1,1:1} → all matched → 0 mismatches
- Component [2,3]: source {3:1,4:1}, target {4:1,5:1} → 3 in source unmatched → 1 mismatch
Result: 1
Example 2
source = [1,2,3,4], target = [1,3,2,4], allowedSwaps = []
Each index is its own component. Compare source[i] vs target[i]:
- Indices 0: 1==1 → matched
- Index 1: 2!=3 → mismatch
- Index 2: 3!=2 → mismatch
- Index 3: 4==4 → matched
Result: 2
Example 3
source = [5,1,2,4,3], target = [1,5,4,2,3], allowedSwaps = [[0,4],[4,2],[1,3],[1,4]]
All indices belong to a single connected component due to swaps. Count multisets:
- source counts: {5:1,1:1,2:1,4:1,3:1}
- target counts: {1:1,5:1,4:1,2:1,3:1} → all counts matched → 0 mismatches
Result: 0
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | n for iterating indices, m for processing allowedSwaps, counting per component is linear |
| Space | O(n) | parent array |