LeetCode 1722 - Minimize Hamming Distance After Swap Operations

Difficulty: 🟡 Medium
Topics: Array, Depth-First Search, Union-Find

Solution

Problem Understanding

The problem asks us to find the minimum Hamming distance between two arrays, source and target, after performing any number of swaps on source at positions allowed by allowedSwaps. The Hamming distance is defined as the number of indices i for which source[i] != target[i].

The input consists of two integer arrays of length n and a list of allowed swaps. Each allowed swap is a pair of indices [ai, bi], meaning elements at these positions can be swapped any number of times and in any order. The goal is not to output the final array but to compute the minimum possible Hamming distance achievable by exploiting these swaps.

The constraints indicate n and the length of allowedSwaps can be up to 10^5, so naive brute-force approaches that try all possible swap permutations are infeasible. We need a solution that scales linearly or near-linearly. Important edge cases include when allowedSwaps is empty (no swaps allowed), when all elements are already matching (minimum distance is zero), or when source and target have duplicates that can be rearranged.

The key observation is that swaps define connected components in the array. Within each connected component, the elements can be permuted freely. Therefore, the minimum Hamming distance is determined by comparing the multiset of values in each component of source with the corresponding values in target.

Approaches

The brute-force approach would attempt to simulate all possible sequences of swaps, generating every permutation of source allowed by allowedSwaps, then computing Hamming distances for each. This approach is correct but impractical because the number of permutations grows exponentially with the size of components.

The optimal approach uses Union-Find (Disjoint Set Union) to group indices into connected components according to allowedSwaps. Within each component, we can freely rearrange elements of source. For each component, we count the occurrences of elements in source and target, then subtract the matching counts to compute the minimum mismatches. This avoids enumerating permutations and works efficiently for large inputs.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O((n!)^k)	O(n)	Generates all permutations of connected components; infeasible
Optimal	O(n + m)	O(n)	Uses Union-Find to group indices, counts mismatches per component, scalable

Here, m is the number of allowed swaps.

Algorithm Walkthrough

Initialize Union-Find for all indices in source. Each index starts as its own parent.
Process allowed swaps. For each [ai, bi], union the sets of ai and bi. This groups indices into connected components where swaps are allowed.
Group indices by component. After all unions, iterate through each index and map the root parent to a list of indices in that component.
Count element frequencies per component. For each component, count the number of occurrences of each value in source and target.
Compute Hamming distance contribution per component. For each value, calculate the difference between the counts in source and target. Sum the excess occurrences to get mismatches in that component.
Sum mismatches across components. The sum is the minimum Hamming distance after any allowed swaps.

Why it works: Within each connected component, any permutation of source values is achievable. Comparing the counts of values in source and target within the component guarantees that we account for all possible swaps efficiently. Any remaining mismatch after aligning counts corresponds to positions that cannot be corrected through swaps.

Python Solution

from collections import defaultdict, Counter
from typing import List

class Solution:
    def minimumHammingDistance(self, source: List[int], target: List[int], allowedSwaps: List[List[int]]) -> int:
        n = len(source)
        
        # Union-Find setup
        parent = list(range(n))
        
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        
        def union(x, y):
            px, py = find(x), find(y)
            if px != py:
                parent[py] = px
        
        # Union allowed swaps
        for a, b in allowedSwaps:
            union(a, b)
        
        # Group indices by component
        components = defaultdict(list)
        for i in range(n):
            root = find(i)
            components[root].append(i)
        
        # Count mismatches per component
        result = 0
        for indices in components.values():
            src_count = Counter(source[i] for i in indices)
            tgt_count = Counter(target[i] for i in indices)
            for val in src_count:
                if val in tgt_count:
                    matched = min(src_count[val], tgt_count[val])
                    src_count[val] -= matched
                    tgt_count[val] -= matched
            # Remaining counts in src_count contribute to Hamming distance
            result += sum(src_count.values())
        
        return result

Implementation Notes: We use path compression in find to optimize Union-Find. Components are grouped via a dictionary mapping root to index lists. Counting mismatches with Counter ensures correct handling of duplicates.

Go Solution

package main

func minimumHammingDistance(source []int, target []int, allowedSwaps [][]int) int {
    n := len(source)
    parent := make([]int, n)
    for i := range parent {
        parent[i] = i
    }

    var find func(int) int
    find = func(x int) int {
        if parent[x] != x {
            parent[x] = find(parent[x])
        }
        return parent[x]
    }

    union := func(x, y int) {
        px, py := find(x), find(y)
        if px != py {
            parent[py] = px
        }
    }

    for _, swap := range allowedSwaps {
        union(swap[0], swap[1])
    }

    components := make(map[int][]int)
    for i := 0; i < n; i++ {
        root := find(i)
        components[root] = append(components[root], i)
    }

    result := 0
    for _, indices := range components {
        srcCount := make(map[int]int)
        tgtCount := make(map[int]int)
        for _, idx := range indices {
            srcCount[source[idx]]++
            tgtCount[target[idx]]++
        }
        for val, count := range srcCount {
            if tgtCount[val] > 0 {
                matched := min(count, tgtCount[val])
                srcCount[val] -= matched
            }
        }
        for _, count := range srcCount {
            result += count
        }
    }
    return result
}

func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

Go-Specific Notes: Go lacks a built-in Counter like Python, so we use maps. Union-Find is implemented with recursion and path compression. We handle slices and maps carefully, avoiding nil vs empty differences.

Worked Examples

Example 1

source = [1,2,3,4], target = [2,1,4,5], allowedSwaps = [[0,1],[2,3]]

Union-Find components:

Index	Root
0	0
1	0
2	2
3	2

Components: {0: [0,1], 2: [2,3]}

Counts:

Component [0,1]: source {1:1,2:1}, target {2:1,1:1} → all matched → 0 mismatches
Component [2,3]: source {3:1,4:1}, target {4:1,5:1} → 3 in source unmatched → 1 mismatch

Result: 1

Example 2

source = [1,2,3,4], target = [1,3,2,4], allowedSwaps = []

Each index is its own component. Compare source[i] vs target[i]:

Indices 0: 1==1 → matched
Index 1: 2!=3 → mismatch
Index 2: 3!=2 → mismatch
Index 3: 4==4 → matched

Result: 2

Example 3

source = [5,1,2,4,3], target = [1,5,4,2,3], allowedSwaps = [[0,4],[4,2],[1,3],[1,4]]

All indices belong to a single connected component due to swaps. Count multisets:

source counts: {5:1,1:1,2:1,4:1,3:1}
target counts: {1:1,5:1,4:1,2:1,3:1} → all counts matched → 0 mismatches

Result: 0

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n + m)	n for iterating indices, m for processing allowedSwaps, counting per component is linear
Space	O(n)	parent array