LeetCode 2499 - Minimum Total Cost to Make Arrays Unequal

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Greedy, Counting

Solution

Problem Understanding

You are given two arrays, nums1 and nums2, both of length n. You are allowed to perform operations only on nums1. In a single operation, you may swap any two indices in nums1, and the cost of that operation is the sum of the two indices involved in the swap.

The goal is to rearrange nums1 so that for every index i, the condition:

nums1[i] != nums2[i]

holds true.

Among all possible valid rearrangements, you must compute the minimum total swap cost. If no valid rearrangement exists, return -1.

The important detail is that swaps happen only inside nums1. The array nums2 never changes.

The constraints are large:

1 <= n <= 100000

This immediately rules out brute force permutation generation or graph search over all swap states. Any acceptable solution must run in approximately linear time.

The most important observation is that only indices where:

nums1[i] == nums2[i]

are problematic. These positions already violate the requirement and must be fixed.

Another subtle point is that some values may appear very frequently among the problematic positions. If one value dominates too many conflict positions, it may become impossible to rearrange the array without creating another conflict somewhere else.

Several edge cases are especially important:

Arrays that already satisfy the condition should return 0.
Arrays where all values are identical may be impossible.
Cases where one value dominates the conflict positions require additional helper indices.
Very small arrays, especially size 1, can easily become impossible.

Approaches

Brute Force Approach

A brute force solution would attempt to generate all possible rearrangements of nums1 obtainable through swaps and check whether the resulting array satisfies:

nums1[i] != nums2[i]

for every index.

For each valid arrangement, we could compute the minimum swap cost required to transform the original array into that arrangement.

This approach is theoretically correct because it explores every possible final configuration. However, the number of permutations grows factorially:

O(n!)

Even for relatively small n, this becomes computationally impossible. Since the constraint allows up to 100000 elements, brute force is entirely infeasible.

Key Insight Behind the Optimal Solution

The crucial observation is that we only care about indices where:

nums1[i] == nums2[i]

These indices are already invalid and must participate in swaps.

Suppose we collect all such indices into a set of "bad positions". If we swap values among these bad positions carefully, we can often resolve all conflicts.

However, another issue appears when one value occurs too frequently among the bad positions. Let:

totalBad = number of bad positions
maxFreq = highest frequency of any value among bad positions

If:

maxFreq * 2 > totalBad

then the dominant value cannot be distributed safely among only the bad positions. We must recruit additional indices from outside the bad set.

Those extra indices must satisfy:

nums1[i] != nums2[i]

already, and also:

nums1[i] != dominantValue
nums2[i] != dominantValue

so they can safely absorb the excess dominant occurrences.

The greedy strategy is to always use the smallest possible indices, because swap cost depends directly on index sums.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n!)	O(n!)	Explores all permutations/swaps
Optimal	O(n)	O(n)	Greedy counting with frequency tracking

Algorithm Walkthrough

Step 1: Identify conflicting indices

Traverse the arrays and collect every index i where:

nums1[i] == nums2[i]

These are the positions that currently violate the requirement.

For each such index:

Add its index to the total cost.
Count the frequency of its value.
Track which value appears most frequently.

We maintain:

total_swaps_needed, the number of conflicting indices
cost, initialized as the sum of conflicting indices
frequency[value]
dominant_value
dominant_count

Step 2: Check whether the dominant value is problematic

If the dominant value appears too many times among the conflicting positions, swaps only within the conflicting set may still leave collisions.

Specifically, if:

dominant_count * 2 > total_swaps_needed

then we need extra helper indices.

The number of additional indices required is:

required = dominant_count * 2 - total_swaps_needed

Step 3: Find helper indices

Traverse all indices again.

We want indices satisfying:

nums1[i] != nums2[i]
nums1[i] != dominant_value
nums2[i] != dominant_value

These indices are already valid, and neither side contains the dominant value. They can safely participate in swaps.

For each helper index:

Add its index to the total cost
Decrease required

We greedily take indices in increasing order, minimizing total cost.

Step 4: Verify feasibility

If after scanning all indices:

required > 0

then it is impossible to eliminate all conflicts.

Return -1.

Otherwise, return the accumulated cost.

Why it works

The algorithm works because every conflicting index must participate in at least one swap. Therefore, their indices must contribute to the final cost.

When one value dominates the conflicting set, rearranging only inside that set becomes impossible due to pigeonhole constraints. Additional helper indices are mathematically necessary to separate repeated dominant values.

The greedy choice of using the smallest valid helper indices minimizes the additional cost while preserving correctness.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def minimumTotalCost(self, nums1: List[int], nums2: List[int]) -> int:
        frequency = defaultdict(int)

        total_conflicts = 0
        total_cost = 0

        dominant_value = -1
        dominant_count = 0

        n = len(nums1)

        for i in range(n):
            if nums1[i] == nums2[i]:
                value = nums1[i]

                total_conflicts += 1
                total_cost += i

                frequency[value] += 1

                if frequency[value] > dominant_count:
                    dominant_count = frequency[value]
                    dominant_value = value

        required = dominant_count * 2 - total_conflicts

        for i in range(n):
            if required <= 0:
                break

            if nums1[i] != nums2[i]:
                if nums1[i] != dominant_value and nums2[i] != dominant_value:
                    total_cost += i
                    required -= 1

        if required > 0:
            return -1

        return total_cost

The implementation begins by scanning all indices and identifying conflicts where nums1[i] == nums2[i]. Each conflicting index must eventually participate in a swap, so its index is immediately added to the total cost.

At the same time, we maintain a frequency map of values appearing in conflicting positions. This allows us to determine the dominant value, meaning the value that appears most frequently among conflicts.

After the first pass, we compute how many additional helper indices are required. If the dominant value is not overly frequent, no extra work is needed.

If helper indices are necessary, the second pass searches for indices already satisfying the inequality condition and not containing the dominant value on either side. These indices safely absorb excess dominant occurrences.

Finally, if enough helper indices cannot be found, the configuration is impossible and we return -1.

Go Solution

package main

func minimumTotalCost(nums1 []int, nums2 []int) int64 {
	frequency := make(map[int]int)

	totalConflicts := 0
	var totalCost int64 = 0

	dominantValue := -1
	dominantCount := 0

	n := len(nums1)

	for i := 0; i < n; i++ {
		if nums1[i] == nums2[i] {
			value := nums1[i]

			totalConflicts++
			totalCost += int64(i)

			frequency[value]++

			if frequency[value] > dominantCount {
				dominantCount = frequency[value]
				dominantValue = value
			}
		}
	}

	required := dominantCount*2 - totalConflicts

	for i := 0; i < n && required > 0; i++ {
		if nums1[i] != nums2[i] &&
			nums1[i] != dominantValue &&
			nums2[i] != dominantValue {

			totalCost += int64(i)
			required--
		}
	}

	if required > 0 {
		return -1
	}

	return totalCost
}

The Go implementation follows the same logic as the Python solution. A map[int]int replaces Python's defaultdict.

One important difference is that the return type is int64, because the sum of indices can exceed the range of a 32 bit integer when n is large.

The algorithm itself remains identical.

Worked Examples

Example 1

nums1 = [1,2,3,4,5]
nums2 = [1,2,3,4,5]

All positions conflict.

Index	nums1[i]	nums2[i]	Conflict	Frequency
0	1	1	Yes	1:1
1	2	2	Yes	2:1
2	3	3	Yes	3:1
3	4	4	Yes	4:1
4	5	5	Yes	5:1

After the first pass:

total_conflicts = 5
total_cost = 0 + 1 + 2 + 3 + 4 = 10
dominant_count = 1

Compute:

required = 1 * 2 - 5 = -3

No helper indices are needed.

Final answer:

Example 2

nums1 = [2,2,2,1,3]
nums2 = [1,2,2,3,3]

Conflicting indices:

Index	nums1[i]	nums2[i]
1	2	2
2	2	2
4	3	3

Initial state:

total_conflicts = 3
total_cost = 1 + 2 + 4 = 7
frequency[2] = 2
frequency[3] = 1
dominant_value = 2
dominant_count = 2

Compute:

required = 2 * 2 - 3 = 1

We need one helper index.

Scan non-conflicting indices:

Index	nums1[i]	nums2[i]	Valid Helper
0	2	1	No
3	1	3	Yes

Take index 3.

Updated cost:

7 + 3 = 10

Final answer:

Example 3

nums1 = [1,2,2]
nums2 = [1,2,2]

All positions conflict.

Value	Frequency
1	1
2	2

State:

total_conflicts = 3
total_cost = 0 + 1 + 2 = 3
dominant_count = 2

Compute:

required = 2 * 2 - 3 = 1

No non-conflicting indices exist, so we cannot find a helper index.

Therefore:

return -1

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Two linear scans through the arrays
Space	O(n)	Frequency hash map may store up to n values

The algorithm performs only two passes over the input arrays. Each operation inside the loops is constant time on average due to hash map access.

The extra memory comes from storing frequencies of conflicting values.

Test Cases

sol = Solution()

# Provided examples
assert sol.minimumTotalCost([1,2,3,4,5], [1,2,3,4,5]) == 10  # all positions conflict
assert sol.minimumTotalCost([2,2,2,1,3], [1,2,2,3,3]) == 10  # requires helper index
assert sol.minimumTotalCost([1,2,2], [1,2,2]) == -1  # impossible case

# Already valid arrays
assert sol.minimumTotalCost([1,2,3], [3,1,2]) == 0  # no conflicts initially

# Single element impossible
assert sol.minimumTotalCost([1], [1]) == -1  # cannot fix single conflict

# Single element already valid
assert sol.minimumTotalCost([1], [2]) == 0  # already satisfies condition

# Dominant value but solvable
assert sol.minimumTotalCost([1,1,1,2], [1,1,2,3]) == 1  # one helper needed

# Multiple helper indices needed
assert sol.minimumTotalCost([2,2,2,2,1,3], [2,2,2,1,2,4]) == 10  # several helpers

# No conflicts
assert sol.minimumTotalCost([4,5,6], [1,2,3]) == 0  # no swaps required

# Large repeated values impossible
assert sol.minimumTotalCost([7,7,7,7], [7,7,7,7]) == -1  # impossible dominance

# Mixed frequencies
assert sol.minimumTotalCost([1,2,1,2,3], [1,3,1,4,5]) == 2  # minimal helper usage

Test Summary

Test	Why
All positions conflict	Validates base conflict handling
Requires helper index	Tests dominant value balancing
Impossible repeated value case	Ensures correct `-1` detection
Already valid arrays	Confirms zero-cost handling
Single element conflict	Smallest impossible input
Single valid element	Smallest valid input
Dominant value solvable	Tests helper selection
Multiple helpers needed	Verifies repeated balancing logic
No conflicts	Confirms no unnecessary work
Large repeated values impossible	Tests extreme dominance
Mixed frequencies	Validates general correctness

Edge Cases

One important edge case occurs when the arrays already satisfy the requirement. In this situation, there are no conflicting indices, so no swaps are needed. A buggy implementation might still attempt unnecessary processing or produce a nonzero answer. This implementation correctly returns 0 because the initial conflict count remains zero.

Another critical edge case happens when one value dominates the conflicting positions too heavily. For example:

nums1 = [2,2,2]
nums2 = [2,2,2]

Even after arbitrary swaps, every position still contains 2, making the problem impossible. The algorithm detects this through the dominance formula:

dominant_count * 2 > total_conflicts

and correctly returns -1 when insufficient helper indices exist.

A third subtle edge case involves helper index selection. A helper index must not contain the dominant value in either array. Otherwise, adding that index would not actually reduce the dominance problem. The implementation carefully checks:

nums1[i] != dominant_value
nums2[i] != dominant_value

before accepting a helper index, preventing invalid swaps and ensuring correctness.