LeetCode 2499 - Minimum Total Cost to Make Arrays Unequal
You are given two arrays, nums1 and nums2, both of length n. You are allowed to perform operations only on nums1. In a single operation, you may swap any two indices in nums1, and the cost of that operation is the sum of the two indices involved in the swap.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Greedy, Counting
Solution
Problem Understanding
You are given two arrays, nums1 and nums2, both of length n. You are allowed to perform operations only on nums1. In a single operation, you may swap any two indices in nums1, and the cost of that operation is the sum of the two indices involved in the swap.
The goal is to rearrange nums1 so that for every index i, the condition:
nums1[i] != nums2[i]
holds true.
Among all possible valid rearrangements, you must compute the minimum total swap cost. If no valid rearrangement exists, return -1.
The important detail is that swaps happen only inside nums1. The array nums2 never changes.
The constraints are large:
1 <= n <= 100000
This immediately rules out brute force permutation generation or graph search over all swap states. Any acceptable solution must run in approximately linear time.
The most important observation is that only indices where:
nums1[i] == nums2[i]
are problematic. These positions already violate the requirement and must be fixed.
Another subtle point is that some values may appear very frequently among the problematic positions. If one value dominates too many conflict positions, it may become impossible to rearrange the array without creating another conflict somewhere else.
Several edge cases are especially important:
- Arrays that already satisfy the condition should return
0. - Arrays where all values are identical may be impossible.
- Cases where one value dominates the conflict positions require additional helper indices.
- Very small arrays, especially size
1, can easily become impossible.
Approaches
Brute Force Approach
A brute force solution would attempt to generate all possible rearrangements of nums1 obtainable through swaps and check whether the resulting array satisfies:
nums1[i] != nums2[i]
for every index.
For each valid arrangement, we could compute the minimum swap cost required to transform the original array into that arrangement.
This approach is theoretically correct because it explores every possible final configuration. However, the number of permutations grows factorially:
O(n!)
Even for relatively small n, this becomes computationally impossible. Since the constraint allows up to 100000 elements, brute force is entirely infeasible.
Key Insight Behind the Optimal Solution
The crucial observation is that we only care about indices where:
nums1[i] == nums2[i]
These indices are already invalid and must participate in swaps.
Suppose we collect all such indices into a set of "bad positions". If we swap values among these bad positions carefully, we can often resolve all conflicts.
However, another issue appears when one value occurs too frequently among the bad positions. Let:
totalBad= number of bad positionsmaxFreq= highest frequency of any value among bad positions
If:
maxFreq * 2 > totalBad
then the dominant value cannot be distributed safely among only the bad positions. We must recruit additional indices from outside the bad set.
Those extra indices must satisfy:
nums1[i] != nums2[i]
already, and also:
nums1[i] != dominantValue
nums2[i] != dominantValue
so they can safely absorb the excess dominant occurrences.
The greedy strategy is to always use the smallest possible indices, because swap cost depends directly on index sums.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n!) | O(n!) | Explores all permutations/swaps |
| Optimal | O(n) | O(n) | Greedy counting with frequency tracking |
Algorithm Walkthrough
Step 1: Identify conflicting indices
Traverse the arrays and collect every index i where:
nums1[i] == nums2[i]
These are the positions that currently violate the requirement.
For each such index:
- Add its index to the total cost.
- Count the frequency of its value.
- Track which value appears most frequently.
We maintain:
total_swaps_needed, the number of conflicting indicescost, initialized as the sum of conflicting indicesfrequency[value]dominant_valuedominant_count
Step 2: Check whether the dominant value is problematic
If the dominant value appears too many times among the conflicting positions, swaps only within the conflicting set may still leave collisions.
Specifically, if:
dominant_count * 2 > total_swaps_needed
then we need extra helper indices.
The number of additional indices required is:
required = dominant_count * 2 - total_swaps_needed
Step 3: Find helper indices
Traverse all indices again.
We want indices satisfying:
nums1[i] != nums2[i]
nums1[i] != dominant_value
nums2[i] != dominant_value
These indices are already valid, and neither side contains the dominant value. They can safely participate in swaps.
For each helper index:
- Add its index to the total cost
- Decrease
required
We greedily take indices in increasing order, minimizing total cost.
Step 4: Verify feasibility
If after scanning all indices:
required > 0
then it is impossible to eliminate all conflicts.
Return -1.
Otherwise, return the accumulated cost.
Why it works
The algorithm works because every conflicting index must participate in at least one swap. Therefore, their indices must contribute to the final cost.
When one value dominates the conflicting set, rearranging only inside that set becomes impossible due to pigeonhole constraints. Additional helper indices are mathematically necessary to separate repeated dominant values.
The greedy choice of using the smallest valid helper indices minimizes the additional cost while preserving correctness.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def minimumTotalCost(self, nums1: List[int], nums2: List[int]) -> int:
frequency = defaultdict(int)
total_conflicts = 0
total_cost = 0
dominant_value = -1
dominant_count = 0
n = len(nums1)
for i in range(n):
if nums1[i] == nums2[i]:
value = nums1[i]
total_conflicts += 1
total_cost += i
frequency[value] += 1
if frequency[value] > dominant_count:
dominant_count = frequency[value]
dominant_value = value
required = dominant_count * 2 - total_conflicts
for i in range(n):
if required <= 0:
break
if nums1[i] != nums2[i]:
if nums1[i] != dominant_value and nums2[i] != dominant_value:
total_cost += i
required -= 1
if required > 0:
return -1
return total_cost
The implementation begins by scanning all indices and identifying conflicts where nums1[i] == nums2[i]. Each conflicting index must eventually participate in a swap, so its index is immediately added to the total cost.
At the same time, we maintain a frequency map of values appearing in conflicting positions. This allows us to determine the dominant value, meaning the value that appears most frequently among conflicts.
After the first pass, we compute how many additional helper indices are required. If the dominant value is not overly frequent, no extra work is needed.
If helper indices are necessary, the second pass searches for indices already satisfying the inequality condition and not containing the dominant value on either side. These indices safely absorb excess dominant occurrences.
Finally, if enough helper indices cannot be found, the configuration is impossible and we return -1.
Go Solution
package main
func minimumTotalCost(nums1 []int, nums2 []int) int64 {
frequency := make(map[int]int)
totalConflicts := 0
var totalCost int64 = 0
dominantValue := -1
dominantCount := 0
n := len(nums1)
for i := 0; i < n; i++ {
if nums1[i] == nums2[i] {
value := nums1[i]
totalConflicts++
totalCost += int64(i)
frequency[value]++
if frequency[value] > dominantCount {
dominantCount = frequency[value]
dominantValue = value
}
}
}
required := dominantCount*2 - totalConflicts
for i := 0; i < n && required > 0; i++ {
if nums1[i] != nums2[i] &&
nums1[i] != dominantValue &&
nums2[i] != dominantValue {
totalCost += int64(i)
required--
}
}
if required > 0 {
return -1
}
return totalCost
}
The Go implementation follows the same logic as the Python solution. A map[int]int replaces Python's defaultdict.
One important difference is that the return type is int64, because the sum of indices can exceed the range of a 32 bit integer when n is large.
The algorithm itself remains identical.
Worked Examples
Example 1
nums1 = [1,2,3,4,5]
nums2 = [1,2,3,4,5]
All positions conflict.
| Index | nums1[i] | nums2[i] | Conflict | Frequency |
|---|---|---|---|---|
| 0 | 1 | 1 | Yes | 1:1 |
| 1 | 2 | 2 | Yes | 2:1 |
| 2 | 3 | 3 | Yes | 3:1 |
| 3 | 4 | 4 | Yes | 4:1 |
| 4 | 5 | 5 | Yes | 5:1 |
After the first pass:
total_conflicts = 5
total_cost = 0 + 1 + 2 + 3 + 4 = 10
dominant_count = 1
Compute:
required = 1 * 2 - 5 = -3
No helper indices are needed.
Final answer:
10
Example 2
nums1 = [2,2,2,1,3]
nums2 = [1,2,2,3,3]
Conflicting indices:
| Index | nums1[i] | nums2[i] |
|---|---|---|
| 1 | 2 | 2 |
| 2 | 2 | 2 |
| 4 | 3 | 3 |
Initial state:
total_conflicts = 3
total_cost = 1 + 2 + 4 = 7
frequency[2] = 2
frequency[3] = 1
dominant_value = 2
dominant_count = 2
Compute:
required = 2 * 2 - 3 = 1
We need one helper index.
Scan non-conflicting indices:
| Index | nums1[i] | nums2[i] | Valid Helper |
|---|---|---|---|
| 0 | 2 | 1 | No |
| 3 | 1 | 3 | Yes |
Take index 3.
Updated cost:
7 + 3 = 10
Final answer:
10
Example 3
nums1 = [1,2,2]
nums2 = [1,2,2]
All positions conflict.
| Value | Frequency |
|---|---|
| 1 | 1 |
| 2 | 2 |
State:
total_conflicts = 3
total_cost = 0 + 1 + 2 = 3
dominant_count = 2
Compute:
required = 2 * 2 - 3 = 1
No non-conflicting indices exist, so we cannot find a helper index.
Therefore:
return -1
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Two linear scans through the arrays |
| Space | O(n) | Frequency hash map may store up to n values |
The algorithm performs only two passes over the input arrays. Each operation inside the loops is constant time on average due to hash map access.
The extra memory comes from storing frequencies of conflicting values.
Test Cases
sol = Solution()
# Provided examples
assert sol.minimumTotalCost([1,2,3,4,5], [1,2,3,4,5]) == 10 # all positions conflict
assert sol.minimumTotalCost([2,2,2,1,3], [1,2,2,3,3]) == 10 # requires helper index
assert sol.minimumTotalCost([1,2,2], [1,2,2]) == -1 # impossible case
# Already valid arrays
assert sol.minimumTotalCost([1,2,3], [3,1,2]) == 0 # no conflicts initially
# Single element impossible
assert sol.minimumTotalCost([1], [1]) == -1 # cannot fix single conflict
# Single element already valid
assert sol.minimumTotalCost([1], [2]) == 0 # already satisfies condition
# Dominant value but solvable
assert sol.minimumTotalCost([1,1,1,2], [1,1,2,3]) == 1 # one helper needed
# Multiple helper indices needed
assert sol.minimumTotalCost([2,2,2,2,1,3], [2,2,2,1,2,4]) == 10 # several helpers
# No conflicts
assert sol.minimumTotalCost([4,5,6], [1,2,3]) == 0 # no swaps required
# Large repeated values impossible
assert sol.minimumTotalCost([7,7,7,7], [7,7,7,7]) == -1 # impossible dominance
# Mixed frequencies
assert sol.minimumTotalCost([1,2,1,2,3], [1,3,1,4,5]) == 2 # minimal helper usage
Test Summary
| Test | Why |
|---|---|
| All positions conflict | Validates base conflict handling |
| Requires helper index | Tests dominant value balancing |
| Impossible repeated value case | Ensures correct -1 detection |
| Already valid arrays | Confirms zero-cost handling |
| Single element conflict | Smallest impossible input |
| Single valid element | Smallest valid input |
| Dominant value solvable | Tests helper selection |
| Multiple helpers needed | Verifies repeated balancing logic |
| No conflicts | Confirms no unnecessary work |
| Large repeated values impossible | Tests extreme dominance |
| Mixed frequencies | Validates general correctness |
Edge Cases
One important edge case occurs when the arrays already satisfy the requirement. In this situation, there are no conflicting indices, so no swaps are needed. A buggy implementation might still attempt unnecessary processing or produce a nonzero answer. This implementation correctly returns 0 because the initial conflict count remains zero.
Another critical edge case happens when one value dominates the conflicting positions too heavily. For example:
nums1 = [2,2,2]
nums2 = [2,2,2]
Even after arbitrary swaps, every position still contains 2, making the problem impossible. The algorithm detects this through the dominance formula:
dominant_count * 2 > total_conflicts
and correctly returns -1 when insufficient helper indices exist.
A third subtle edge case involves helper index selection. A helper index must not contain the dominant value in either array. Otherwise, adding that index would not actually reduce the dominance problem. The implementation carefully checks:
nums1[i] != dominant_value
nums2[i] != dominant_value
before accepting a helper index, preventing invalid swaps and ensuring correctness.