LeetCode 2547 - Minimum Cost to Split an Array
The problem asks us to split the array nums into one or more contiguous non-empty subarrays such that the total cost is minimized. For every subarray, we define a special quantity called its importance value.
Difficulty: 🔴 Hard
Topics: Array, Hash Table, Dynamic Programming, Counting
Solution
Problem Understanding
The problem asks us to split the array nums into one or more contiguous non-empty subarrays such that the total cost is minimized.
For every subarray, we define a special quantity called its importance value. To compute it, we first construct the trimmed version of the subarray by removing every element that appears exactly once. Only duplicated values remain.
For example:
[1,2,3]becomes[]because every value appears once.[1,2,1,3,3]becomes[1,1,3,3]because1and3appear multiple times.[4,4,4]remains[4,4,4].
The importance value of a subarray is:
$$k + \text{length of trimmed(subarray)}$$
The final answer is the minimum possible sum of importance values across all chosen subarrays.
The constraints are important:
nums.length <= 1000nums[i] < nums.length
An array size of 1000 is too large for exponential partition enumeration, because there are $2^{n-1}$ possible ways to split an array. However, it is small enough for quadratic dynamic programming solutions.
The value range of nums[i] is also useful. Since values are bounded by n, we can use arrays instead of hash maps for frequency counting if desired.
Several edge cases are especially important:
- Arrays where all values are unique. In this case every trimmed length is zero, so the best strategy may involve many small subarrays or one large subarray depending on
k. - Arrays where every value repeats many times. Then trimmed lengths become large, and splitting strategically can reduce duplicate penalties.
- Very large
k. Since every split adds another fixedk, fewer subarrays become preferable. - Very small
k. Splitting aggressively may become optimal because reducing duplicate counts matters more than the fixed partition cost. - Single-element arrays. These always have trimmed length zero.
Approaches
Brute Force Approach
The brute force solution tries every possible way to partition the array.
For an array of length n, there are n - 1 possible cut positions. Every position can either contain a split or not contain a split, so the total number of partitions is:
$$2^{n-1}$$
For each partition, we compute the importance value of every subarray by counting frequencies and determining how many elements belong to duplicated values.
This approach is correct because it exhaustively evaluates all valid partitions and selects the minimum cost.
Unfortunately, it is far too slow. Even for n = 30, the number of partitions becomes enormous. With n = 1000, exhaustive search is completely infeasible.
Optimal Dynamic Programming Approach
The key observation is that the problem has optimal substructure.
Suppose we know the minimum cost for splitting the prefix nums[0:i]. Then, for every possible last subarray nums[j:i], we can compute:
$$dp[i] = \min(dp[j] + \text{cost}(j,i))$$
where:
dp[i]represents the minimum cost to split the firstielements.cost(j,i)is the importance value of subarraynums[j:i].
The challenge is computing cost(j,i) efficiently.
For a subarray, the trimmed length counts all elements belonging to values that appear at least twice.
We can maintain frequencies incrementally while expanding the subarray backward from position i.
When a number appears:
- For the first time, it contributes nothing.
- For the second time, both copies now belong in the trimmed array, so contribution increases by
2. - For the third or later time, each additional occurrence contributes
1.
This lets us compute trimmed lengths in constant time while extending the subarray.
The final complexity becomes quadratic, which is acceptable for n <= 1000.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | $O(2^n \cdot n)$ | $O(n)$ | Enumerates all partitions |
| Optimal Dynamic Programming | $O(n^2)$ | $O(n)$ | Builds minimum costs incrementally |
Algorithm Walkthrough
- Create a DP array
dpof lengthn + 1.
Here, dp[i] stores the minimum cost to split the first i elements of nums.
Initialize:
dp[0] = 0
because splitting an empty prefix costs nothing.
2. Initialize every other value in dp to infinity.
This allows us to minimize over candidate transitions.
3. Iterate through every ending position i from 1 to n.
We will compute the best partition ending at index i - 1.
4. For each i, expand the last subarray backward.
We consider every possible starting index j for the final subarray nums[j:i].
5. Maintain a frequency counter while moving backward.
This lets us update the trimmed length incrementally instead of recomputing frequencies from scratch. 6. Track the current duplicate contribution.
Let trimmed_length represent the size of the trimmed array.
When processing value x:
- If frequency becomes
1, contribution stays unchanged. - If frequency becomes
2, contribution increases by2. - If frequency becomes
3or more, contribution increases by1.
- Compute the subarray importance value.
The current subarray cost is:
$$k + \text{trimmed_length}$$ 8. Update the DP transition.
For every possible j:
$$dp[i] = \min(dp[i], dp[j] + k + \text{trimmed_length})$$
9. After processing all positions, return dp[n].
Why it works
The algorithm works because every optimal partition has a well-defined final subarray. If the last subarray starts at position j, then everything before j must itself be optimally partitioned. Otherwise, replacing it with a cheaper partition would improve the total solution.
Therefore:
$$dp[i] = \min_{0 \le j < i}(dp[j] + \text{cost}(j,i))$$
correctly captures the optimal answer for every prefix.
The frequency counting logic also correctly computes trimmed lengths because it exactly tracks how many elements belong to values appearing at least twice.
Python Solution
from typing import List
class Solution:
def minCost(self, nums: List[int], k: int) -> int:
n = len(nums)
dp = [float('inf')] * (n + 1)
dp[0] = 0
for i in range(1, n + 1):
freq = [0] * n
trimmed_length = 0
for j in range(i - 1, -1, -1):
value = nums[j]
freq[value] += 1
if freq[value] == 2:
trimmed_length += 2
elif freq[value] > 2:
trimmed_length += 1
cost = k + trimmed_length
dp[i] = min(dp[i], dp[j] + cost)
return dp[n]
The implementation directly follows the dynamic programming formulation.
The dp array stores the minimum cost for each prefix length. For every ending position i, we examine all possible starting positions j for the last subarray.
The inner loop moves backward from i - 1 to 0. During this traversal, we maintain frequency counts for the current subarray.
The trimmed_length variable is updated incrementally:
- The second occurrence of a value adds
2because both copies now belong in the trimmed array. - Every occurrence after that adds
1.
This avoids recomputing trimmed arrays repeatedly.
At every step, we compute the cost of the current subarray and update the DP state.
Go Solution
package main
import "math"
func minCost(nums []int, k int) int {
n := len(nums)
dp := make([]int, n+1)
for i := 1; i <= n; i++ {
dp[i] = math.MaxInt32
}
for i := 1; i <= n; i++ {
freq := make([]int, n)
trimmedLength := 0
for j := i - 1; j >= 0; j-- {
value := nums[j]
freq[value]++
if freq[value] == 2 {
trimmedLength += 2
} else if freq[value] > 2 {
trimmedLength++
}
cost := k + trimmedLength
if dp[j]+cost < dp[i] {
dp[i] = dp[j] + cost
}
}
}
return dp[n]
}
The Go implementation mirrors the Python logic closely.
Instead of Python lists initialized with infinity, Go uses math.MaxInt32 as a large sentinel value.
Frequency tracking uses a slice of integers because the constraints guarantee:
$$0 \le nums[i] < n$$
so direct indexing is safe and efficient.
Go slices are zero-initialized automatically, which simplifies setup.
Worked Examples
Example 1
nums = [1,2,1,2,1,3,3]
k = 2
We compute DP incrementally.
Initial state:
| i | dp[i] |
|---|---|
| 0 | 0 |
Processing i = 1
Subarray considered:
| Subarray | Trimmed Length | Cost | dp[1] |
|---|---|---|---|
| [1] | 0 | 2 | 2 |
So:
dp[1] = 2
Processing i = 2
| Subarray | Trimmed Length | Cost | Candidate |
|---|---|---|---|
| [2] | 0 | 2 | dp[1] + 2 = 4 |
| [1,2] | 0 | 2 | dp[0] + 2 = 2 |
Result:
dp[2] = 2
Processing i = 3
| Subarray | Trimmed Length | Cost | Candidate |
|---|---|---|---|
| [1] | 0 | 2 | 4 |
| [2,1] | 0 | 2 | 4 |
| [1,2,1] | 2 | 4 | 4 |
Result:
dp[3] = 4
Processing i = 7
Eventually the optimal partition becomes:
[1,2] + [1,2,1,3,3]
Costs:
| Subarray | Trimmed | Importance |
|---|---|---|
| [1,2] | [] | 2 |
| [1,2,1,3,3] | [1,1,3,3] | 6 |
Total:
2 + 6 = 8
Final answer:
8
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | $O(n^2)$ | Every pair (j, i) is processed once |
| Space | $O(n)$ | DP array plus frequency array |
The outer loop runs n times, and the inner loop also runs at most n times. Each iteration performs only constant-time work because frequency updates and trimmed-length updates are incremental.
The DP array requires O(n) space. The frequency array also requires O(n) space because values are bounded by n.
Test Cases
from typing import List
class Solution:
def minCost(self, nums: List[int], k: int) -> int:
n = len(nums)
dp = [float('inf')] * (n + 1)
dp[0] = 0
for i in range(1, n + 1):
freq = [0] * n
trimmed_length = 0
for j in range(i - 1, -1, -1):
value = nums[j]
freq[value] += 1
if freq[value] == 2:
trimmed_length += 2
elif freq[value] > 2:
trimmed_length += 1
dp[i] = min(dp[i], dp[j] + k + trimmed_length)
return dp[n]
sol = Solution()
assert sol.minCost([1,2,1,2,1,3,3], 2) == 8 # provided example 1
assert sol.minCost([1,2,1,2,1], 2) == 6 # provided example 2
assert sol.minCost([1,2,1,2,1], 5) == 10 # provided example 3
assert sol.minCost([1], 10) == 10 # single element
assert sol.minCost([1,2,3,4], 1) == 1 # all unique, one segment best
assert sol.minCost([1,1,1,1], 2) == 6 # repeated values
assert sol.minCost([0,0,0], 1) == 4 # duplicates only
assert sol.minCost([1,2,3,1,2,3], 2) == 8 # multiple repeated groups
assert sol.minCost([1,2,3,4,5], 100) == 100 # huge k discourages splitting
assert sol.minCost([1,1,2,2,3,3], 1) == 7 # several duplicate pairs
| Test | Why |
|---|---|
[1,2,1,2,1,3,3], k=2 |
Validates provided example |
[1,2,1,2,1], k=2 |
Checks optimal splitting |
[1,2,1,2,1], k=5 |
Large fixed partition cost |
[1], k=10 |
Single-element edge case |
[1,2,3,4], k=1 |
All unique elements |
[1,1,1,1], k=2 |
Heavy duplication |
[0,0,0], k=1 |
Small repeated array |
[1,2,3,1,2,3], k=2 |
Multiple duplicate groups |
[1,2,3,4,5], k=100 |
Very large k |
[1,1,2,2,3,3], k=1 |
Multiple duplicate pairs |
Edge Cases
One important edge case occurs when all elements are unique. In this situation, every subarray has trimmed length zero because no value appears more than once. A buggy implementation might incorrectly count frequencies or accidentally include unique elements in the trimmed length. The current implementation handles this correctly because values only contribute when their frequency reaches at least two.
Another important case is when every element is identical. For example, [1,1,1,1] causes the trimmed length to grow rapidly. Incorrect implementations often mishandle the transition from frequency 1 to frequency 2. The algorithm explicitly adds 2 when a value reaches frequency 2, ensuring both copies are counted correctly.
Large k values are also critical. When k is huge, splitting becomes very expensive, so the optimal answer often uses a single subarray even if duplicates exist. Greedy strategies that always try to reduce duplicates would fail here. The DP solution naturally evaluates all partition possibilities and correctly balances split cost against duplicate penalties.
A final subtle edge case involves arrays of length one. Since the only subarray contains a single unique value, the trimmed length is zero and the answer is exactly k. The implementation handles this automatically because the frequency never exceeds one.