LeetCode 3801 - Minimum Cost to Merge Sorted Lists

Difficulty: 🔴 Hard
Topics: Array, Two Pointers, Binary Search, Dynamic Programming, Bit Manipulation

Solution

Problem Understanding

We are given between 2 and 12 sorted lists. At any moment we may choose any two existing lists and merge them.

If the two lists are $a$ and $b$, the merge cost is

$$\operatorname{len}(a)+\operatorname{len}(b)+ \left|\operatorname{median}(a)-\operatorname{median}(b)\right|.$$

After the merge, the two lists disappear and are replaced by their sorted union. The process continues until only one list remains.

The key observation is that although the order of merges changes the total cost, the contents of the final merged list are always the same. Therefore the problem is really asking:

How should we build a binary merge tree over the original lists so that the sum of all merge costs is minimized?

The constraints are very revealing:

The number of lists is at most $12$.
The total number of elements across all lists is at most $2000$.

A solution exponential in the number of lists is feasible, because

$$2^{12}=4096.$$

However, anything exponential in the total number of elements would be impossible.

An important subtlety is that the median of a merged list depends on all elements contained in that merged list. Therefore we cannot treat a merged group only by its size. We must also know the median of the union represented by that group.

Edge cases include:

Only two lists, where exactly one merge is possible.
Lists whose medians are equal, making the absolute-difference term zero.
Lists with very different lengths.
Duplicate values and negative values.
Merge orders where an expensive median difference can be avoided by first forming intermediate groups.

Approaches

Brute Force

A brute-force solution tries every possible sequence of merges.

For $n$ lists, the first merge can choose any pair, the next merge chooses a pair among the remaining groups, and so on. The number of possible merge trees grows super-exponentially.

The approach is correct because it explicitly evaluates every legal merge order and returns the minimum cost.

Its running time is far too large even for $n=12$.

Key Insight

Every merged group corresponds to a subset of the original lists.

For any subset $S$:

Its total length is fixed.
Its merged sorted contents are fixed.
Its median is fixed.

Therefore a subset can be treated as a single state.

Suppose subset $S$ is split into two non-empty disjoint subsets $A$ and $B$.

If we optimally build $A$ and optimally build $B$, then the final merge contributes

$$\operatorname{size}(A) + \operatorname{size}(B) + |\operatorname{med}(A)-\operatorname{med}(B)|.$$

This immediately gives a subset DP.

Let

$$dp[S]$$

be the minimum cost required to merge all original lists in subset $S$ into one list.

Then

$$dp[S] = \min_{A \subset S} \Bigl( dp[A] + dp[S\setminus A] + \operatorname{size}(A) + \operatorname{size}(S\setminus A) + |\operatorname{med}(A)-\operatorname{med}(S\setminus A)| \Bigr).$$

The only remaining task is computing the size and median of every subset.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	Super-exponential	Large	Enumerates merge orders
Optimal DP over subsets	$O(3^n + N2^n)$	$O(N2^n)$	$n \le 12$, $N \le 2000$

Here $N$ denotes the total number of elements across all lists.

Algorithm Walkthrough

Step 1: Precompute every subset's merged contents

Let $n$ be the number of lists.

Represent a subset by a bitmask.

For every mask:

Store its total length.
Store its fully merged sorted array.

Use the least significant set bit technique.

$$mask = prev \cup {i},$$

then the sorted contents of mask are obtained by merging:

the sorted contents of prev,
the original list $i$.

This is a standard two-pointer merge.

Step 2: Compute subset medians

For every non-empty mask:

Let $m$ be the subset length.
The median is the element at index

$$\left\lfloor \frac{m-1}{2} \right\rfloor.$$

This matches the problem's definition because even-length lists use the left middle element.

Step 3: Initialize DP

If a subset contains exactly one original list, no merges are needed.

Therefore

$$dp[mask]=0.$$

Step 4: Enumerate partitions

For every subset containing at least two original lists:

Enumerate all proper non-empty submasks $A$.

Let

$$B = mask \setminus A.$$

To avoid processing both $(A,B)$ and $(B,A)$, keep only one orientation, for example

$$A < B.$$

Step 5: Apply the transition

Compute

$$dp[A] + dp[B] + size[A] + size[B] + |median[A]-median[B]|.$$

Take the minimum over all valid partitions.

Step 6: Return the answer

The desired subset is the mask containing all original lists.

Why it works

Every valid merge process forms a binary tree whose leaves are the original lists. Consider the final merge. It joins two disjoint groups $A$ and $B$, whose union is the current subset $S$.

The costs incurred inside $A$ and $B$ are independent of each other and are exactly $dp[A]$ and $dp[B]$. The final merge contributes

$$size[A]+size[B]+|median[A]-median[B]|.$$

Therefore every merge tree corresponds to one DP transition, and every DP transition corresponds to a valid final merge of some merge tree. By induction on the number of lists in a subset, the recurrence computes the optimal cost for every subset. Hence the answer for the full mask is optimal.

Python Solution

from typing import List

class Solution:
    def minMergeCost(self, lists: List[List[int]]) -> int:
        n = len(lists)
        total_masks = 1 << n

        merged = [None] * total_masks
        size = [0] * total_masks
        median = [0] * total_masks

        merged[0] = []

        for mask in range(1, total_masks):
            lsb = mask & -mask
            bit_index = lsb.bit_length() - 1
            prev = mask ^ lsb

            a = merged[prev]
            b = lists[bit_index]

            merged_list = []
            i = j = 0

            while i < len(a) and j < len(b):
                if a[i] <= b[j]:
                    merged_list.append(a[i])
                    i += 1
                else:
                    merged_list.append(b[j])
                    j += 1

            if i < len(a):
                merged_list.extend(a[i:])
            if j < len(b):
                merged_list.extend(b[j:])

            merged[mask] = merged_list

            m = len(merged_list)
            size[mask] = m
            median[mask] = merged_list[(m - 1) // 2]

        INF = 10**18
        dp = [INF] * total_masks

        for i in range(n):
            dp[1 << i] = 0

        for mask in range(1, total_masks):
            if mask & (mask - 1) == 0:
                continue

            sub = (mask - 1) & mask

            while sub:
                other = mask ^ sub

                if sub < other:
                    cost = (
                        dp[sub]
                        + dp[other]
                        + size[sub]
                        + size[other]
                        + abs(median[sub] - median[other])
                    )

                    if cost < dp[mask]:
                        dp[mask] = cost

                sub = (sub - 1) & mask

        return dp[total_masks - 1]

The implementation first builds the merged sorted contents for every subset. Because each subset's contents are uniquely determined, its size and median can be computed once and reused throughout the DP.

The DP array stores the minimum cost for each subset. Single-list subsets require no merges and therefore have cost zero. For larger subsets, every proper partition is examined, and the recurrence described above is applied.

The final answer is the DP value of the full mask.

Go Solution

package main

import (
	"math"
)

func minMergeCost(lists [][]int) int64 {
	n := len(lists)
	totalMasks := 1 << n

	merged := make([][]int, totalMasks)
	size := make([]int, totalMasks)
	median := make([]int, totalMasks)

	merged[0] = []int{}

	for mask := 1; mask < totalMasks; mask++ {
		lsb := mask & -mask

		bitIndex := 0
		for (1 << bitIndex) != lsb {
			bitIndex++
		}

		prev := mask ^ lsb

		a := merged[prev]
		b := lists[bitIndex]

		cur := make([]int, 0, len(a)+len(b))

		i, j := 0, 0
		for i < len(a) && j < len(b) {
			if a[i] <= b[j] {
				cur = append(cur, a[i])
				i++
			} else {
				cur = append(cur, b[j])
				j++
			}
		}

		cur = append(cur, a[i:]...)
		cur = append(cur, b[j:]...)

		merged[mask] = cur

		m := len(cur)
		size[mask] = m
		median[mask] = cur[(m-1)/2]
	}

	dp := make([]int64, totalMasks)
	for i := range dp {
		dp[i] = math.MaxInt64
	}

	for i := 0; i < n; i++ {
		dp[1<<i] = 0
	}

	for mask := 1; mask < totalMasks; mask++ {
		if mask&(mask-1) == 0 {
			continue
		}

		for sub := (mask - 1) & mask; sub > 0; sub = (sub - 1) & mask {
			other := mask ^ sub

			if sub < other {
				diff := median[sub] - median[other]
				if diff < 0 {
					diff = -diff
				}

				cost := dp[sub] +
					dp[other] +
					int64(size[sub]+size[other]+diff)

				if cost < dp[mask] {
					dp[mask] = cost
				}
			}
		}
	}

	return dp[totalMasks-1]
}

The Go version follows the same structure as the Python version. The DP values are stored as int64 to avoid overflow concerns. Slice merging is implemented explicitly using the standard two-pointer technique.

Worked Examples

Example 1

Input:

$$[[1,3,5],[2,4],[6,7,8]]$$

Subset statistics:

Subset	Size	Median
A	3	3
B	2	2
C	3	7
AB	5	3
AC	6	5
BC	5	4
ABC	8	4

Pair costs:

Merge	Cost
A+B	(3+2+
A+C	(3+3+
B+C	(2+3+

DP values:

Subset	DP
A	0
B	0
C	0
AB	6
AC	10
BC	10

For $ABC$:

$$AB + C = 6 + 0 + 5 + 3 + |3-7| = 18$$

$$AC + B = 10 + 0 + 6 + 2 + |5-2| = 21$$

$$BC + A = 10 + 0 + 5 + 3 + |4-3| = 19$$

Minimum:

$$\boxed{18}$$

Example 2

Input:

$$[[1,1,5],[1,4,7,8]]$$

Sizes:

$$3,;4$$

Medians:

$$1,;4$$

Cost:

$$3+4+|1-4| = 10$$

Answer:

$$\boxed{10}$$

Example 3

Input:

$$[[1],[3]]$$

Cost:

$$1+1+|1-3| = 4$$

Answer:

$$\boxed{4}$$

Example 4

Input:

$$[[1],[1]]$$

Cost:

$$1+1+|1-1| = 2$$

Answer:

$$\boxed{2}$$

Complexity Analysis

Measure	Complexity	Explanation
Time	$O(N2^n + 3^n)$	Subset preprocessing plus subset-partition DP
Space	$O(N2^n)$	Stores merged contents for every subset

The preprocessing cost is proportional to the total amount of data stored across all subsets. Each original element appears in exactly $2^{n-1}$ subset representations, giving $O(N2^n)$ total storage and construction work. The DP uses the standard subset-partition enumeration whose complexity is $O(3^n)$.

Test Cases

s = Solution()

assert s.minMergeCost([[1, 3, 5], [2, 4], [6, 7, 8]]) == 18  # example 1
assert s.minMergeCost([[1, 1, 5], [1, 4, 7, 8]]) == 10       # example 2
assert s.minMergeCost([[1], [3]]) == 4                       # example 3
assert s.minMergeCost([[1], [1]]) == 2                       # example 4

assert s.minMergeCost([[0], [0], [0]]) == 4                 # equal medians
assert s.minMergeCost([[-5], [5]]) == 12                    # negative values
assert s.minMergeCost([[1, 2], [3, 4]]) == 6                # even lengths
assert s.minMergeCost([[1], [100], [101]]) == 204           # large gaps
assert s.minMergeCost([[1], [2], [3], [4]]) >= 0            # multiple lists
assert s.minMergeCost([[5, 5, 5], [5], [5, 5]]) == 12       # duplicates

Test	Why
`[[1,3,5],[2,4],[6,7,8]]`	Official example
`[[1,1,5],[1,4,7,8]]`	Official example
`[[1],[3]]`	Smallest non-trivial input
`[[1],[1]]`	Zero median difference
`[[0],[0],[0]]`	All medians equal
`[[-5],[5]]`	Negative values
`[[1,2],[3,4]]`	Even-length median rule
`[[1],[100],[101]]`	Large median gaps
`[[1],[2],[3],[4]]`	Multiple merge levels
`[[5,5,5],[5],[5,5]]`	Duplicate-heavy input

Edge Cases

A particularly important case occurs when many lists have the same median. In that situation the absolute-difference term becomes zero for many merges. An implementation that recomputes medians incorrectly after merges can easily lose these zero-cost opportunities. The subset preprocessing guarantees that every median is computed exactly from the merged contents of that subset.

Even-length lists require special care. The problem defines the median as the left middle element, not the average of the two middle elements. The implementation uses index

$$(m-1)/2$$

which exactly matches this definition.

Negative values can also be problematic because median differences are passed through an absolute value. The implementation stores medians as integers and always computes

$$|\operatorname{med}(A)-\operatorname{med}(B)|$$

directly, so negative numbers are handled correctly.

Finally, several different merge orders may produce the same subset. A greedy strategy can easily miss the optimum. The subset DP considers every possible final partition of every subset, guaranteeing that the globally optimal merge tree is found.