LeetCode 805 - Split Array With Same Average

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Math, Dynamic Programming, Bit Manipulation, Bitmask

Solution

Problem Understanding

The problem asks whether we can divide the given array nums into two non-empty groups such that both groups have the same average.

Suppose the array is split into subsets A and B. Let:

sum(A) be the sum of elements in A
len(A) be the number of elements in A
sum(B) and len(B) be defined similarly

The condition is:

$$\frac{sum(A)}{len(A)} = \frac{sum(B)}{len(B)}$$

Because A and B together contain every element of the original array, we also know:

$$sum(A) + sum(B) = totalSum$$

and

$$len(A) + len(B) = n$$

Using these relationships, the condition can be rewritten as:

$$\frac{sum(A)}{len(A)} = \frac{totalSum}{n}$$

This is the key mathematical simplification. Instead of trying to construct two subsets with equal averages directly, we only need to determine whether there exists a non-empty subset whose average equals the average of the entire array.

If such a subset exists, then the remaining elements automatically form another subset with the same average.

The input size is small enough to allow exponential-style techniques with optimization:

1 <= nums.length <= 30
0 <= nums[i] <= 10^4

A length of 30 is too large for brute force enumeration of all subsets, since there are:

$$2^{30} \approx 10^9$$

possible subsets.

However, 30 is small enough for meet-in-the-middle strategies, subset DP, or bitmask-based optimizations.

Several important edge cases appear immediately:

Arrays with only two elements, where the split possibilities are extremely limited.
Arrays where no subset size can produce an integer target sum.
Arrays with repeated values.
Arrays with all identical values, where any non-empty split works.
Arrays where the total sum is not divisible appropriately for any subset size.

The implementation must carefully avoid floating point precision issues. Average comparisons should always be transformed into integer equations.

Approaches

Brute Force Approach

The most direct solution is to enumerate every possible subset of the array.

For each subset:

Compute its sum and size.
Compute the complement subset's sum and size.
Compare averages.

A subset is valid if:

$$sum(subset) \times (n - size) = (totalSum - sum(subset)) \times size$$

This guarantees exact integer arithmetic without floating point errors.

This approach is correct because every possible partition is examined. If a valid split exists, brute force eventually finds it.

The problem is that the number of subsets is exponential:

$$2^n$$

With n = 30, this becomes roughly one billion subsets, which is far too slow.

Optimal Approach, Dynamic Programming on Subset Sizes and Sums

The crucial observation is that equal averages imply:

$$\frac{subsetSum}{subsetSize} = \frac{totalSum}{n}$$

Rearranging gives:

$$subsetSum = \frac{totalSum \times subsetSize}{n}$$

For a subset of size k, the required sum is therefore fixed:

$$target = \frac{totalSum \times k}{n}$$

This leads to an important pruning rule:

A subset of size k is only possible if:

$$(totalSum \times k) \bmod n = 0$$

Otherwise, the target sum is fractional and impossible.

The problem now becomes:

Can we find a subset of size k whose sum equals the required target?

We solve this using dynamic programming.

Define:

dp[k] as the set of all sums achievable using exactly k elements.

Initially:

dp[0] = {0}

Then for each number, update the DP in reverse order of subset size to avoid reusing elements multiple times.

If at any point the required target sum appears in dp[k], we return true.

This dramatically reduces the search space and works efficiently within the constraints.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(2^n * n)	O(1)	Enumerates every subset explicitly
Optimal DP	O(n^2 * S) in practice	O(n * S)	Tracks achievable sums by subset size

Here, S represents the range of reachable subset sums.

Algorithm Walkthrough

Compute the total sum of the array and its length.

Let:

n = len(nums)
totalSum = sum(nums)

These values are used throughout the algorithm. 2. Determine which subset sizes are mathematically possible.

For every subset size k from 1 to n // 2, compute:

$$totalSum \times k$$

If this value is divisible by n, then a subset of size k could potentially have the same average as the full array.

Otherwise, skip it entirely. 3. Initialize the dynamic programming structure.

Create:

dp[k]

where dp[k] stores all sums achievable using exactly k elements.

Start with:

dp[0] = {0}

because a subset of size zero has sum zero. 4. Process each number in the array.

For every number num, update subset sizes in reverse order:

for k from maxSize down to 1

Reverse iteration is essential because it prevents the same element from being used multiple times during a single update pass. 5. Generate new sums.

For every existing sum prevSum in dp[k - 1], add:

prevSum + num

into dp[k].

This represents choosing the current element. 6. Check for valid target sums.

For every valid subset size k, compute:

$$target = \frac{totalSum \times k}{n}$$

If target exists in dp[k], return true. 7. If no valid subset is found after processing all numbers, return false.

Why it works

The DP maintains a complete record of every sum achievable with exactly k elements.

For any subset size k, the equal-average condition uniquely determines the required sum:

$$subsetSum = \frac{totalSum \times k}{n}$$

If this target sum appears in dp[k], then there exists a subset whose average equals the overall array average. The remaining elements automatically form the complementary subset with the same average.

Because every achievable (subsetSize, subsetSum) combination is explored exactly once, the algorithm is correct.

Python Solution

from typing import List

class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        n = len(nums)
        total_sum = sum(nums)

        possible = False

        for k in range(1, n // 2 + 1):
            if (total_sum * k) % n == 0:
                possible = True
                break

        if not possible:
            return False

        dp = [set() for _ in range(n + 1)]
        dp[0].add(0)

        for num in nums:
            for size in range(n // 2, 0, -1):
                for prev_sum in list(dp[size - 1]):
                    dp[size].add(prev_sum + num)

        for k in range(1, n // 2 + 1):
            if (total_sum * k) % n != 0:
                continue

            target = (total_sum * k) // n

            if target in dp[k]:
                return True

        return False

The implementation begins by computing the array length and total sum. Before building the DP table, it performs a mathematical feasibility check. If no subset size can produce an integer target sum, the answer is immediately False.

The DP array is then initialized. Each index represents a subset size, and the corresponding set stores all reachable sums for that size.

The algorithm processes each number one at a time. Reverse iteration over subset sizes is critical because it prevents the current number from being counted multiple times in the same iteration.

Finally, the algorithm checks whether any valid subset size contains its required target sum. If so, a valid split exists.

Go Solution

package main

func splitArraySameAverage(nums []int) bool {
	n := len(nums)

	totalSum := 0
	for _, num := range nums {
		totalSum += num
	}

	possible := false

	for k := 1; k <= n/2; k++ {
		if (totalSum*k)%n == 0 {
			possible = true
			break
		}
	}

	if !possible {
		return false
	}

	dp := make([]map[int]bool, n+1)

	for i := 0; i <= n; i++ {
		dp[i] = make(map[int]bool)
	}

	dp[0][0] = true

	for _, num := range nums {
		for size := n / 2; size >= 1; size-- {
			current := make([]int, 0)

			for prevSum := range dp[size-1] {
				current = append(current, prevSum+num)
			}

			for _, newSum := range current {
				dp[size][newSum] = true
			}
		}
	}

	for k := 1; k <= n/2; k++ {
		if (totalSum*k)%n != 0 {
			continue
		}

		target := (totalSum * k) / n

		if dp[k][target] {
			return true
		}
	}

	return false
}

The Go implementation follows the same logic as the Python version, but Go does not provide a built-in set type. Instead, maps of type map[int]bool are used to represent sets of sums.

Another important difference is that Go maps should not be modified while iterating over them in a way that depends on newly inserted values. To avoid accidental reuse of elements during updates, intermediate sums are collected into a temporary slice before insertion.

Integer overflow is not a concern here because the constraints are small enough for standard Go integers.

Worked Examples

Example 1

nums = [1,2,3,4,5,6,7,8]

Step 1: Initial Values

Variable	Value
n	8
totalSum	36
average	4.5

Step 2: Check Valid Subset Sizes

We test subset sizes from 1 to 4.

k	totalSum * k	Divisible by n?	Target Sum
1	36	No	-
2	72	Yes	9
3	108	No	-
4	144	Yes	18

So we only need to find:

2 elements summing to 9
or 4 elements summing to 18

Step 3: DP Construction

Initially:

Subset Size	Reachable Sums
0	{0}

After processing 1:

Subset Size	Reachable Sums
1	{1}

After processing 2:

Subset Size	Reachable Sums
1	{1,2}
2	{3}

After processing 3:

Subset Size	Reachable Sums
1	{1,2,3}
2	{3,4,5}
3	{6}

Continue processing numbers.

Eventually:

Subset Size	Contains Target?
2	Yes, sum 9 exists
4	Yes, sum 18 exists

For example:

[1,8]

has:

sum = 9
size = 2
average = 4.5

Thus the algorithm returns True.

Example 2

nums = [3,1]

Step 1: Initial Values

Variable	Value
n	2
totalSum	4

Step 2: Check Valid Subset Sizes

Only k = 1 is possible.

k	totalSum * k	Divisible by n?
1	4	Yes

Target sum:

$$4 / 2 = 2$$

We need one element equal to 2.

Step 3: DP

Reachable sums of size 1:

{3,1}

Target 2 does not exist.

Return False.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n^2 * S)	Each number updates subset sums for multiple subset sizes
Space	O(n * S)	DP stores reachable sums for each subset size

Here, S is the number of distinct reachable sums.

The practical performance is much better than brute force because many subset sums overlap and are merged inside the DP sets. With n <= 30, this solution comfortably fits within LeetCode limits.

Test Cases

from typing import List

class Solution:
    def splitArraySameAverage(self, nums: List[int]) -> bool:
        n = len(nums)
        total_sum = sum(nums)

        possible = False

        for k in range(1, n // 2 + 1):
            if (total_sum * k) % n == 0:
                possible = True
                break

        if not possible:
            return False

        dp = [set() for _ in range(n + 1)]
        dp[0].add(0)

        for num in nums:
            for size in range(n // 2, 0, -1):
                for prev_sum in list(dp[size - 1]):
                    dp[size].add(prev_sum + num)

        for k in range(1, n // 2 + 1):
            if (total_sum * k) % n != 0:
                continue

            target = (total_sum * k) // n

            if target in dp[k]:
                return True

        return False

sol = Solution()

assert sol.splitArraySameAverage([1,2,3,4,5,6,7,8]) == True  # provided example
assert sol.splitArraySameAverage([3,1]) == False  # provided example

assert sol.splitArraySameAverage([5,5,5,5]) == True  # identical values
assert sol.splitArraySameAverage([0,0,0,0]) == True  # all zeros
assert sol.splitArraySameAverage([1,1]) == True  # equal pair

assert sol.splitArraySameAverage([1,2]) == False  # impossible small case
assert sol.splitArraySameAverage([2,4,6,8]) == False  # no valid subset

assert sol.splitArraySameAverage([3,3,3,3,6,6,6,6]) == True  # repeated values
assert sol.splitArraySameAverage([18,10,5,3]) == False  # uneven distribution

assert sol.splitArraySameAverage([2,0,5,6,16,12,15,6,8,6,2,1,2,10,11,4,15,6]) == True  # larger stress case

Test Summary

Test	Why
`[1,2,3,4,5,6,7,8]`	Standard positive example
`[3,1]`	Standard negative example
`[5,5,5,5]`	All identical values
`[0,0,0,0]`	Zero-sum edge case
`[1,1]`	Small valid split
`[1,2]`	Small impossible split
`[2,4,6,8]`	No matching subset average
`[3,3,3,3,6,6,6,6]`	Repeated values
`[18,10,5,3]`	Uneven sums
Large mixed array	Stress test for DP

Edge Cases

One important edge case occurs when the array length is extremely small, especially n = 2. In such cases, there are very few possible partitions. A naive implementation might incorrectly assume every two-element array can be split evenly. The algorithm correctly handles this by checking whether a subset of size one can achieve the required target sum.

Another important case is arrays where no subset size can mathematically produce an integer target sum. For example, if:

$$(totalSum \times k) \bmod n \neq 0$$

then the required subset sum would not be an integer. Many inefficient solutions still attempt DP exploration in these cases. This implementation performs an early pruning step and immediately returns False if no valid subset size exists.

Arrays with repeated values can also create subtle bugs in subset generation logic. If updates are performed in forward order, the same element may accidentally be reused multiple times in one iteration. The implementation avoids this issue by iterating subset sizes in reverse order, ensuring each number contributes at most once per subset construction.

Finally, arrays containing all identical values are special because every subset has the same average. Some implementations accidentally reject these because they incorrectly forbid certain subset sizes or mishandle complement subsets. This solution naturally handles them correctly because the required target sums are always reachable.