LeetCode 3900 - Longest Balanced Substring After One Swap

We are given a binary string s containing only '0' and '1'. A substring is considered balanced when it contains exactly the same number of zeros and ones. Before choosing the substring, we are allowed to perform at most one swap between any two positions in the entire string.

LeetCode Problem 3900

Difficulty: 🟡 Medium
Topics:

Solution

Problem Understanding

We are given a binary string s containing only '0' and '1'.

A substring is considered balanced when it contains exactly the same number of zeros and ones. Before choosing the substring, we are allowed to perform at most one swap between any two positions in the entire string. The swap is optional, so we may also choose not to swap anything.

The goal is to find the maximum possible length of a balanced substring that can exist after applying at most one swap.

A balanced substring must always have even length because it contains an equal number of zeros and ones. The input length can be as large as 10^5, which immediately rules out any algorithm that examines all substrings explicitly.

Several edge cases are important:

  • A string containing only one type of character, such as "111" or "00000", can never contain a non-empty balanced substring.
  • The optimal answer may come from a substring that is already balanced, meaning no swap is needed.
  • The optimal answer may require a swap involving a character outside the chosen substring.
  • Since only one swap is allowed, we cannot arbitrarily rearrange the string. The effect of a single swap is very limited and must be analyzed carefully.

Approaches

Brute Force

A direct approach would enumerate every possible swap, including the option of performing no swap. For each resulting string, we could enumerate every substring and check whether it is balanced.

There are O(n²) possible swaps and O(n²) possible substrings. Checking whether a substring is balanced takes additional time unless extensive preprocessing is used.

Even with optimizations, the overall complexity remains far beyond what is feasible for n = 10^5.

The brute force approach is correct because it explicitly tries every valid operation and every possible substring, but it is much too slow.

Key Observation

Let us encode:

  • '1' as +1
  • '0' as -1

Then a substring is balanced exactly when its sum is 0.

Now consider a substring whose current sum is d.

A single swap can only affect the balance of the substring if one swapped position lies inside the substring and the other lies outside. In that case:

  • Replacing an inside 0 with an outside 1 increases the sum by 2.
  • Replacing an inside 1 with an outside 0 decreases the sum by 2.

Therefore, one swap can only change the substring sum by ±2.

This means a substring can become balanced if and only if:

  • Its current sum is 0, or
  • Its current sum is 2 and we can decrease it by 2, or
  • Its current sum is -2 and we can increase it by 2.

No other sum can be fixed using a single swap.

The problem therefore reduces to finding the longest substring whose sum is:

  • 0
  • 2 with a valid corrective swap
  • -2 with a valid corrective swap

Using prefix sums and binary search, all of these can be found efficiently.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n⁴) or worse O(1) Enumerates swaps and substrings
Optimal O(n log n) O(n) Prefix sums plus binary search

Algorithm Walkthrough

  1. Convert the string conceptually into values where '1' = +1 and '0' = -1.
  2. Build a prefix sum array pref, where pref[i] represents the sum of the first i characters.
  3. Find the longest substring whose sum is 0.

For a substring (i, j], its sum equals:

pref[j] - pref[i]

Therefore the sum is zero when the two prefix sums are equal. For every prefix value, store its earliest occurrence and compute the maximum distance. 4. Count the total number of zeros and ones in the entire string. 5. Consider substrings whose sum is 2.

Such a substring has one extra '1' compared to '0'.

To fix it, we need an inside '1' and an outside '0'.

Let the substring length be L.

Since:

ones - zeros = 2

we obtain:

zeros = (L - 2) / 2

A corrective swap is possible only if at least one zero exists outside:

zeros < totalZeros

Rearranging:

L < 2 * totalZeros + 2

Thus we need the longest sum-2 substring whose length satisfies that bound. 6. Apply the same reasoning for substrings whose sum is -2.

A valid swap requires:

ones < totalOnes

which becomes:

L < 2 * totalOnes + 2 7. Store all indices for each prefix sum value. 8. For each position j, find an earlier index i such that:

pref[j] - pref[i] = K

where K is either 2 or -2.

Additionally, enforce the required length bound. 9. Use binary search on the stored prefix-sum positions to locate the earliest valid index that satisfies the bound and maximize the substring length. 10. Return the largest answer found.

Why it works

A single swap can only alter a substring's sum by ±2. Therefore every substring that can become balanced must currently have sum 0, 2, or -2. The additional outside-character requirement is captured exactly by the derived length bounds. The algorithm enumerates all such candidate substrings through prefix sums and computes the maximum valid length, so it always returns the optimal answer.

Python Solution

from bisect import bisect_right
from collections import defaultdict
from typing import List

class Solution:
    def longestBalanced(self, s: str) -> int:
        n = len(s)

        total_ones = s.count("1")
        total_zeros = n - total_ones

        prefix = [0]
        for ch in s:
            prefix.append(prefix[-1] + (1 if ch == "1" else -1))

        ans = 0

        earliest = {}
        for i, value in enumerate(prefix):
            if value not in earliest:
                earliest[value] = i
            ans = max(ans, i - earliest[value])

        positions = defaultdict(list)
        for i, value in enumerate(prefix):
            positions[value].append(i)

        def process(target_sum: int, length_bound: int) -> int:
            best = 0

            for j, value in enumerate(prefix):
                needed = value - target_sum

                if needed not in positions:
                    continue

                idx_list = positions[needed]

                pos = bisect_right(idx_list, j - length_bound)

                if pos < len(idx_list) and idx_list[pos] < j:
                    best = max(best, j - idx_list[pos])

            return best

        ans = max(ans, process(2, 2 * total_zeros + 2))
        ans = max(ans, process(-2, 2 * total_ones + 2))

        return ans

The implementation begins by building the prefix-sum array using +1 and -1 values. The longest already-balanced substring is found using the classic technique of storing the first occurrence of every prefix sum.

Next, all positions of every prefix sum value are stored in sorted lists. This allows binary searching for valid starting positions.

The helper function process() handles the sum-2 and sum--2 cases. For each endpoint, it finds the earliest starting position that both produces the desired substring sum and satisfies the derived length bound. The maximum valid length is tracked and returned.

Finally, the best result among the sum 0, 2, and -2 cases is returned.

Go Solution

package main

import "sort"

func longestBalanced(s string) int {
	n := len(s)

	totalOnes := 0
	for _, ch := range s {
		if ch == '1' {
			totalOnes++
		}
	}
	totalZeros := n - totalOnes

	prefix := make([]int, n+1)
	for i := 0; i < n; i++ {
		prefix[i+1] = prefix[i]
		if s[i] == '1' {
			prefix[i+1]++
		} else {
			prefix[i+1]--
		}
	}

	ans := 0

	earliest := make(map[int]int)
	for i, v := range prefix {
		if _, ok := earliest[v]; !ok {
			earliest[v] = i
		}
		if i-earliest[v] > ans {
			ans = i - earliest[v]
		}
	}

	positions := make(map[int][]int)
	for i, v := range prefix {
		positions[v] = append(positions[v], i)
	}

	process := func(targetSum int, lengthBound int) int {
		best := 0

		for j, v := range prefix {
			needed := v - targetSum

			list, ok := positions[needed]
			if !ok {
				continue
			}

			pos := sort.Search(len(list), func(i int) bool {
				return list[i] > j-lengthBound
			})

			if pos < len(list) && list[pos] < j {
				if j-list[pos] > best {
					best = j - list[pos]
				}
			}
		}

		return best
	}

	if v := process(2, 2*totalZeros+2); v > ans {
		ans = v
	}

	if v := process(-2, 2*totalOnes+2); v > ans {
		ans = v
	}

	return ans
}

The Go implementation follows exactly the same logic as the Python version. The primary difference is the use of sort.Search for binary search instead of bisect_right. All arithmetic comfortably fits within Go's int type because the string length is at most 10^5.

Worked Examples

Example 1

Input:

s = "100001"

Totals:

Quantity Value
Total Ones 2
Total Zeros 4

Encoded values:

1 0 0 0 0 1
+1 -1 -1 -1 -1 +1

Prefix sums:

Index Prefix
0 0
1 1
2 0
3 -1
4 -2
5 -3
6 -2

Longest sum-0 substring:

  • Prefix value 0 appears at indices 0 and 2
  • Length = 2

Answer so far:

2

Now check sum -2.

Length bound:

2 * totalOnes + 2
= 2 * 2 + 2
= 6

Substring "0001" has:

ones = 1
zeros = 3
sum = -2
length = 4

Since 4 < 6, a corrective swap exists.

After one swap:

100001 -> 101000

A balanced substring of length 4 becomes possible.

Final answer:

4

Example 2

Input:

s = "111"

Totals:

Quantity Value
Total Ones 3
Total Zeros 0

Prefix sums:

Index Prefix
0 0
1 1
2 2
3 3

No equal prefix sums exist, so there is no sum-0 substring.

There are no zeros anywhere in the string, so no swap can create a balanced substring.

Final answer:

0

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Each prefix position performs binary searches
Space O(n) Prefix sums and index lists are stored

The prefix array contains n + 1 entries. Every prefix position participates in a constant number of binary searches over stored index lists. Since each search takes O(log n) time, the overall complexity is O(n log n).

Test Cases

sol = Solution()

assert sol.longestBalanced("100001") == 4      # provided example
assert sol.longestBalanced("111") == 0         # all ones

assert sol.longestBalanced("0") == 0          # single character
assert sol.longestBalanced("1") == 0          # single character

assert sol.longestBalanced("01") == 2         # already balanced
assert sol.longestBalanced("10") == 2         # already balanced

assert sol.longestBalanced("0011") == 4       # whole string balanced
assert sol.longestBalanced("1100") == 4       # whole string balanced

assert sol.longestBalanced("111000") == 6     # balanced whole string
assert sol.longestBalanced("000111") == 6     # balanced whole string

assert sol.longestBalanced("00000") == 0      # all zeros
assert sol.longestBalanced("11111") == 0      # all ones

assert sol.longestBalanced("101") == 2        # small odd length
assert sol.longestBalanced("010") == 2        # small odd length

assert sol.longestBalanced("1001") == 4       # already balanced
assert sol.longestBalanced("11001") == 4      # requires checking swap case

assert sol.longestBalanced("101000") == 6     # whole string already balanced

Test Summary

Test Why
"100001" Official example requiring a swap
"111" Official example with no valid answer
"0" Smallest zero-only input
"1" Smallest one-only input
"01" Smallest balanced string
"10" Balanced in reverse order
"0011" Entire string balanced
"1100" Entire string balanced
"111000" Large balanced region
"000111" Symmetric balanced case
"00000" No ones available
"11111" No zeros available
"101" Odd length input
"010" Odd length input
"1001" Balanced whole string
"11001" Swap-related candidate
"101000" Full-length balanced answer

Edge Cases

All Characters Are Identical

Strings such as "11111" or "00000" contain only one type of character. A balanced substring requires both zeros and ones, so no non-empty balanced substring can exist. The algorithm handles this naturally because no valid sum-0, sum-2, or sum-(-2) substring satisfies the swap conditions.

The Best Substring Is Already Balanced

For inputs like "0011" or "111000", the optimal answer is the entire string and no swap is necessary. The longest sum-0 substring computation directly captures this case through repeated prefix sums.

A Swap Requires a Character Outside the Substring

A subtle bug source is assuming every sum-2 or sum-(-2) substring can be fixed. Consider a substring that already contains all zeros in the entire string. There is no outside zero available to swap in, so correction is impossible. The derived bounds L < 2 * totalZeros + 2 and L < 2 * totalOnes + 2 encode exactly this requirement and prevent invalid substrings from being counted.

Very Large Inputs

Since n can reach 100000, any quadratic solution will time out. The implementation stores prefix-sum positions and uses binary search, keeping the runtime at O(n log n) and making it efficient for the maximum input size.