LeetCode 1717 - Maximum Score From Removing Substrings

The problem gives us a string s and two scoring rules: - Removing the substring "ab" earns x points. - Removing the substring "ba" earns y points. We may perform these removals as many times as possible, in any order we choose.

LeetCode Problem 1717

Difficulty: 🟡 Medium
Topics: String, Stack, Greedy

Solution

Problem Understanding

The problem gives us a string s and two scoring rules:

  • Removing the substring "ab" earns x points.
  • Removing the substring "ba" earns y points.

We may perform these removals as many times as possible, in any order we choose. Every time we remove a substring, the remaining characters close together, which may create new removable substrings. The goal is to maximize the total score.

The important detail is that the order of removals matters. Removing one substring can prevent another removal later, or create a new opportunity. Because of this, a greedy strategy is required instead of simply counting occurrences.

For example, consider the string "aba":

  • If we remove "ab" first, we gain x points and are left with "a".
  • If we remove "ba" first, we gain y points and are left with "a".

We can only perform one removal, so we should choose whichever gives the larger score.

The input size can be as large as 10^5, which immediately tells us that expensive backtracking or recursive exploration of all possibilities will not work. We need a solution that is close to linear time.

Another important observation is that only the characters 'a' and 'b' matter for scoring. Any other character acts like a separator because substrings cannot cross over it. For example:

"aabxbaa"

The character 'x' splits the problem into two independent sections:

"aab" and "baa"

This greatly simplifies the structure of the problem.

Several edge cases are important:

  • Strings with no "ab" or "ba" substrings should return 0.
  • Strings containing only 'a' or only 'b' cannot produce any score.
  • When x > y, removing "ab" first is optimal.
  • When y > x, removing "ba" first is optimal.
  • Large alternating patterns such as "abababab" can produce many overlapping choices and will break naive greedy implementations that do not process removals carefully.

Approaches

Brute Force Approach

A brute force solution would try every possible removal sequence.

At each step:

  1. Search the current string for all "ab" and "ba" substrings.
  2. Remove one possibility.
  3. Recursively continue on the smaller string.
  4. Track the maximum score among all choices.

This approach is correct because it explores every valid sequence of operations. Eventually, it finds the optimal score.

However, the number of possible removal orders grows exponentially. Every removal changes the string structure, creating many branching possibilities. Even for moderate string lengths, the number of states becomes enormous.

Since the string length can reach 10^5, this approach is completely infeasible.

Key Insight

The critical observation is that we should always prioritize the higher scoring substring first.

Suppose:

  • x > y

This means removing "ab" is more valuable than removing "ba".

If we remove a lower value pair first, we may destroy an opportunity to remove a higher value pair later. Therefore, greedily taking the higher value pair whenever possible is optimal.

For example:

"abab"

If x = 10 and y = 1:

  • Removing "ba" first gives only 1.
  • Removing "ab" first allows two "ab" removals for a total of 20.

This leads to the optimal strategy:

  1. Remove all occurrences of the higher value substring first.
  2. Then remove all occurrences of the lower value substring from the remaining string.

A stack is ideal for this because substring removals depend on adjacent characters that change dynamically after deletions.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Tries every possible removal order
Optimal O(n) O(n) Greedily removes higher value substring first using stacks

Algorithm Walkthrough

Step 1: Decide Which Substring Has Higher Priority

Compare x and y.

  • If x >= y, remove "ab" first.
  • Otherwise, remove "ba" first.

This ensures we greedily maximize high value removals before lower value ones can interfere.

Step 2: Create a Helper Function for Stack-Based Removal

We use a stack to simulate removals efficiently.

For each character in the string:

  • If the top of the stack and the current character form the target substring, pop the stack and add points.
  • Otherwise, push the character onto the stack.

This works because adjacent characters after removals naturally become neighbors in the stack representation.

For example, removing "ab" from "cabxbae":

Character Stack Before Action Stack After
c [] push [c]
a [c] push [c,a]
b [c,a] remove "ab" [c]
x [c] push [c,x]

Step 3: Remove All High Value Substrings

Run the helper function once on the original string.

This pass:

  • Maximizes high value removals.
  • Produces a leftover string after all possible greedy deletions.

Step 4: Remove All Low Value Substrings

Run the helper function again on the remaining string using the second substring pattern.

At this point, all high priority opportunities are already exhausted, so removing lower value pairs cannot hurt optimality.

Step 5: Return the Total Score

Add the scores from both passes and return the result.

Why it works

The key invariant is that removing a higher scoring pair earlier can never reduce the number of higher scoring removals available later, while removing a lower scoring pair first may destroy a higher scoring opportunity.

Since each removal only affects local adjacency, greedily consuming the more valuable pattern first guarantees maximum profit. The stack correctly models dynamic adjacency after deletions, ensuring every valid removal is detected exactly once.

Python Solution

class Solution:
    def maximumGain(self, s: str, x: int, y: int) -> int:
        def remove_substring(text: str, first: str, second: str, points: int):
            stack = []
            score = 0

            for char in text:
                if stack and stack[-1] == first and char == second:
                    stack.pop()
                    score += points
                else:
                    stack.append(char)

            return "".join(stack), score

        total_score = 0

        if x >= y:
            remaining, score = remove_substring(s, 'a', 'b', x)
            total_score += score

            remaining, score = remove_substring(remaining, 'b', 'a', y)
            total_score += score
        else:
            remaining, score = remove_substring(s, 'b', 'a', y)
            total_score += score

            remaining, score = remove_substring(remaining, 'a', 'b', x)
            total_score += score

        return total_score

The implementation uses a helper function called remove_substring. This function performs one complete greedy pass for a specific substring pattern.

The stack stores characters that have not yet been matched into removable pairs. Whenever the current character and the stack top form the target substring, the stack top is removed and points are added.

The helper function returns two things:

  • The remaining string after all removals.
  • The total score gained from that pass.

The main function first determines which substring has higher value. It processes that pattern first, then processes the lower value pattern on the leftover string.

This exactly follows the greedy strategy proven optimal earlier.

Go Solution

func maximumGain(s string, x int, y int) int {
	removeSubstring := func(text string, first byte, second byte, points int) (string, int) {
		stack := make([]byte, 0)
		score := 0

		for i := 0; i < len(text); i++ {
			char := text[i]

			if len(stack) > 0 && stack[len(stack)-1] == first && char == second {
				stack = stack[:len(stack)-1]
				score += points
			} else {
				stack = append(stack, char)
			}
		}

		return string(stack), score
	}

	totalScore := 0

	if x >= y {
		remaining, score := removeSubstring(s, 'a', 'b', x)
		totalScore += score

		_, score = removeSubstring(remaining, 'b', 'a', y)
		totalScore += score
	} else {
		remaining, score := removeSubstring(s, 'b', 'a', y)
		totalScore += score

		_, score = removeSubstring(remaining, 'a', 'b', x)
		totalScore += score
	}

	return totalScore
}

The Go solution follows the exact same logic as the Python version. The primary difference is that slices of bytes are used instead of Python lists.

The stack is implemented using a []byte slice. Removing the top element is done with slicing:

stack = stack[:len(stack)-1]

Strings in Go are immutable, so the remaining characters are converted back into a string using:

string(stack)

Integer overflow is not a concern because the maximum possible score is well within Go's int range.

Worked Examples

Example 1

Input:

s = "cdbcbbaaabab"
x = 4
y = 5

Since y > x, we remove "ba" first.

First Pass, Remove "ba"

Character Stack Before Action Stack After Score
c [] push [c] 0
d [c] push [c,d] 0
b [c,d] push [c,d,b] 0
c [c,d,b] push [c,d,b,c] 0
b [c,d,b,c] push [c,d,b,c,b] 0
b [c,d,b,c,b] push [c,d,b,c,b,b] 0
a [c,d,b,c,b,b] remove "ba" [c,d,b,c,b] 5
a [c,d,b,c,b] remove "ba" [c,d,b,c] 10
a [c,d,b,c] push [c,d,b,c,a] 10
b [c,d,b,c,a] push [c,d,b,c,a,b] 10
a [c,d,b,c,a,b] remove "ba" [c,d,b,c,a] 15
b [c,d,b,c,a] push [c,d,b,c,a,b] 15

Remaining string:

"cdbcab"

Score so far:

15

Second Pass, Remove "ab"

Character Stack Before Action Stack After Score
c [] push [c] 0
d [c] push [c,d] 0
b [c,d] push [c,d,b] 0
c [c,d,b] push [c,d,b,c] 0
a [c,d,b,c] push [c,d,b,c,a] 0
b [c,d,b,c,a] remove "ab" [c,d,b,c] 4

Final score:

15 + 4 = 19

Example 2

Input:

s = "aabbaaxybbaabb"
x = 5
y = 4

Since x > y, remove "ab" first.

First Pass, Remove "ab"

Character Stack Before Action Stack After Score
a [] push [a] 0
a [a] push [a,a] 0
b [a,a] remove "ab" [a] 5
b [a] remove "ab" [] 10
a [] push [a] 10
a [a] push [a,a] 10
x [a,a] push [a,a,x] 10
b [a,a,x] push [a,a,x,b] 10
b [a,a,x,b] push [a,a,x,b,b] 10
a [a,a,x,b,b] push [a,a,x,b,b,a] 10
a [a,a,x,b,b,a] push [a,a,x,b,b,a,a] 10
b [a,a,x,b,b,a,a] remove "ab" [a,a,x,b,b,a] 15
b [a,a,x,b,b,a] remove "ab" [a,a,x,b,b] 20

Final answer:

20

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is pushed and popped at most once in each pass
Space O(n) The stack may store the entire string in the worst case

The algorithm performs at most two linear scans of the string. During each scan, every character enters and leaves the stack at most once. This guarantees linear time complexity.

The auxiliary space comes from the stack used to simulate removals. In the worst case, no substrings are removed, so the stack may contain the entire string.

Test Cases

solution = Solution()

assert solution.maximumGain("cdbcbbaaabab", 4, 5) == 19
# Example 1, removing "ba" first is optimal

assert solution.maximumGain("aabbaaxybbaabb", 5, 4) == 20
# Example 2, removing "ab" first is optimal

assert solution.maximumGain("aaaa", 3, 4) == 0
# No removable substrings

assert solution.maximumGain("bbbb", 3, 4) == 0
# Only one character type

assert solution.maximumGain("ab", 10, 1) == 10
# Single "ab" removal

assert solution.maximumGain("ba", 1, 10) == 10
# Single "ba" removal

assert solution.maximumGain("aba", 10, 1) == 10
# Greedy choice matters

assert solution.maximumGain("aba", 1, 10) == 10
# Opposite greedy direction

assert solution.maximumGain("abab", 5, 4) == 10
# Multiple overlapping removals

assert solution.maximumGain("baba", 4, 5) == 10
# Prefer higher value "ba"

assert solution.maximumGain("abcde", 10, 10) == 10
# Only one removable pair

assert solution.maximumGain("", 1, 1) == 0
# Empty string edge case

assert solution.maximumGain("xxyyzz", 5, 5) == 0
# No relevant characters

assert solution.maximumGain("abba", 5, 4) == 9
# Both patterns can be removed

assert solution.maximumGain("abababab", 3, 2) == 12
# Long alternating pattern

Test Summary

Test Why
"cdbcbbaaabab" Validates mixed removals and greedy ordering
"aabbaaxybbaabb" Validates opposite greedy priority
"aaaa" Ensures no removals occur
"bbbb" Ensures single character strings work
"ab" Smallest valid "ab" case
"ba" Smallest valid "ba" case
"aba" with different scores Confirms greedy strategy correctness
"abab" Tests overlapping removals
"baba" Tests repeated "ba" removals
"abcde" Handles unrelated characters correctly
"" Empty input boundary case
"xxyyzz" No scoring substrings present
"abba" Both removal types available
"abababab" Stress test for repeated removals

Edge Cases

One important edge case occurs when the string contains no removable substrings at all. For example:

"ccccddd"

A buggy implementation might still attempt stack operations incorrectly or assume at least one removal exists. The provided solution safely processes every character and simply returns 0 when no matching adjacent pairs are found.

Another critical edge case involves overlapping choices such as:

"aba"

Both "ab" and "ba" appear, but only one removal can actually happen. A naive implementation that greedily removes whichever appears first in the string could produce a suboptimal answer. The algorithm avoids this by globally prioritizing the higher scoring substring before processing the lower scoring one.

A third tricky case involves separator characters:

"abxba"

The 'x' prevents interactions between the left and right sections. Incorrect implementations may accidentally allow removals across separators after string reconstruction. The stack-based method naturally preserves boundaries because characters are processed sequentially and only adjacent stack characters can form removable pairs.

Finally, long alternating strings such as:

"abababababab"

can generate many overlapping removal opportunities. Recursive or repeated string slicing implementations may become extremely slow due to repeated string copying. The stack approach guarantees linear performance because every character is pushed and popped at most once per pass.