LeetCode 1717 - Maximum Score From Removing Substrings
The problem gives us a string s and two scoring rules: - Removing the substring "ab" earns x points. - Removing the substring "ba" earns y points. We may perform these removals as many times as possible, in any order we choose.
Difficulty: 🟡 Medium
Topics: String, Stack, Greedy
Solution
Problem Understanding
The problem gives us a string s and two scoring rules:
- Removing the substring
"ab"earnsxpoints. - Removing the substring
"ba"earnsypoints.
We may perform these removals as many times as possible, in any order we choose. Every time we remove a substring, the remaining characters close together, which may create new removable substrings. The goal is to maximize the total score.
The important detail is that the order of removals matters. Removing one substring can prevent another removal later, or create a new opportunity. Because of this, a greedy strategy is required instead of simply counting occurrences.
For example, consider the string "aba":
- If we remove
"ab"first, we gainxpoints and are left with"a". - If we remove
"ba"first, we gainypoints and are left with"a".
We can only perform one removal, so we should choose whichever gives the larger score.
The input size can be as large as 10^5, which immediately tells us that expensive backtracking or recursive exploration of all possibilities will not work. We need a solution that is close to linear time.
Another important observation is that only the characters 'a' and 'b' matter for scoring. Any other character acts like a separator because substrings cannot cross over it. For example:
"aabxbaa"
The character 'x' splits the problem into two independent sections:
"aab" and "baa"
This greatly simplifies the structure of the problem.
Several edge cases are important:
- Strings with no
"ab"or"ba"substrings should return0. - Strings containing only
'a'or only'b'cannot produce any score. - When
x > y, removing"ab"first is optimal. - When
y > x, removing"ba"first is optimal. - Large alternating patterns such as
"abababab"can produce many overlapping choices and will break naive greedy implementations that do not process removals carefully.
Approaches
Brute Force Approach
A brute force solution would try every possible removal sequence.
At each step:
- Search the current string for all
"ab"and"ba"substrings. - Remove one possibility.
- Recursively continue on the smaller string.
- Track the maximum score among all choices.
This approach is correct because it explores every valid sequence of operations. Eventually, it finds the optimal score.
However, the number of possible removal orders grows exponentially. Every removal changes the string structure, creating many branching possibilities. Even for moderate string lengths, the number of states becomes enormous.
Since the string length can reach 10^5, this approach is completely infeasible.
Key Insight
The critical observation is that we should always prioritize the higher scoring substring first.
Suppose:
x > y
This means removing "ab" is more valuable than removing "ba".
If we remove a lower value pair first, we may destroy an opportunity to remove a higher value pair later. Therefore, greedily taking the higher value pair whenever possible is optimal.
For example:
"abab"
If x = 10 and y = 1:
- Removing
"ba"first gives only1. - Removing
"ab"first allows two"ab"removals for a total of20.
This leads to the optimal strategy:
- Remove all occurrences of the higher value substring first.
- Then remove all occurrences of the lower value substring from the remaining string.
A stack is ideal for this because substring removals depend on adjacent characters that change dynamically after deletions.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | Exponential | Exponential | Tries every possible removal order |
| Optimal | O(n) | O(n) | Greedily removes higher value substring first using stacks |
Algorithm Walkthrough
Step 1: Decide Which Substring Has Higher Priority
Compare x and y.
- If
x >= y, remove"ab"first. - Otherwise, remove
"ba"first.
This ensures we greedily maximize high value removals before lower value ones can interfere.
Step 2: Create a Helper Function for Stack-Based Removal
We use a stack to simulate removals efficiently.
For each character in the string:
- If the top of the stack and the current character form the target substring, pop the stack and add points.
- Otherwise, push the character onto the stack.
This works because adjacent characters after removals naturally become neighbors in the stack representation.
For example, removing "ab" from "cabxbae":
| Character | Stack Before | Action | Stack After |
|---|---|---|---|
| c | [] | push | [c] |
| a | [c] | push | [c,a] |
| b | [c,a] | remove "ab" | [c] |
| x | [c] | push | [c,x] |
Step 3: Remove All High Value Substrings
Run the helper function once on the original string.
This pass:
- Maximizes high value removals.
- Produces a leftover string after all possible greedy deletions.
Step 4: Remove All Low Value Substrings
Run the helper function again on the remaining string using the second substring pattern.
At this point, all high priority opportunities are already exhausted, so removing lower value pairs cannot hurt optimality.
Step 5: Return the Total Score
Add the scores from both passes and return the result.
Why it works
The key invariant is that removing a higher scoring pair earlier can never reduce the number of higher scoring removals available later, while removing a lower scoring pair first may destroy a higher scoring opportunity.
Since each removal only affects local adjacency, greedily consuming the more valuable pattern first guarantees maximum profit. The stack correctly models dynamic adjacency after deletions, ensuring every valid removal is detected exactly once.
Python Solution
class Solution:
def maximumGain(self, s: str, x: int, y: int) -> int:
def remove_substring(text: str, first: str, second: str, points: int):
stack = []
score = 0
for char in text:
if stack and stack[-1] == first and char == second:
stack.pop()
score += points
else:
stack.append(char)
return "".join(stack), score
total_score = 0
if x >= y:
remaining, score = remove_substring(s, 'a', 'b', x)
total_score += score
remaining, score = remove_substring(remaining, 'b', 'a', y)
total_score += score
else:
remaining, score = remove_substring(s, 'b', 'a', y)
total_score += score
remaining, score = remove_substring(remaining, 'a', 'b', x)
total_score += score
return total_score
The implementation uses a helper function called remove_substring. This function performs one complete greedy pass for a specific substring pattern.
The stack stores characters that have not yet been matched into removable pairs. Whenever the current character and the stack top form the target substring, the stack top is removed and points are added.
The helper function returns two things:
- The remaining string after all removals.
- The total score gained from that pass.
The main function first determines which substring has higher value. It processes that pattern first, then processes the lower value pattern on the leftover string.
This exactly follows the greedy strategy proven optimal earlier.
Go Solution
func maximumGain(s string, x int, y int) int {
removeSubstring := func(text string, first byte, second byte, points int) (string, int) {
stack := make([]byte, 0)
score := 0
for i := 0; i < len(text); i++ {
char := text[i]
if len(stack) > 0 && stack[len(stack)-1] == first && char == second {
stack = stack[:len(stack)-1]
score += points
} else {
stack = append(stack, char)
}
}
return string(stack), score
}
totalScore := 0
if x >= y {
remaining, score := removeSubstring(s, 'a', 'b', x)
totalScore += score
_, score = removeSubstring(remaining, 'b', 'a', y)
totalScore += score
} else {
remaining, score := removeSubstring(s, 'b', 'a', y)
totalScore += score
_, score = removeSubstring(remaining, 'a', 'b', x)
totalScore += score
}
return totalScore
}
The Go solution follows the exact same logic as the Python version. The primary difference is that slices of bytes are used instead of Python lists.
The stack is implemented using a []byte slice. Removing the top element is done with slicing:
stack = stack[:len(stack)-1]
Strings in Go are immutable, so the remaining characters are converted back into a string using:
string(stack)
Integer overflow is not a concern because the maximum possible score is well within Go's int range.
Worked Examples
Example 1
Input:
s = "cdbcbbaaabab"
x = 4
y = 5
Since y > x, we remove "ba" first.
First Pass, Remove "ba"
| Character | Stack Before | Action | Stack After | Score |
|---|---|---|---|---|
| c | [] | push | [c] | 0 |
| d | [c] | push | [c,d] | 0 |
| b | [c,d] | push | [c,d,b] | 0 |
| c | [c,d,b] | push | [c,d,b,c] | 0 |
| b | [c,d,b,c] | push | [c,d,b,c,b] | 0 |
| b | [c,d,b,c,b] | push | [c,d,b,c,b,b] | 0 |
| a | [c,d,b,c,b,b] | remove "ba" | [c,d,b,c,b] | 5 |
| a | [c,d,b,c,b] | remove "ba" | [c,d,b,c] | 10 |
| a | [c,d,b,c] | push | [c,d,b,c,a] | 10 |
| b | [c,d,b,c,a] | push | [c,d,b,c,a,b] | 10 |
| a | [c,d,b,c,a,b] | remove "ba" | [c,d,b,c,a] | 15 |
| b | [c,d,b,c,a] | push | [c,d,b,c,a,b] | 15 |
Remaining string:
"cdbcab"
Score so far:
15
Second Pass, Remove "ab"
| Character | Stack Before | Action | Stack After | Score |
|---|---|---|---|---|
| c | [] | push | [c] | 0 |
| d | [c] | push | [c,d] | 0 |
| b | [c,d] | push | [c,d,b] | 0 |
| c | [c,d,b] | push | [c,d,b,c] | 0 |
| a | [c,d,b,c] | push | [c,d,b,c,a] | 0 |
| b | [c,d,b,c,a] | remove "ab" | [c,d,b,c] | 4 |
Final score:
15 + 4 = 19
Example 2
Input:
s = "aabbaaxybbaabb"
x = 5
y = 4
Since x > y, remove "ab" first.
First Pass, Remove "ab"
| Character | Stack Before | Action | Stack After | Score |
|---|---|---|---|---|
| a | [] | push | [a] | 0 |
| a | [a] | push | [a,a] | 0 |
| b | [a,a] | remove "ab" | [a] | 5 |
| b | [a] | remove "ab" | [] | 10 |
| a | [] | push | [a] | 10 |
| a | [a] | push | [a,a] | 10 |
| x | [a,a] | push | [a,a,x] | 10 |
| b | [a,a,x] | push | [a,a,x,b] | 10 |
| b | [a,a,x,b] | push | [a,a,x,b,b] | 10 |
| a | [a,a,x,b,b] | push | [a,a,x,b,b,a] | 10 |
| a | [a,a,x,b,b,a] | push | [a,a,x,b,b,a,a] | 10 |
| b | [a,a,x,b,b,a,a] | remove "ab" | [a,a,x,b,b,a] | 15 |
| b | [a,a,x,b,b,a] | remove "ab" | [a,a,x,b,b] | 20 |
Final answer:
20
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character is pushed and popped at most once in each pass |
| Space | O(n) | The stack may store the entire string in the worst case |
The algorithm performs at most two linear scans of the string. During each scan, every character enters and leaves the stack at most once. This guarantees linear time complexity.
The auxiliary space comes from the stack used to simulate removals. In the worst case, no substrings are removed, so the stack may contain the entire string.
Test Cases
solution = Solution()
assert solution.maximumGain("cdbcbbaaabab", 4, 5) == 19
# Example 1, removing "ba" first is optimal
assert solution.maximumGain("aabbaaxybbaabb", 5, 4) == 20
# Example 2, removing "ab" first is optimal
assert solution.maximumGain("aaaa", 3, 4) == 0
# No removable substrings
assert solution.maximumGain("bbbb", 3, 4) == 0
# Only one character type
assert solution.maximumGain("ab", 10, 1) == 10
# Single "ab" removal
assert solution.maximumGain("ba", 1, 10) == 10
# Single "ba" removal
assert solution.maximumGain("aba", 10, 1) == 10
# Greedy choice matters
assert solution.maximumGain("aba", 1, 10) == 10
# Opposite greedy direction
assert solution.maximumGain("abab", 5, 4) == 10
# Multiple overlapping removals
assert solution.maximumGain("baba", 4, 5) == 10
# Prefer higher value "ba"
assert solution.maximumGain("abcde", 10, 10) == 10
# Only one removable pair
assert solution.maximumGain("", 1, 1) == 0
# Empty string edge case
assert solution.maximumGain("xxyyzz", 5, 5) == 0
# No relevant characters
assert solution.maximumGain("abba", 5, 4) == 9
# Both patterns can be removed
assert solution.maximumGain("abababab", 3, 2) == 12
# Long alternating pattern
Test Summary
| Test | Why |
|---|---|
"cdbcbbaaabab" |
Validates mixed removals and greedy ordering |
"aabbaaxybbaabb" |
Validates opposite greedy priority |
"aaaa" |
Ensures no removals occur |
"bbbb" |
Ensures single character strings work |
"ab" |
Smallest valid "ab" case |
"ba" |
Smallest valid "ba" case |
"aba" with different scores |
Confirms greedy strategy correctness |
"abab" |
Tests overlapping removals |
"baba" |
Tests repeated "ba" removals |
"abcde" |
Handles unrelated characters correctly |
"" |
Empty input boundary case |
"xxyyzz" |
No scoring substrings present |
"abba" |
Both removal types available |
"abababab" |
Stress test for repeated removals |
Edge Cases
One important edge case occurs when the string contains no removable substrings at all. For example:
"ccccddd"
A buggy implementation might still attempt stack operations incorrectly or assume at least one removal exists. The provided solution safely processes every character and simply returns 0 when no matching adjacent pairs are found.
Another critical edge case involves overlapping choices such as:
"aba"
Both "ab" and "ba" appear, but only one removal can actually happen. A naive implementation that greedily removes whichever appears first in the string could produce a suboptimal answer. The algorithm avoids this by globally prioritizing the higher scoring substring before processing the lower scoring one.
A third tricky case involves separator characters:
"abxba"
The 'x' prevents interactions between the left and right sections. Incorrect implementations may accidentally allow removals across separators after string reconstruction. The stack-based method naturally preserves boundaries because characters are processed sequentially and only adjacent stack characters can form removable pairs.
Finally, long alternating strings such as:
"abababababab"
can generate many overlapping removal opportunities. Recursive or repeated string slicing implementations may become extremely slow due to repeated string copying. The stack approach guarantees linear performance because every character is pushed and popped at most once per pass.