LeetCode 2696 - Minimum String Length After Removing Substrings

The problem asks us to determine the minimum possible length of a string after repeatedly removing the substrings "AB" or "CD". We are given a string s consisting solely of uppercase English letters.

LeetCode Problem 2696

Difficulty: 🟢 Easy
Topics: String, Stack, Simulation

Solution

Problem Understanding

The problem asks us to determine the minimum possible length of a string after repeatedly removing the substrings "AB" or "CD". We are given a string s consisting solely of uppercase English letters. Each operation allows us to remove any occurrence of "AB" or "CD" from the string, and after removal, the remaining parts of the string concatenate. This concatenation can produce new occurrences of "AB" or "CD", which may then also be removed. The process continues until no more of these substrings exist in the string.

The input s is guaranteed to have a length between 1 and 100, which is small enough for algorithms that are at least O(n²) in the worst case. The output is a single integer representing the minimum length of the string after all possible removals.

Key points to note include that the removal of substrings is dynamic: removing one pair may create another removable pair. Edge cases involve strings that contain no removable substrings, strings that consist entirely of consecutive removable patterns, and strings where removable patterns overlap.

Approaches

The brute-force approach involves recursively removing all occurrences of "AB" or "CD" at each step and computing the minimum length from all possible sequences of removals. This guarantees correctness, but it is inefficient. The recursion can explore an exponential number of possibilities because each substring removal can create new removable substrings in multiple ways. For a string of length n, the time complexity can be approximated as O(2ⁿ), which is infeasible even for n = 100.

The optimal approach uses a stack-based simulation. The idea is to process the string character by character, maintaining a stack of characters that represents the "current state" of the string after removals. For each new character, we check whether it forms "AB" or "CD" with the top of the stack. If it does, we pop the top element (removing the substring). Otherwise, we push the new character onto the stack. At the end, the size of the stack gives the minimum possible length. This works efficiently because each character is pushed and popped at most once.

Approach Time Complexity Space Complexity Notes
Brute Force O(2ⁿ) O(n) Recursively remove all occurrences of "AB" or "CD" and track min length
Stack Simulation O(n) O(n) Process characters sequentially with a stack, removing substrings in one pass

Algorithm Walkthrough

  1. Initialize an empty stack to keep track of the characters in the processed string.
  2. Iterate over each character c in the input string s.
  3. For each character, check if the stack is non-empty and whether the top of the stack together with c forms a removable substring: "AB" if top is 'A' and c is 'B', or "CD" if top is 'C' and c is 'D'.
  4. If a removable substring is found, pop the top character from the stack. This simulates removing the substring.
  5. If no removable substring is found, push the current character c onto the stack.
  6. Continue until all characters in s are processed.
  7. The size of the stack at the end is the minimum possible length of the string after all removals.

Why it works: This algorithm works because it greedily removes any removable pair as soon as it appears, which is sufficient for finding the minimum length. Since each operation reduces the string length by 2, and the stack tracks all characters that cannot currently form a removable substring, no removable pair is missed.

Python Solution

class Solution:
    def minLength(self, s: str) -> int:
        stack = []
        for c in s:
            if stack and ((stack[-1] == 'A' and c == 'B') or (stack[-1] == 'C' and c == 'D')):
                stack.pop()
            else:
                stack.append(c)
        return len(stack)

The Python implementation iterates through each character in the string. It uses a stack to simulate the removal of "AB" and "CD" pairs. When a removable substring is detected, the stack's top element is popped, effectively removing both characters. If no pair is detected, the character is appended to the stack. At the end, len(stack) gives the minimum remaining length.

Go Solution

func minLength(s string) int {
    stack := []rune{}
    for _, c := range s {
        if len(stack) > 0 && ((stack[len(stack)-1] == 'A' && c == 'B') || (stack[len(stack)-1] == 'C' && c == 'D')) {
            stack = stack[:len(stack)-1]
        } else {
            stack = append(stack, c)
        }
    }
    return len(stack)
}

In Go, we use a slice of rune to represent the stack. Each character is iterated over with a range loop, and substring removal is simulated by slicing off the last element of the stack when a removable pair is found. The final length of the stack slice is returned.

Worked Examples

Example 1: s = "ABFCACDB"

Step Stack State Action
A [] push A → [A]
B [A] top=A, c=B → pop → []
F [] push F → [F]
C [F] push C → [F,C]
A [F,C] push A → [F,C,A]
C [F,C,A] push C → [F,C,A,C]
D [F,C,A,C] top=C, c=D → pop → [F,C,A]
B [F,C,A] push B → [F,C,A,B]

At the end, the stack length is 2 ([F,C]), matching the expected output.

Example 2: s = "ACBBD"

Step Stack State Action
A [] push A → [A]
C [A] push C → [A,C]
B [A,C] push B → [A,C,B]
B [A,C,B] push B → [A,C,B,B]
D [A,C,B,B] push D → [A,C,B,B,D]

No removable substrings exist, so final length is 5.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is pushed and popped at most once from the stack.
Space O(n) Stack can store all characters in the worst case.

The time complexity is linear since each character is processed exactly once. The space complexity is linear because in the worst case, no removable substrings exist, and all characters are stored in the stack.

Test Cases

# Provided examples
assert Solution().minLength("ABFCACDB") == 2  # example 1
assert Solution().minLength("ACBBD") == 5     # example 2

# Edge cases
assert Solution().minLength("ABABAB") == 0    # repeated removable patterns
assert Solution().minLength("CDCDCD") == 0    # repeated removable patterns
assert Solution().minLength("A") == 1         # single character, nothing removable
assert Solution().minLength("B") == 1         # single character, nothing removable
assert Solution().minLength("ACBD") == 4      # no removable substring initially
assert Solution().minLength("ABCDABCD") == 0  # interleaved patterns
Test Why
"ABFCACDB" Tests chain removals producing new substrings
"ACBBD" Tests string with no removable patterns
"ABABAB" Tests consecutive removable substrings
"CDCDCD" Tests consecutive removable substrings of a different type
"A", "B" Tests minimum length strings
"ACBD" Tests no removable substring possible
"ABCDABCD" Tests multiple interleaved removable substrings

Edge Cases

1. Strings with consecutive removable pairs: Strings like "ABABAB" or "CDCDCD" can completely collapse. A naive approach that removes only the first occurrence each time may require multiple iterations. The stack-based approach handles this in a single pass.

2. Strings with no removable substrings: Input strings such as "XYZ" or "ACBD" do not contain "AB" or "CD". The algorithm correctly leaves them unchanged, returning the original length.

3. Strings where removing one pair creates another: An input like "ACABD" contains "AB" only after processing previous characters. The stack ensures that as characters accumulate, new removable substrings are detected immediately.

This stack-based solution robustly handles all these edge cases, ensuring correctness and optimal efficiency.