LeetCode 1898 - Maximum Number of Removable Characters

The problem gives us two strings, s and p, and a list of indices removable. The string p is guaranteed to be a subsequence of s, meaning all characters in p appear in s in the same order, though not necessarily consecutively.

LeetCode Problem 1898

Difficulty: 🟡 Medium
Topics: Array, Two Pointers, String, Binary Search

Solution

Problem Understanding

The problem gives us two strings, s and p, and a list of indices removable. The string p is guaranteed to be a subsequence of s, meaning all characters in p appear in s in the same order, though not necessarily consecutively. The array removable represents positions in s that we are allowed to remove in sequence. Our task is to find the largest number k such that if we remove the first k characters from s as specified by removable, p remains a subsequence of the modified s.

In other words, we are looking for the maximum number of removable characters that we can delete from s without breaking the order of p. The constraints tell us that the strings can be quite long (up to 100,000 characters), so any solution must be efficient and ideally faster than O(n * m) for repeated checks, where n and m are the lengths of s and p.

Important edge cases include situations where removing even a single character breaks the subsequence, where removable is empty, and where all characters in removable can be removed safely.

Approaches

A brute-force approach would try all possible values of k from 0 up to the length of removable. For each k, it would remove the first k indices from s and check if p is still a subsequence. While correct, this approach is inefficient because for each k it might scan the string s entirely to check the subsequence. This results in a worst-case complexity of O(n * r) where r is the length of removable.

The key observation for an optimal solution is that the property "p is a subsequence of s after removing the first k indices" is monotonic. If p is a subsequence after removing k characters, it will also be a subsequence for any smaller number of removals. Conversely, if p is not a subsequence after removing k characters, it will fail for any larger k. This monotonicity allows us to apply binary search on k. For each candidate k, we check if p remains a subsequence efficiently using a boolean array to mark removed characters and a two-pointer scan.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * r) O(n) Try every possible k and check subsequence
Optimal O(n * log r) O(n) Binary search on k with efficient subsequence check

Algorithm Walkthrough

  1. Initialize a binary search range with left = 0 and right = len(removable). We are searching for the maximum k in this range.
  2. For each candidate mid = (left + right) // 2, mark the first mid indices in removable as removed using a boolean array of length len(s).
  3. Use a two-pointer approach to check if p is still a subsequence of s after these removals. Pointer i scans s while pointer j scans p. When s[i] matches p[j] and i is not removed, increment j.
  4. If j reaches len(p), p is still a subsequence. Update left = mid + 1 to try more removals.
  5. If j does not reach len(p), p is no longer a subsequence. Update right = mid - 1.
  6. Repeat until left > right. The maximum valid k is right.

Why it works: Binary search leverages the monotonic property of subsequence validity with respect to removals. The two-pointer check ensures that each subsequence verification is linear in s, making the approach efficient.

Python Solution

from typing import List

class Solution:
    def maximumRemovals(self, s: str, p: str, removable: List[int]) -> int:
        def is_subsequence(k: int) -> bool:
            removed = [False] * len(s)
            for i in range(k):
                removed[removable[i]] = True
            
            j = 0
            for i in range(len(s)):
                if not removed[i] and s[i] == p[j]:
                    j += 1
                    if j == len(p):
                        return True
            return j == len(p)
        
        left, right = 0, len(removable)
        max_k = 0
        while left <= right:
            mid = (left + right) // 2
            if is_subsequence(mid):
                max_k = mid
                left = mid + 1
            else:
                right = mid - 1
        return max_k

This implementation defines a helper is_subsequence(k) to check if p remains a subsequence after removing the first k indices. The binary search adjusts left and right based on this check, ensuring we find the maximum valid k.

Go Solution

func maximumRemovals(s string, p string, removable []int) int {
    isSubsequence := func(k int) bool {
        removed := make([]bool, len(s))
        for i := 0; i < k; i++ {
            removed[removable[i]] = true
        }
        j := 0
        for i := 0; i < len(s); i++ {
            if !removed[i] && s[i] == p[j] {
                j++
                if j == len(p) {
                    return true
                }
            }
        }
        return j == len(p)
    }
    
    left, right := 0, len(removable)
    maxK := 0
    for left <= right {
        mid := (left + right) / 2
        if isSubsequence(mid) {
            maxK = mid
            left = mid + 1
        } else {
            right = mid - 1
        }
    }
    return maxK
}

The Go implementation mirrors the Python logic. The main differences are explicit slice creation for removed and integer division for computing mid.

Worked Examples

Example 1: s = "abcacb", p = "ab", removable = [3,1,0]

k Removed indices s after removal Is subsequence?
0 [] abcacb Yes
1 [3] abccb Yes
2 [3,1] accb Yes
3 [3,1,0] ccb No

Maximum k = 2.

Example 2: s = "abcbddddd", p = "abcd", removable = [3,2,1,4,5,6]

Binary search checks will determine k = 1 as the maximum.

Example 3: s = "abcab", p = "abc", removable = [0,1,2,3,4]

Even k = 1 fails, so maximum k = 0.

Complexity Analysis

Measure Complexity Explanation
Time O(n * log r) Binary search on removable length r, each check scans s of length n
Space O(n) Boolean array to mark removed characters

The logarithmic factor comes from binary search over the possible number of removals, while each subsequence check is linear in the length of s.

Test Cases

# Provided examples
assert Solution().maximumRemovals("abcacb", "ab", [3,1,0]) == 2  # Example 1
assert Solution().maximumRemovals("abcbddddd", "abcd", [3,2,1,4,5,6]) == 1  # Example 2
assert Solution().maximumRemovals("abcab", "abc", [0,1,2,3,4]) == 0  # Example 3

# Edge cases
assert Solution().maximumRemovals("a", "a", []) == 0  # empty removable
assert Solution().maximumRemovals("a", "a", [0]) == 1  # single character can be removed
assert Solution().maximumRemovals("abcde", "ace", [1,3]) == 2  # remove middle characters safely
assert Solution().maximumRemovals("abcde", "ace", [0,2,4]) == 0  # removing characters in subsequence fails immediately
Test Why
Example 1 Checks standard case with multiple removals
Example 2 Tests that only partial removals are valid
Example 3 Tests case where no removals are allowed
Single character cases Ensure algorithm handles minimal input
Removing middle vs subsequence characters Verify subsequence check works correctly

Edge Cases

  1. Empty removable array: If removable is empty, the maximum k must be 0. The algorithm handles this automatically because the binary search range starts from 0.

2