LeetCode 301 - Remove Invalid Parentheses

The problem asks us to remove the minimum number of invalid parentheses from a string so that the remaining string becomes valid. The input string may contain lowercase English letters in addition to parentheses.

LeetCode Problem 301

Difficulty: 🔴 Hard
Topics: String, Backtracking, Breadth-First Search

Solution

Problem Understanding

The problem asks us to remove the minimum number of invalid parentheses from a string so that the remaining string becomes valid. The input string may contain lowercase English letters in addition to parentheses. Letters do not affect validity and should always remain in the resulting strings unless they are removed as part of a broader deletion process. The goal is to return every unique valid string that can be produced with the fewest possible removals.

A valid parentheses string follows the standard matching rules. Every opening parenthesis ( must eventually be matched by a closing parenthesis ), and at no point during a left-to-right scan can the number of closing parentheses exceed the number of opening parentheses. Letters can appear anywhere and are ignored when checking validity.

The key detail in this problem is that we are not asked to return just one valid string. We must return all unique valid strings that can be formed after removing the minimum number of parentheses. This means two things matter simultaneously:

  • The resulting strings must be valid.
  • The number of removals must be as small as possible.

The constraints are relatively small. The string length is at most 25, and there are at most 20 parentheses. These limits strongly suggest that exponential exploration is acceptable if pruning is done carefully. A brute-force solution that generates every possible subsequence would still be too large in the worst case, but a carefully controlled search using backtracking or breadth-first search becomes practical.

Several edge cases are important:

  • Strings that are already valid, such as "()()", should return the original string unchanged.
  • Strings containing only invalid parentheses, such as ")(", should return [""].
  • Duplicate results must be avoided because different removal paths can produce the same string.
  • Strings containing letters mixed with parentheses require preserving letters correctly while validating only parentheses.
  • Inputs with many repeated parentheses can create huge duplicate search trees if pruning is not handled carefully.

Approaches

Brute Force Approach

The brute-force strategy is to generate every possible subsequence of the string by deciding for each character whether to keep it or remove it. After generating a candidate string, we check whether it forms a valid parentheses sequence. Among all valid candidates, we keep only those with the maximum length, because maximizing length is equivalent to minimizing removals.

This approach is correct because it explores every possible way to remove characters. Since every valid result appears somewhere in the search space, the algorithm eventually finds all minimum-removal solutions.

The problem is that the number of subsequences grows exponentially. For a string of length n, there are 2^n possible subsequences. Even with n = 25, this becomes too expensive. In addition, validating every generated string requires another linear scan, making the total complexity extremely high.

Optimal Approach

A much better solution uses Breadth-First Search, abbreviated as BFS.

The key observation is that BFS naturally explores states level by level. Each level corresponds to removing exactly one more character than the previous level. This property is extremely valuable because the first time we encounter valid strings, we know they were produced using the minimum number of removals.

Instead of generating every subsequence, we:

  • Start with the original string.
  • Generate neighboring states by removing one parenthesis at each position.
  • Use BFS to explore strings with fewer and fewer parentheses.
  • Stop exploring deeper levels once valid strings are found.

This guarantees minimal removals because BFS processes all states with k removals before considering states with k + 1 removals.

A hash set is used to avoid revisiting duplicate strings.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n × n) O(2^n × n) Generates every subsequence and validates each one
Optimal BFS O(2^n × n) worst case O(2^n × n) Much faster in practice because BFS stops at minimum-removal level

Although the worst-case complexity remains exponential, BFS dramatically reduces unnecessary exploration because it terminates as soon as the minimum-removal solutions are discovered.

Algorithm Walkthrough

BFS Strategy

  1. Start by placing the original string into a queue.

The queue represents states waiting to be processed. Each state is a candidate string formed after some removals. 2. Create a hash set called visited.

Different removal paths can produce the same string. Without deduplication, the search tree would explode exponentially. The visited set ensures every string is processed only once. 3. Define a helper function to validate a string.

The validation logic scans left to right while maintaining a counter:

  • Increment for (.
  • Decrement for ).
  • If the counter ever becomes negative, the string is invalid.
  • At the end, the counter must be zero.
  1. Process strings level by level using BFS.

Every level corresponds to removing one additional character. This is the critical property that guarantees minimal removals. 5. For each string removed from the queue, check whether it is valid.

If it is valid:

  • Add it to the answer list.
  • Mark that a valid level has been found.
  1. Once a valid string is found at the current level, do not generate deeper states.

Since BFS explores by removal count, any deeper level would require more removals and therefore cannot be optimal. 7. If the current string is not valid and no valid string has yet been found at this level, generate neighbors.

Neighbors are created by removing one parenthesis character at every possible position. 8. Skip non-parenthesis characters during removal generation.

Removing letters is never beneficial because letters do not affect validity. 9. Add unseen neighbor strings into the queue and mark them visited.

This continues the BFS exploration efficiently without redundant processing. 10. Return all collected valid strings.

Because BFS stops at the first valid level, every returned string uses the minimum number of removals.

Why it works

The correctness comes from the level-order nature of BFS. Every BFS level corresponds to strings produced after the same number of removals. Therefore, the first level containing valid strings must represent the minimum number of removals required. Since every possible removal combination at that level is explored, all valid minimum-removal strings are found.

Python Solution

from collections import deque
from typing import List

class Solution:
    def removeInvalidParentheses(self, s: str) -> List[str]:
        
        def is_valid(string: str) -> bool:
            balance = 0
            
            for char in string:
                if char == '(':
                    balance += 1
                elif char == ')':
                    balance -= 1
                    
                    if balance < 0:
                        return False
            
            return balance == 0
        
        result = []
        visited = set([s])
        queue = deque([s])
        found = False
        
        while queue:
            current = queue.popleft()
            
            if is_valid(current):
                result.append(current)
                found = True
            
            if found:
                continue
            
            for i in range(len(current)):
                if current[i] not in '()':
                    continue
                
                next_string = current[:i] + current[i + 1:]
                
                if next_string not in visited:
                    visited.add(next_string)
                    queue.append(next_string)
        
        return result

The implementation begins with the is_valid helper function. This function checks whether a string has balanced parentheses using a simple counter. Every opening parenthesis increases the counter, and every closing parenthesis decreases it. If the counter ever becomes negative, a closing parenthesis appeared before a matching opening one, making the string invalid immediately.

The BFS portion uses a queue initialized with the original input string. The visited set prevents duplicate processing.

During each BFS iteration, the current string is validated. If it is valid, it is added to the result list and the found flag becomes True. Once this happens, no deeper levels are explored because deeper levels would involve additional removals.

If no valid string has been found yet, the algorithm generates all neighboring states by removing one parenthesis at each possible position. Letters are skipped because removing them never helps create valid parentheses.

The final result contains every unique valid string requiring the minimum number of removals.

Go Solution

package main

func removeInvalidParentheses(s string) []string {
    
    isValid := func(str string) bool {
        balance := 0
        
        for _, ch := range str {
            if ch == '(' {
                balance++
            } else if ch == ')' {
                balance--
                
                if balance < 0 {
                    return false
                }
            }
        }
        
        return balance == 0
    }
    
    result := []string{}
    visited := make(map[string]bool)
    queue := []string{s}
    
    visited[s] = true
    found := false
    
    for len(queue) > 0 {
        current := queue[0]
        queue = queue[1:]
        
        if isValid(current) {
            result = append(result, current)
            found = true
        }
        
        if found {
            continue
        }
        
        for i := 0; i < len(current); i++ {
            if current[i] != '(' && current[i] != ')' {
                continue
            }
            
            nextString := current[:i] + current[i+1:]
            
            if !visited[nextString] {
                visited[nextString] = true
                queue = append(queue, nextString)
            }
        }
    }
    
    return result
}

The Go implementation follows the same BFS structure as the Python version. Since Go does not have a built-in deque structure like Python's collections.deque, a slice is used as the queue.

The visited structure is implemented using a map from string to boolean. String slicing is used to generate neighbor states after removing one character.

Go strings are immutable, so creating nextString produces a new string safely without modifying the original.

Worked Examples

Example 1

Input:

s = "()())()"

BFS Level 0

Current String Valid?
()())() No

Generate all states with one removal:

Removed Index New String
0 )())()
1 (())()
2 ()))()
3 ()()()
4 ()()()
5 ()()))
6 ()())(

BFS Level 1

Check each generated string:

String Valid?
)())() No
(())() Yes
()))() No
()()() Yes
()())) No
()())( No

The first valid strings appear at this level, so BFS stops.

Output:

["(())()", "()()()"]

Example 2

Input:

s = "(a)())()"

BFS Level 0

Current String Valid?
(a)())() No

Generate neighbors by removing one parenthesis:

New String
a)())()
(a())()
(a))()
(a)()()
(a)()))
(a)())(

BFS Level 1

String Valid?
a)())() No
(a())() Yes
(a))() No
(a)()() Yes
(a)())) No
(a)())( No

Output:

["(a())()", "(a)()()"]

Example 3

Input:

s = ")("

BFS Level 0

Current String Valid?
)( No

Generate neighbors:

New String
(
)

BFS Level 1

Both are invalid.

Generate next level:

New String
""

BFS Level 2

String Valid?
"" Yes

Output:

[""]

Complexity Analysis

Measure Complexity Explanation
Time O(2^n × n) In the worst case, many substrings are explored and each validity check is O(n)
Space O(2^n × n) Queue and visited set may store exponentially many strings

The algorithm is exponential in the worst case because the number of possible removal combinations grows exponentially. However, BFS significantly improves practical performance because it terminates immediately after discovering the minimum-removal level, avoiding unnecessary deeper exploration.

Test Cases

sol = Solution()

# Basic example with two valid answers
assert sorted(sol.removeInvalidParentheses("()())()")) == sorted(["(())()", "()()()"])

# Example with letters included
assert sorted(sol.removeInvalidParentheses("(a)())()")) == sorted(["(a())()", "(a)()()"])

# Completely invalid parentheses
assert sorted(sol.removeInvalidParentheses(")(")) == sorted([""])

# Already valid string
assert sorted(sol.removeInvalidParentheses("()()")) == sorted(["()()"])

# Empty result after removals
assert sorted(sol.removeInvalidParentheses("(((")) == sorted([""])

# No parentheses at all
assert sorted(sol.removeInvalidParentheses("abc")) == sorted(["abc"])

# Single valid pair
assert sorted(sol.removeInvalidParentheses("()")) == sorted(["()"])

# Mixed valid and invalid structure
assert sorted(sol.removeInvalidParentheses("(()")) == sorted(["()"])

# Duplicate removal paths
assert sorted(sol.removeInvalidParentheses("())")) == sorted(["()"])

# Multiple nested possibilities
assert sorted(sol.removeInvalidParentheses("((())")) == sorted(["(())"])

# Complex mixed case
assert sorted(sol.removeInvalidParentheses("(a(b)c)d)")) == sorted(["(a(b)c)d"])

# All closing parentheses
assert sorted(sol.removeInvalidParentheses(")))")) == sorted([""])

Test Summary

Test Why
"()())()" Verifies multiple minimum-removal solutions
"(a)())()" Ensures letters are preserved correctly
")(" Tests completely invalid ordering
"()()" Ensures already valid strings remain unchanged
"(((" Tests removal of all parentheses
"abc" Verifies handling of strings without parentheses
"()" Smallest non-trivial valid case
"(()" Tests unmatched opening parenthesis
"())" Tests unmatched closing parenthesis
"((())" Tests nested imbalance
"(a(b)c)d)" Complex mixed structure with letters
")))" All closing parentheses edge case

Edge Cases

One important edge case is a string that is already valid. A naive implementation might continue removing parentheses unnecessarily and produce shorter valid strings. The BFS solution avoids this because it checks the original string first. If the original string is valid, BFS immediately returns it and never explores deeper levels.

Another important edge case is a string containing only invalid parentheses, such as ")(" or "(((". These cases often expose bugs in validation logic because there may be no valid non-empty solution. The implementation correctly handles this by continuing BFS until the empty string is reached, which is always valid.

A third critical edge case involves duplicate generation paths. For example, the string "())" can produce the same result "()" by removing either of the two closing parentheses. Without a visited set, the algorithm would process duplicate states repeatedly, causing unnecessary work and potentially duplicate outputs. The hash set guarantees that each generated string is explored exactly once.

Another subtle edge case is strings containing letters intermixed with parentheses. Since letters do not affect validity, removing them is unnecessary and wasteful. The implementation skips non-parenthesis characters when generating neighbors, which improves efficiency while preserving correctness.