LeetCode 722 - Remove Comments
The problem asks us to remove comments from a C++ program represented as an array of strings. Each string represents a line of the program, split by newline characters.
Difficulty: 🟡 Medium
Topics: Array, String
Solution
Problem Understanding
The problem asks us to remove comments from a C++ program represented as an array of strings. Each string represents a line of the program, split by newline characters. In C++, there are two types of comments: line comments starting with "//" and block comments enclosed between "/*" and "*/". Line comments terminate at the end of the line, while block comments may span multiple lines.
Our task is to process the input line by line and remove all comments. After removing comments, lines that are empty should not be included in the output. The output must retain the relative order of the remaining code lines.
The input constraints are manageable: the source code has at most 100 lines, each up to 80 characters. Every block comment is guaranteed to be closed, and there are no quotes in the input. These constraints imply that we do not need to handle malformed comments or embedded strings, which simplifies parsing. Key edge cases include comments that start and end in the middle of lines, multiple comments in a single line, and block comments spanning multiple lines. We also need to correctly handle implicit newline removals caused by block comments.
Approaches
A brute-force approach would be to convert the source code into a single string, search for all occurrences of "//" and "/*...*/", and remove them. After removing comments, we would split the string back into lines. This approach works, but it is cumbersome because handling multiline block comments and line breaks manually is error-prone. Additionally, string concatenation and repeated searches could make the solution less efficient, though the input size is small.
The optimal approach uses a line-by-line scan with a flag to track whether we are inside a block comment. For each line, we iterate through characters while checking for "//" and "/*" or "*/" sequences. If we encounter a line comment outside a block comment, we ignore the rest of the line. If we encounter a block comment start, we enter block comment mode, skipping characters until we find the end of the block comment. This approach preserves line structure and avoids unnecessary string operations.
This approach is linear in the total number of characters in the input, making it efficient for the given constraints.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(N * M) | O(N * M) | Concatenate lines, remove all comments via string replacement, then split back |
| Optimal | O(N * M) | O(N * M) | Single pass through characters, tracking block comments, preserving lines |
Here, N is the number of lines, and M is the average line length.
Algorithm Walkthrough
- Initialize an empty list
resultto store cleaned lines and a boolean flagin_block_commentset toFalse. - Iterate through each line in the input source.
- For each line, initialize an empty list
new_lineto accumulate non-comment characters and a pointerito track the current character index. - While
iis within the bounds of the line:
- If
in_block_commentisTrue, check if the current and next character are"*/". If yes, exit block comment mode and advanceiby 2. Otherwise, skip the current character. - If
in_block_commentisFalse, check if the current and next character are"//". If yes, break the loop since the rest of the line is a line comment. - If the current and next character are
"/*", enter block comment mode and advanceiby 2. - Otherwise, append the current character to
new_lineand advanceiby 1.
- After processing the line, if
new_lineis non-empty and not entirely whitespace, join it into a string and append toresult. - Return the
resultlist.
Why it works: The invariant is that all characters outside block comments are accumulated, and all comments are skipped. Line and block comment rules are followed exactly, ensuring that only valid code is retained.
Python Solution
from typing import List
class Solution:
def removeComments(self, source: List[str]) -> List[str]:
result = []
in_block_comment = False
new_line = []
for line in source:
i = 0
if not in_block_comment:
new_line = []
while i < len(line):
if in_block_comment:
if i + 1 < len(line) and line[i] == '*' and line[i+1] == '/':
in_block_comment = False
i += 2
else:
i += 1
else:
if i + 1 < len(line) and line[i] == '/' and line[i+1] == '/':
break
elif i + 1 < len(line) and line[i] == '/' and line[i+1] == '*':
in_block_comment = True
i += 2
else:
new_line.append(line[i])
i += 1
if not in_block_comment and new_line:
result.append(''.join(new_line))
return result
This Python implementation directly mirrors the algorithm. The main loop handles line traversal, while in_block_comment tracks block comment state. Characters are appended only if outside a comment, ensuring correct line construction.
Go Solution
func removeComments(source []string) []string {
result := []string{}
inBlock := false
var newLine []byte
for _, line := range source {
i := 0
if !inBlock {
newLine = []byte{}
}
for i < len(line) {
if inBlock {
if i+1 < len(line) && line[i] == '*' && line[i+1] == '/' {
inBlock = false
i += 2
} else {
i++
}
} else {
if i+1 < len(line) && line[i] == '/' && line[i+1] == '/' {
break
} else if i+1 < len(line) && line[i] == '/' && line[i+1] == '*' {
inBlock = true
i += 2
} else {
newLine = append(newLine, line[i])
i++
}
}
}
if !inBlock && len(newLine) > 0 {
result = append(result, string(newLine))
}
}
return result
}
The Go solution is almost identical to Python. Slices replace Python lists for character accumulation. newLine is a byte slice to efficiently build strings. The main logic remains unchanged.
Worked Examples
Example 1
Input: ["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment for ", " testing */", "a = b + c;", "}"]
Step-by-step:
| Line | in_block_comment | new_line | Action |
|---|---|---|---|
"/*Test program */" |
False→True→False | [] | Block comment removed |
"int main()" |
False | ['i','n','t',' ','m','a','i','n','(',')'] | Appended to result |
"{ " |
False | ['{',' '] | Appended |
" // variable declaration " |
False | [' ',' '] | Stops at "//", append [' ',' '] |
"int a, b, c;" |
False | ['i','n','t',' ','a',',',' ','b',',',' ','c',';'] | Append |
"/* This is a test" |
False→True | [] | Enter block comment |
" multiline " |
True | [] | Inside block comment, skip |
" comment for " |
True | [] | Skip |
" testing */" |
True→False | [] | End block comment |
"a = b + c;" |
False | ['a',' ','=',' ','b',' ','+',' ','c',';'] | Append |
"}" |
False | ['}'] | Append |
Output: ["int main()","{ "," ","int a, b, c;","a = b + c;","}"]
Example 2
Input: ["a/*comment", "line", "more_comment*/b"]
Step-by-step:
| Line | in_block_comment | new_line | Action |
|---|---|---|---|
"a/*comment" |
False→True | ['a'] | Enter block comment |
"line" |
True | [] | Skip |
"more_comment*/b" |
True→False | ['b'] | Exit block comment and append 'b' |
Output: ["ab"]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N * M) | Each character of each line is visited exactly once |
| Space | O(N * M) | Accumulate non-comment characters into the result list |
The solution is linear with respect to the total number of characters, which is optimal for this problem. Using a single pass ensures we never re-process lines or characters unnecessarily.
Test Cases
# Provided examples
assert Solution().removeComments(["/*Test program */", "int main()", "{ ", " // variable declaration ", "int a, b, c;", "/* This is a test", " multiline ", " comment