LeetCode 604 - Design Compressed String Iterator
This problem asks us to implement a compressed string iterator. We are given a string where each character is immediately followed by a number representing how many times that character appears consecutively in the uncompressed version. For example, "a3b2" represents "aaabb".
Difficulty: 🟢 Easy
Topics: Array, String, Design, Iterator
Solution
Problem Understanding
This problem asks us to implement a compressed string iterator. We are given a string where each character is immediately followed by a number representing how many times that character appears consecutively in the uncompressed version. For example, "a3b2" represents "aaabb".
We are required to implement a class StringIterator with two methods:
next(), which returns the next character in the uncompressed string or a space if there are no characters left.hasNext(), which returnsTrueif there are any remaining characters to iterate over, otherwiseFalse.
The input guarantees that the compressed string is well-formed: letters alternate with numbers, the numbers are positive integers, and each number can be as large as $10^9$. The string length is capped at 1000, but since repetition counts can be extremely large, fully decompressing the string would be infeasible for space and time efficiency.
Important edge cases include:
- Characters with large repetition counts, which cannot be expanded into a full string in memory.
- Queries for
next()after all characters have been consumed. - Strings with single-character repetitions (like
"a1").
The problem essentially requires a lazy iterator that decompresses characters on the fly.
Approaches
The brute-force approach is straightforward: fully decompress the string into a list of characters, then iterate over the list. This is correct but impractical because a single character can be repeated up to $10^9$ times, which would exceed memory limits.
The optimal approach is to maintain pointers and counters rather than storing the full string. We parse the compressed string into (char, count) pairs, and then use an index to track the current position and a counter to track how many times the current character has been emitted. Each next() call simply decrements the counter, and when it reaches zero, we move to the next character-count pair. hasNext() simply checks if the counter or remaining characters indicate more characters to return.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(N + total_count) | O(total_count) | Decompress the entire string, then iterate. Impractical for large repetitions. |
| Optimal | O(N) for parsing, O(1) per next/hasNext | O(N) | Only store character-count pairs and maintain pointers, decompress lazily. |
Algorithm Walkthrough
- Initialization: Parse the compressed string into a list of
(char, count)pairs. Use a pointerindexto track the current character pair andcurrent_countto track remaining repetitions for that character. - next(): Check if
current_count > 0. If so, decrementcurrent_countand returncurrent_char. Ifcurrent_count == 0, advanceindexto the next pair and setcurrent_countto the corresponding count before returning the character. - hasNext(): Return
Trueif eithercurrent_count > 0orindexis within bounds of the pairs list. - Edge Handling: If
next()is called and no characters remain, return a space.
Why it works: The iterator maintains the invariant that current_count always reflects how many times the current_char still needs to be returned. This guarantees correct sequence generation without ever decompressing the entire string.
Python Solution
class StringIterator:
def __init__(self, compressedString: str):
self.pairs = []
i = 0
n = len(compressedString)
while i < n:
char = compressedString[i]
i += 1
count = 0
while i < n and compressedString[i].isdigit():
count = count * 10 + int(compressedString[i])
i += 1
self.pairs.append([char, count])
self.index = 0
self.current_count = self.pairs[0][1] if self.pairs else 0
def next(self) -> str:
if not self.hasNext():
return " "
char, count = self.pairs[self.index]
self.current_count -= 1
if self.current_count == 0:
self.index += 1
if self.index < len(self.pairs):
self.current_count = self.pairs[self.index][1]
return char
def hasNext(self) -> bool:
return self.index < len(self.pairs)
The constructor parses the string into (char, count) pairs, handling multi-digit numbers. next() uses current_count to emit characters lazily, and hasNext() ensures we only report remaining characters.
Go Solution
type StringIterator struct {
pairs [][2]int
chars []byte
index int
currentCount int
}
func Constructor(compressedString string) StringIterator {
it := StringIterator{}
n := len(compressedString)
for i := 0; i < n; {
char := compressedString[i]
i++
count := 0
for i < n && compressedString[i] >= '0' && compressedString[i] <= '9' {
count = count*10 + int(compressedString[i]-'0')
i++
}
it.pairs = append(it.pairs, [2]int{int(char), count})
}
if len(it.pairs) > 0 {
it.currentCount = it.pairs[0][1]
}
return it
}
func (this *StringIterator) Next() byte {
if !this.HasNext() {
return ' '
}
char := byte(this.pairs[this.index][0])
this.currentCount--
if this.currentCount == 0 {
this.index++
if this.index < len(this.pairs) {
this.currentCount = this.pairs[this.index][1]
}
}
return char
}
func (this *StringIterator) HasNext() bool {
return this.index < len(this.pairs)
}
In Go, we store (char, count) pairs using a slice of [2]int arrays. We handle multi-digit numbers similarly. Next() and HasNext() behave identically to Python. Go's byte type is used for character representation.
Worked Examples
For "L1e2t1C1o1d1e1":
| Step | Index | Current Count | Next Output |
|---|---|---|---|
| init | 0 | 1 | - |
| next() | 0 | 0 -> move to 1 | "L" |
| next() | 1 | 2->1 | "e" |
| next() | 1 | 1->0 -> move to 2 | "e" |
| next() | 2 | 1->0 -> move to 3 | "t" |
| next() | 3 | 1->0 -> move to 4 | "C" |
| next() | 4 | 1->0 -> move to 5 | "o" |
| hasNext() | 5 < len(pairs)=6 | True | - |
| next() | 5 | 1->0 -> move to 6 | "d" |
| hasNext() | 6 < 6? | False | - |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N) | Parsing compressed string takes O(N), next()/hasNext() are O(1) per call. |
| Space | O(N) | Only store character-count pairs, not full uncompressed string. |
The solution is efficient because it avoids generating a potentially huge string while still allowing fast iteration.
Test Cases
# Provided example
obj = StringIterator("L1e2t1C1o1d1e1")
assert obj.next() == "L"
assert obj.next() == "e"
assert obj.next() == "e"
assert obj.next() == "t"
assert obj.next() == "C"
assert obj.next() == "o"
assert obj.hasNext() == True
assert obj.next() == "d"
assert obj.hasNext() == True
assert obj.next() == "e"
assert obj.hasNext() == False
assert obj.next() == " " # no more characters
# Edge cases
obj2 = StringIterator("a1")
assert obj2.next() == "a"
assert obj2.hasNext() == False
assert obj2.next() == " "
obj3 = StringIterator("x10")
for _ in range(10):
assert obj3.next() == "x"
assert obj3.hasNext() == False
| Test | Why |
|---|---|
"L1e2t1C1o1d1e1" |
Validates typical mixed letters and counts |
"a1" |
Single-character string |
"x10" |
Large repetition to test lazy iteration |
Edge Cases
- Single-character compressed string:
"a1"or"Z100"ensures the iterator correctly handles a string with only one type of character and returns the correcthasNext()status after consumption. The implementation handles this usingcurrent_countand movingindex. - Large counts:
"x1000000000"tests that we never expand the full string, as counts can reach $10^9$. Our solution maintains a counter instead of expanding the string, so it handles