LeetCode 604 - Design Compressed String Iterator

This problem asks us to implement a compressed string iterator. We are given a string where each character is immediately followed by a number representing how many times that character appears consecutively in the uncompressed version. For example, "a3b2" represents "aaabb".

LeetCode Problem 604

Difficulty: 🟢 Easy
Topics: Array, String, Design, Iterator

Solution

Problem Understanding

This problem asks us to implement a compressed string iterator. We are given a string where each character is immediately followed by a number representing how many times that character appears consecutively in the uncompressed version. For example, "a3b2" represents "aaabb".

We are required to implement a class StringIterator with two methods:

  • next(), which returns the next character in the uncompressed string or a space if there are no characters left.
  • hasNext(), which returns True if there are any remaining characters to iterate over, otherwise False.

The input guarantees that the compressed string is well-formed: letters alternate with numbers, the numbers are positive integers, and each number can be as large as $10^9$. The string length is capped at 1000, but since repetition counts can be extremely large, fully decompressing the string would be infeasible for space and time efficiency.

Important edge cases include:

  • Characters with large repetition counts, which cannot be expanded into a full string in memory.
  • Queries for next() after all characters have been consumed.
  • Strings with single-character repetitions (like "a1").

The problem essentially requires a lazy iterator that decompresses characters on the fly.

Approaches

The brute-force approach is straightforward: fully decompress the string into a list of characters, then iterate over the list. This is correct but impractical because a single character can be repeated up to $10^9$ times, which would exceed memory limits.

The optimal approach is to maintain pointers and counters rather than storing the full string. We parse the compressed string into (char, count) pairs, and then use an index to track the current position and a counter to track how many times the current character has been emitted. Each next() call simply decrements the counter, and when it reaches zero, we move to the next character-count pair. hasNext() simply checks if the counter or remaining characters indicate more characters to return.

Approach Time Complexity Space Complexity Notes
Brute Force O(N + total_count) O(total_count) Decompress the entire string, then iterate. Impractical for large repetitions.
Optimal O(N) for parsing, O(1) per next/hasNext O(N) Only store character-count pairs and maintain pointers, decompress lazily.

Algorithm Walkthrough

  1. Initialization: Parse the compressed string into a list of (char, count) pairs. Use a pointer index to track the current character pair and current_count to track remaining repetitions for that character.
  2. next(): Check if current_count > 0. If so, decrement current_count and return current_char. If current_count == 0, advance index to the next pair and set current_count to the corresponding count before returning the character.
  3. hasNext(): Return True if either current_count > 0 or index is within bounds of the pairs list.
  4. Edge Handling: If next() is called and no characters remain, return a space.

Why it works: The iterator maintains the invariant that current_count always reflects how many times the current_char still needs to be returned. This guarantees correct sequence generation without ever decompressing the entire string.

Python Solution

class StringIterator:

    def __init__(self, compressedString: str):
        self.pairs = []
        i = 0
        n = len(compressedString)
        while i < n:
            char = compressedString[i]
            i += 1
            count = 0
            while i < n and compressedString[i].isdigit():
                count = count * 10 + int(compressedString[i])
                i += 1
            self.pairs.append([char, count])
        self.index = 0
        self.current_count = self.pairs[0][1] if self.pairs else 0

    def next(self) -> str:
        if not self.hasNext():
            return " "
        char, count = self.pairs[self.index]
        self.current_count -= 1
        if self.current_count == 0:
            self.index += 1
            if self.index < len(self.pairs):
                self.current_count = self.pairs[self.index][1]
        return char

    def hasNext(self) -> bool:
        return self.index < len(self.pairs)

The constructor parses the string into (char, count) pairs, handling multi-digit numbers. next() uses current_count to emit characters lazily, and hasNext() ensures we only report remaining characters.

Go Solution

type StringIterator struct {
	pairs        [][2]int
	chars        []byte
	index        int
	currentCount int
}

func Constructor(compressedString string) StringIterator {
	it := StringIterator{}
	n := len(compressedString)
	for i := 0; i < n; {
		char := compressedString[i]
		i++
		count := 0
		for i < n && compressedString[i] >= '0' && compressedString[i] <= '9' {
			count = count*10 + int(compressedString[i]-'0')
			i++
		}
		it.pairs = append(it.pairs, [2]int{int(char), count})
	}
	if len(it.pairs) > 0 {
		it.currentCount = it.pairs[0][1]
	}
	return it
}

func (this *StringIterator) Next() byte {
	if !this.HasNext() {
		return ' '
	}
	char := byte(this.pairs[this.index][0])
	this.currentCount--
	if this.currentCount == 0 {
		this.index++
		if this.index < len(this.pairs) {
			this.currentCount = this.pairs[this.index][1]
		}
	}
	return char
}

func (this *StringIterator) HasNext() bool {
	return this.index < len(this.pairs)
}

In Go, we store (char, count) pairs using a slice of [2]int arrays. We handle multi-digit numbers similarly. Next() and HasNext() behave identically to Python. Go's byte type is used for character representation.

Worked Examples

For "L1e2t1C1o1d1e1":

Step Index Current Count Next Output
init 0 1 -
next() 0 0 -> move to 1 "L"
next() 1 2->1 "e"
next() 1 1->0 -> move to 2 "e"
next() 2 1->0 -> move to 3 "t"
next() 3 1->0 -> move to 4 "C"
next() 4 1->0 -> move to 5 "o"
hasNext() 5 < len(pairs)=6 True -
next() 5 1->0 -> move to 6 "d"
hasNext() 6 < 6? False -

Complexity Analysis

Measure Complexity Explanation
Time O(N) Parsing compressed string takes O(N), next()/hasNext() are O(1) per call.
Space O(N) Only store character-count pairs, not full uncompressed string.

The solution is efficient because it avoids generating a potentially huge string while still allowing fast iteration.

Test Cases

# Provided example
obj = StringIterator("L1e2t1C1o1d1e1")
assert obj.next() == "L"
assert obj.next() == "e"
assert obj.next() == "e"
assert obj.next() == "t"
assert obj.next() == "C"
assert obj.next() == "o"
assert obj.hasNext() == True
assert obj.next() == "d"
assert obj.hasNext() == True
assert obj.next() == "e"
assert obj.hasNext() == False
assert obj.next() == " "  # no more characters

# Edge cases
obj2 = StringIterator("a1")
assert obj2.next() == "a"
assert obj2.hasNext() == False
assert obj2.next() == " "

obj3 = StringIterator("x10")
for _ in range(10):
    assert obj3.next() == "x"
assert obj3.hasNext() == False
Test Why
"L1e2t1C1o1d1e1" Validates typical mixed letters and counts
"a1" Single-character string
"x10" Large repetition to test lazy iteration

Edge Cases

  1. Single-character compressed string: "a1" or "Z100" ensures the iterator correctly handles a string with only one type of character and returns the correct hasNext() status after consumption. The implementation handles this using current_count and moving index.
  2. Large counts: "x1000000000" tests that we never expand the full string, as counts can reach $10^9$. Our solution maintains a counter instead of expanding the string, so it handles