LeetCode 900 - RLE Iterator
This problem asks us to design an iterator over a run-length encoded sequence instead of storing the full sequence explicitly.
Difficulty: 🟡 Medium
Topics: Array, Design, Counting, Iterator
Solution
Problem Understanding
This problem asks us to design an iterator over a run-length encoded sequence instead of storing the full sequence explicitly.
In run-length encoding, the array is organized in pairs:
encoding[i]represents how many times a value appearsencoding[i + 1]represents the value itself
For example:
[3,8,2,5]
means:
8 appears 3 times
5 appears 2 times
So the decoded sequence becomes:
[8,8,8,5,5]
The iterator must support repeated calls to:
next(n)
This operation consumes the next n elements from the sequence and returns the last exhausted value.
For example, if the remaining sequence is:
[8,8,5,5]
and we call:
next(3)
then the exhausted elements are:
8,8,5
The returned value is:
5
and the remaining sequence becomes:
[5]
If there are not enough elements left to exhaust n items, we return -1.
The constraints are important:
- Counts can be as large as
10^9 ncan also be as large as10^9- There can be up to
1000calls tonext
These limits immediately tell us that explicitly expanding the sequence is not practical. A sequence like:
[1000000000, 7]
would require storing one billion elements if fully decompressed, which is impossible within normal memory limits.
The key observation is that we never actually need the expanded sequence. We only need to track:
- which run we are currently consuming
- how many elements remain in that run
Several edge cases are important:
- Runs with count
0 - Calls where
nspans multiple runs - Calls that exactly finish a run
- Calls that ask for more elements than remain
- Very large counts and very large
nvalues
The problem guarantees:
- The encoding length is even
- Values are non-negative
nextwill only receive positiven
This allows us to safely process the array in count-value pairs.
Approaches
Brute Force Approach
The most straightforward idea is to fully decode the run-length encoded array into the actual sequence.
For example:
[3,8,2,5]
would become:
[8,8,8,5,5]
Then the iterator can maintain a pointer into this expanded array. Each next(n) call simply advances the pointer by n positions and returns the last consumed element.
This approach is easy to understand and implement. Since the sequence is explicitly stored, answering queries becomes simple pointer arithmetic.
However, this solution is completely impractical for the given constraints. Counts may be as large as 10^9, which means the expanded sequence could contain billions of elements. Both memory usage and construction time become impossible.
Optimal Approach
The key insight is that we never need the fully expanded sequence.
Instead, we can process the encoding directly. Since the encoding already tells us how many times each value appears, we can consume counts mathematically.
We maintain:
- a pointer to the current run
- the remaining count in that run
When next(n) is called:
- If the current run has fewer than
nelements remaining, we consume the entire run and move to the next one. - Otherwise, we consume
nelements from the current run and return its value.
This works because the encoded structure already contains all information needed to simulate exhaustion without materializing the sequence.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(total decoded length) | O(total decoded length) | Fully expands the sequence, impossible for large counts |
| Optimal | O(number of runs processed) per call | O(1) extra space | Consumes runs directly without decompression |
Algorithm Walkthrough
- Store the encoded array and initialize a pointer.
The pointer always points to the current count position in the encoding array. Since the encoding uses pairs:
[count, value, count, value, ...]
the pointer moves in steps of two.
2. During each next(n) call, repeatedly process the current run.
At each step:
encoding[index]is the remaining countencoding[index + 1]is the value
- If the current run contains fewer elements than needed:
remaining_count < n
then consume the entire run.
Subtract that count from n because we still need more elements after exhausting this run.
Move the pointer to the next pair. 4. If the current run contains enough elements:
remaining_count >= n
then reduce the remaining count by n.
The last exhausted element must be the current value, so return it immediately.
5. If all runs are exhausted before consuming n elements, return -1.
Why it works
The algorithm maintains the invariant that all runs before the current pointer are completely exhausted, while the current run accurately stores the remaining unused count.
Each next(n) call consumes exactly the requested number of elements whenever possible. Since runs are processed in order and counts are updated correctly, the returned value is always the final exhausted element. If insufficient elements remain, the algorithm correctly detects exhaustion and returns -1.
Python Solution
from typing import List
class RLEIterator:
def __init__(self, encoding: List[int]):
self.encoding = encoding
self.index = 0
def next(self, n: int) -> int:
while self.index < len(self.encoding):
current_count = self.encoding[self.index]
current_value = self.encoding[self.index + 1]
if current_count < n:
n -= current_count
self.index += 2
else:
self.encoding[self.index] -= n
return current_value
return -1
# Your RLEIterator object will be instantiated and called as such:
# obj = RLEIterator(encoding)
# param_1 = obj.next(n)
The constructor stores the encoded array and initializes a pointer named index.
The pointer always references the current count position inside the encoding array. Since the structure uses count-value pairs, the associated value is always at index + 1.
Inside next, we repeatedly inspect the current run.
If the current run does not contain enough elements, we consume the entire run:
n -= current_count
self.index += 2
This moves us to the next count-value pair.
If the current run contains enough elements, we partially consume it by subtracting n from the remaining count:
self.encoding[self.index] -= n
The current value is then the final exhausted element, so we return it.
If all runs are exhausted before consuming all required elements, the loop ends and we return -1.
Go Solution
type RLEIterator struct {
encoding []int
index int
}
func Constructor(encoding []int) RLEIterator {
return RLEIterator{
encoding: encoding,
index: 0,
}
}
func (this *RLEIterator) Next(n int) int {
for this.index < len(this.encoding) {
currentCount := this.encoding[this.index]
currentValue := this.encoding[this.index+1]
if currentCount < n {
n -= currentCount
this.index += 2
} else {
this.encoding[this.index] -= n
return currentValue
}
}
return -1
}
/**
* Your RLEIterator object will be instantiated and called as such:
* obj := Constructor(encoding);
* param_1 := obj.Next(n);
*/
The Go implementation follows exactly the same logic as the Python solution.
One notable difference is that Go uses struct fields explicitly:
encoding []int
index int
The iterator updates the original encoding slice in place to track remaining counts.
Integer overflow is not a concern because Go's int type safely handles values up to at least 10^9 on standard LeetCode environments.
Worked Examples
Example 1
Input:
encoding = [3,8,0,9,2,5]
This represents:
[8,8,8,5,5]
Initial state:
| Index | Count | Value |
|---|---|---|
| 0 | 3 | 8 |
| 2 | 0 | 9 |
| 4 | 2 | 5 |
Call 1: next(2)
Current run:
3 copies of 8
Since 3 >= 2:
- consume 2 elements
- remaining count becomes 1
Updated encoding:
[1,8,0,9,2,5]
Return:
8
Call 2: next(1)
Current run:
1 copy of 8
Since 1 >= 1:
- consume 1 element
- remaining count becomes 0
Updated encoding:
[0,8,0,9,2,5]
Return:
8
Call 3: next(1)
Current run has count 0, so skip it.
Move to:
2 copies of 5
Since 2 >= 1:
- consume 1 element
- remaining count becomes 1
Updated encoding:
[0,8,0,9,1,5]
Return:
5
Call 4: next(2)
Current run:
1 copy of 5
Consume entire run:
n = 2 - 1 = 1
No runs remain.
Return:
-1
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(k) per call | k is the number of runs traversed during the call |
| Space | O(1) | Only a pointer and a few variables are used |
The iterator never expands the encoded sequence, which avoids enormous memory usage.
Each run is skipped at most once across all calls because the pointer only moves forward. Since the encoding length is at most 1000, the total work remains efficient.
Test Cases
from typing import List
class RLEIterator:
def __init__(self, encoding: List[int]):
self.encoding = encoding
self.index = 0
def next(self, n: int) -> int:
while self.index < len(self.encoding):
current_count = self.encoding[self.index]
current_value = self.encoding[self.index + 1]
if current_count < n:
n -= current_count
self.index += 2
else:
self.encoding[self.index] -= n
return current_value
return -1
# Provided example
obj = RLEIterator([3, 8, 0, 9, 2, 5])
assert obj.next(2) == 8 # consume first two 8s
assert obj.next(1) == 8 # consume last 8
assert obj.next(1) == 5 # consume first 5
assert obj.next(2) == -1 # not enough elements remain
# Single run
obj = RLEIterator([5, 10])
assert obj.next(1) == 10 # partial consumption
assert obj.next(4) == 10 # exact exhaustion
assert obj.next(1) == -1 # already empty
# Zero-count runs
obj = RLEIterator([0, 1, 3, 2])
assert obj.next(1) == 2 # skip zero-count run
assert obj.next(2) == 2 # consume remaining
assert obj.next(1) == -1 # exhausted
# Multiple run transitions
obj = RLEIterator([2, 1, 2, 2, 2, 3])
assert obj.next(3) == 2 # crosses from 1s into 2s
assert obj.next(2) == 3 # consumes 3s
assert obj.next(1) == -1 # no elements left
# Large counts
obj = RLEIterator([1000000000, 7])
assert obj.next(999999999) == 7 # large consumption
assert obj.next(1) == 7 # exact finish
assert obj.next(1) == -1 # exhausted
# Exact boundary transitions
obj = RLEIterator([1, 4, 1, 5])
assert obj.next(1) == 4 # exhaust first run exactly
assert obj.next(1) == 5 # move to next run
assert obj.next(1) == -1 # exhausted
| Test | Why |
|---|---|
| Provided example | Validates the main iterator behavior |
| Single run | Tests exact exhaustion within one run |
| Zero-count runs | Ensures empty runs are skipped correctly |
| Multiple run transitions | Tests crossing between runs |
| Large counts | Verifies no decompression occurs |
| Exact boundary transitions | Tests pointer movement correctness |
Edge Cases
Zero-count runs
An encoding may contain entries like:
[0, 9]
This means the value 9 appears zero times and should effectively be ignored.
A buggy implementation might accidentally return 9 or fail to advance correctly. This implementation handles the case naturally because when the count is smaller than n, the run is skipped immediately.
Exhausting exactly at the boundary
Consider:
[3, 8]
and:
next(3)
The implementation must correctly return 8 and leave the iterator fully exhausted afterward.
This can be a source of off-by-one errors. The solution avoids this by subtracting exactly n from the current count before returning.
Requests larger than remaining elements
Suppose only one element remains but the call is:
next(2)
The iterator must return -1, even though one valid element was partially consumed.
A naive implementation might incorrectly return the last available value. This solution correctly tracks the remaining required amount and only returns a value if all n elements can be exhausted successfully.
Very large counts
Counts can reach 10^9.
Any implementation that expands the sequence into a list or array will run out of memory or time. This solution never stores the decoded sequence and only manipulates counts mathematically, allowing it to handle extremely large inputs efficiently.