LeetCode 900 - RLE Iterator

This problem asks us to design an iterator over a run-length encoded sequence instead of storing the full sequence explicitly.

LeetCode Problem 900

Difficulty: 🟡 Medium
Topics: Array, Design, Counting, Iterator

Solution

Problem Understanding

This problem asks us to design an iterator over a run-length encoded sequence instead of storing the full sequence explicitly.

In run-length encoding, the array is organized in pairs:

  • encoding[i] represents how many times a value appears
  • encoding[i + 1] represents the value itself

For example:

[3,8,2,5]

means:

8 appears 3 times
5 appears 2 times

So the decoded sequence becomes:

[8,8,8,5,5]

The iterator must support repeated calls to:

next(n)

This operation consumes the next n elements from the sequence and returns the last exhausted value.

For example, if the remaining sequence is:

[8,8,5,5]

and we call:

next(3)

then the exhausted elements are:

8,8,5

The returned value is:

5

and the remaining sequence becomes:

[5]

If there are not enough elements left to exhaust n items, we return -1.

The constraints are important:

  • Counts can be as large as 10^9
  • n can also be as large as 10^9
  • There can be up to 1000 calls to next

These limits immediately tell us that explicitly expanding the sequence is not practical. A sequence like:

[1000000000, 7]

would require storing one billion elements if fully decompressed, which is impossible within normal memory limits.

The key observation is that we never actually need the expanded sequence. We only need to track:

  • which run we are currently consuming
  • how many elements remain in that run

Several edge cases are important:

  • Runs with count 0
  • Calls where n spans multiple runs
  • Calls that exactly finish a run
  • Calls that ask for more elements than remain
  • Very large counts and very large n values

The problem guarantees:

  • The encoding length is even
  • Values are non-negative
  • next will only receive positive n

This allows us to safely process the array in count-value pairs.

Approaches

Brute Force Approach

The most straightforward idea is to fully decode the run-length encoded array into the actual sequence.

For example:

[3,8,2,5]

would become:

[8,8,8,5,5]

Then the iterator can maintain a pointer into this expanded array. Each next(n) call simply advances the pointer by n positions and returns the last consumed element.

This approach is easy to understand and implement. Since the sequence is explicitly stored, answering queries becomes simple pointer arithmetic.

However, this solution is completely impractical for the given constraints. Counts may be as large as 10^9, which means the expanded sequence could contain billions of elements. Both memory usage and construction time become impossible.

Optimal Approach

The key insight is that we never need the fully expanded sequence.

Instead, we can process the encoding directly. Since the encoding already tells us how many times each value appears, we can consume counts mathematically.

We maintain:

  • a pointer to the current run
  • the remaining count in that run

When next(n) is called:

  • If the current run has fewer than n elements remaining, we consume the entire run and move to the next one.
  • Otherwise, we consume n elements from the current run and return its value.

This works because the encoded structure already contains all information needed to simulate exhaustion without materializing the sequence.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(total decoded length) O(total decoded length) Fully expands the sequence, impossible for large counts
Optimal O(number of runs processed) per call O(1) extra space Consumes runs directly without decompression

Algorithm Walkthrough

  1. Store the encoded array and initialize a pointer.

The pointer always points to the current count position in the encoding array. Since the encoding uses pairs:

[count, value, count, value, ...]

the pointer moves in steps of two. 2. During each next(n) call, repeatedly process the current run.

At each step:

  • encoding[index] is the remaining count
  • encoding[index + 1] is the value
  1. If the current run contains fewer elements than needed:
remaining_count < n

then consume the entire run.

Subtract that count from n because we still need more elements after exhausting this run.

Move the pointer to the next pair. 4. If the current run contains enough elements:

remaining_count >= n

then reduce the remaining count by n.

The last exhausted element must be the current value, so return it immediately. 5. If all runs are exhausted before consuming n elements, return -1.

Why it works

The algorithm maintains the invariant that all runs before the current pointer are completely exhausted, while the current run accurately stores the remaining unused count.

Each next(n) call consumes exactly the requested number of elements whenever possible. Since runs are processed in order and counts are updated correctly, the returned value is always the final exhausted element. If insufficient elements remain, the algorithm correctly detects exhaustion and returns -1.

Python Solution

from typing import List

class RLEIterator:

    def __init__(self, encoding: List[int]):
        self.encoding = encoding
        self.index = 0

    def next(self, n: int) -> int:
        while self.index < len(self.encoding):

            current_count = self.encoding[self.index]
            current_value = self.encoding[self.index + 1]

            if current_count < n:
                n -= current_count
                self.index += 2
            else:
                self.encoding[self.index] -= n
                return current_value

        return -1

# Your RLEIterator object will be instantiated and called as such:
# obj = RLEIterator(encoding)
# param_1 = obj.next(n)

The constructor stores the encoded array and initializes a pointer named index.

The pointer always references the current count position inside the encoding array. Since the structure uses count-value pairs, the associated value is always at index + 1.

Inside next, we repeatedly inspect the current run.

If the current run does not contain enough elements, we consume the entire run:

n -= current_count
self.index += 2

This moves us to the next count-value pair.

If the current run contains enough elements, we partially consume it by subtracting n from the remaining count:

self.encoding[self.index] -= n

The current value is then the final exhausted element, so we return it.

If all runs are exhausted before consuming all required elements, the loop ends and we return -1.

Go Solution

type RLEIterator struct {
	encoding []int
	index    int
}

func Constructor(encoding []int) RLEIterator {
	return RLEIterator{
		encoding: encoding,
		index:    0,
	}
}

func (this *RLEIterator) Next(n int) int {

	for this.index < len(this.encoding) {

		currentCount := this.encoding[this.index]
		currentValue := this.encoding[this.index+1]

		if currentCount < n {
			n -= currentCount
			this.index += 2
		} else {
			this.encoding[this.index] -= n
			return currentValue
		}
	}

	return -1
}

/**
 * Your RLEIterator object will be instantiated and called as such:
 * obj := Constructor(encoding);
 * param_1 := obj.Next(n);
 */

The Go implementation follows exactly the same logic as the Python solution.

One notable difference is that Go uses struct fields explicitly:

encoding []int
index int

The iterator updates the original encoding slice in place to track remaining counts.

Integer overflow is not a concern because Go's int type safely handles values up to at least 10^9 on standard LeetCode environments.

Worked Examples

Example 1

Input:

encoding = [3,8,0,9,2,5]

This represents:

[8,8,8,5,5]

Initial state:

Index Count Value
0 3 8
2 0 9
4 2 5

Call 1: next(2)

Current run:

3 copies of 8

Since 3 >= 2:

  • consume 2 elements
  • remaining count becomes 1

Updated encoding:

[1,8,0,9,2,5]

Return:

8

Call 2: next(1)

Current run:

1 copy of 8

Since 1 >= 1:

  • consume 1 element
  • remaining count becomes 0

Updated encoding:

[0,8,0,9,2,5]

Return:

8

Call 3: next(1)

Current run has count 0, so skip it.

Move to:

2 copies of 5

Since 2 >= 1:

  • consume 1 element
  • remaining count becomes 1

Updated encoding:

[0,8,0,9,1,5]

Return:

5

Call 4: next(2)

Current run:

1 copy of 5

Consume entire run:

n = 2 - 1 = 1

No runs remain.

Return:

-1

Complexity Analysis

Measure Complexity Explanation
Time O(k) per call k is the number of runs traversed during the call
Space O(1) Only a pointer and a few variables are used

The iterator never expands the encoded sequence, which avoids enormous memory usage.

Each run is skipped at most once across all calls because the pointer only moves forward. Since the encoding length is at most 1000, the total work remains efficient.

Test Cases

from typing import List

class RLEIterator:

    def __init__(self, encoding: List[int]):
        self.encoding = encoding
        self.index = 0

    def next(self, n: int) -> int:
        while self.index < len(self.encoding):

            current_count = self.encoding[self.index]
            current_value = self.encoding[self.index + 1]

            if current_count < n:
                n -= current_count
                self.index += 2
            else:
                self.encoding[self.index] -= n
                return current_value

        return -1

# Provided example
obj = RLEIterator([3, 8, 0, 9, 2, 5])
assert obj.next(2) == 8   # consume first two 8s
assert obj.next(1) == 8   # consume last 8
assert obj.next(1) == 5   # consume first 5
assert obj.next(2) == -1  # not enough elements remain

# Single run
obj = RLEIterator([5, 10])
assert obj.next(1) == 10  # partial consumption
assert obj.next(4) == 10  # exact exhaustion
assert obj.next(1) == -1  # already empty

# Zero-count runs
obj = RLEIterator([0, 1, 3, 2])
assert obj.next(1) == 2   # skip zero-count run
assert obj.next(2) == 2   # consume remaining
assert obj.next(1) == -1  # exhausted

# Multiple run transitions
obj = RLEIterator([2, 1, 2, 2, 2, 3])
assert obj.next(3) == 2   # crosses from 1s into 2s
assert obj.next(2) == 3   # consumes 3s
assert obj.next(1) == -1  # no elements left

# Large counts
obj = RLEIterator([1000000000, 7])
assert obj.next(999999999) == 7  # large consumption
assert obj.next(1) == 7          # exact finish
assert obj.next(1) == -1         # exhausted

# Exact boundary transitions
obj = RLEIterator([1, 4, 1, 5])
assert obj.next(1) == 4   # exhaust first run exactly
assert obj.next(1) == 5   # move to next run
assert obj.next(1) == -1  # exhausted
Test Why
Provided example Validates the main iterator behavior
Single run Tests exact exhaustion within one run
Zero-count runs Ensures empty runs are skipped correctly
Multiple run transitions Tests crossing between runs
Large counts Verifies no decompression occurs
Exact boundary transitions Tests pointer movement correctness

Edge Cases

Zero-count runs

An encoding may contain entries like:

[0, 9]

This means the value 9 appears zero times and should effectively be ignored.

A buggy implementation might accidentally return 9 or fail to advance correctly. This implementation handles the case naturally because when the count is smaller than n, the run is skipped immediately.

Exhausting exactly at the boundary

Consider:

[3, 8]

and:

next(3)

The implementation must correctly return 8 and leave the iterator fully exhausted afterward.

This can be a source of off-by-one errors. The solution avoids this by subtracting exactly n from the current count before returning.

Requests larger than remaining elements

Suppose only one element remains but the call is:

next(2)

The iterator must return -1, even though one valid element was partially consumed.

A naive implementation might incorrectly return the last available value. This solution correctly tracks the remaining required amount and only returns a value if all n elements can be exhausted successfully.

Very large counts

Counts can reach 10^9.

Any implementation that expands the sequence into a list or array will run out of memory or time. This solution never stores the decoded sequence and only manipulates counts mathematically, allowing it to handle extremely large inputs efficiently.