LeetCode 158 - Read N Characters Given read4 II - Call Multiple Times

This problem asks us to implement a read(buf, n) function using a restricted API called read4(buf4). The challenge is that we are not allowed to access the file directly, instead, we can only retrieve data in chunks of up to four characters at a time.

LeetCode Problem 158

Difficulty: 🔴 Hard
Topics: Array, Simulation, Interactive

Solution

Problem Understanding

This problem asks us to implement a read(buf, n) function using a restricted API called read4(buf4). The challenge is that we are not allowed to access the file directly, instead, we can only retrieve data in chunks of up to four characters at a time.

At first glance, this may look very similar to LeetCode 157, but there is one crucial difference: the read() function may be called multiple times. That changes the problem significantly because read4() can over-read data that is not immediately needed.

To understand why this matters, consider a file "abcdef" and a call read(buf, 3):

  • read4() returns "abcd"
  • We only need "abc"
  • The extra character "d" must not be lost

If read() is called again later, for example read(buf, 2), we must return "de", meaning the leftover "d" from the previous call must still be available.

In other words, this problem requires us to maintain persistent internal state across multiple invocations of read().

The input conceptually consists of:

  • A hidden file stream
  • A sequence of read(n) calls
  • A destination buffer buf where characters are written

The expected output for each call is the number of characters actually read. This may be smaller than n if the end of the file is reached.

The constraints are relatively small, with file lengths up to 500 and at most 10 queries. This means performance is not extremely demanding. However, the challenge is not about scale, it is about correctly handling leftover characters between calls.

Several edge cases can easily break a naive implementation. A request may ask for fewer than four characters, causing read4() to fetch extra unused characters. The file may end before n characters are read. Multiple consecutive calls may occur after the file is exhausted. Another important detail is that LeetCode persists class variables across test cases, so instance state must be properly initialized rather than accidentally shared globally.

Approaches

Brute Force Approach

A naive implementation would repeatedly call read4() during every invocation of read() and only copy characters until n characters are collected.

The issue with this approach is that it does not preserve unused characters from previous read4() calls. Since read4() always advances the file pointer, any extra characters fetched are permanently consumed.

For example:

file = "abcdef"

read(3):
read4() -> "abcd"
return "abc"

read(2):
read4() -> "ef"

The character "d" disappears entirely.

Because read4() cannot rewind the file pointer, any over-read characters must be buffered somewhere. A brute force solution that ignores this fact is incorrect.

Optimal Approach

The key observation is that read4() reads in fixed chunks of four, while the requested amount n can be any value. Therefore, we must maintain an internal buffer to store leftover characters that were read but not consumed.

We keep:

  • A fixed-size internal buffer of size 4
  • A pointer to the current unread position in that buffer
  • The number of valid characters currently stored

Whenever read() is called:

  1. First consume any leftover characters already in the internal buffer.
  2. If the internal buffer becomes empty, call read4() again.
  3. Continue until either:
  • n characters are copied, or
  • end-of-file is reached.

This ensures no characters are lost between calls.

Approach Time Complexity Space Complexity Notes
Brute Force O(n) O(1) Incorrect for multiple calls because extra characters are discarded
Optimal O(n) O(1) Uses persistent buffer to preserve unread characters

Algorithm Walkthrough

  1. Initialize persistent state in the constructor (or closure for Go). We maintain:
  • buffer4, a reusable internal buffer of size 4
  • buffer_pointer, the next unread index
  • buffer_count, the number of valid characters currently stored
  1. When read(buf, n) is called, initialize a variable total_read = 0. This tracks how many characters have been copied into the destination buffer.
  2. While total_read < n, check whether the internal buffer has remaining characters.
  3. If the internal buffer is exhausted (buffer_pointer == buffer_count), call read4(buffer4) to refill it.
  4. Reset buffer_pointer = 0 because we are starting from the beginning of the refreshed buffer.
  5. If read4() returns 0, we have reached end-of-file, so stop immediately.
  6. Copy characters one by one from the internal buffer into buf until:
  • we reach n requested characters, or
  • the internal buffer becomes exhausted.
  1. Update buffer_pointer as characters are consumed so future calls continue from the correct position.
  2. Return total_read.

Why it works

The invariant is that the internal buffer always stores every unread character returned by read4(). We never discard characters accidentally. The buffer_pointer ensures we resume reading exactly where the previous call stopped. Since read4() is only invoked when the internal buffer is empty, every file character is processed exactly once and returned in the correct order.

Python Solution

from typing import List

# The read4 API is already defined for you.
# def read4(buf4: List[str]) -> int:

class Solution:
    def __init__(self):
        self.buffer4 = [""] * 4
        self.buffer_pointer = 0
        self.buffer_count = 0

    def read(self, buf: List[str], n: int) -> int:
        total_read = 0

        while total_read < n:
            # Refill internal buffer if exhausted
            if self.buffer_pointer == self.buffer_count:
                self.buffer_count = read4(self.buffer4)
                self.buffer_pointer = 0

                # End of file
                if self.buffer_count == 0:
                    break

            # Copy characters from internal buffer
            while (
                total_read < n
                and self.buffer_pointer < self.buffer_count
            ):
                buf[total_read] = self.buffer4[self.buffer_pointer]
                total_read += 1
                self.buffer_pointer += 1

        return total_read

The implementation uses three instance variables to preserve state across multiple calls. buffer4 stores temporary characters fetched from read4(). buffer_pointer tracks the next unread character, while buffer_count records how many valid characters currently exist in the buffer.

Inside read(), we repeatedly attempt to satisfy the requested n characters. Before copying data, we check whether the internal buffer still has unread characters. If not, we refill it using read4().

When read4() returns 0, we know the file is exhausted and stop immediately.

The inner loop copies characters from the internal buffer into the destination buf while updating both pointers. Crucially, unused characters remain inside buffer4 for future invocations.

Go Solution

/**
 * The read4 API is already defined for you.
 *
 *     read4 := func(buf4 []byte) int
 */

var solution = func(read4 func([]byte) int) func([]byte, int) int {
    buffer4 := make([]byte, 4)
    bufferPointer := 0
    bufferCount := 0

    return func(buf []byte, n int) int {
        totalRead := 0

        for totalRead < n {
            // Refill internal buffer if exhausted
            if bufferPointer == bufferCount {
                bufferCount = read4(buffer4)
                bufferPointer = 0

                // End of file
                if bufferCount == 0 {
                    break
                }
            }

            // Copy from internal buffer
            for totalRead < n && bufferPointer < bufferCount {
                buf[totalRead] = buffer4[bufferPointer]
                totalRead++
                bufferPointer++
            }
        }

        return totalRead
    }
}

The Go solution follows the exact same logic as Python, but uses closure variables instead of instance variables. Since Go does not have a Solution class in this problem stub, we declare buffer4, bufferPointer, and bufferCount outside the returned function so they persist between calls.

Slices are used instead of Python lists, and bytes are copied directly into buf. Integer overflow is not a concern because the input size is very small.

Worked Examples

Example 1

file = "abc"
queries = [1, 2, 1]

First Call: read(buf, 1)

Step buffer4 pointer count total_read buf
Initial [] 0 0 0 []
read4() [a,b,c] 0 3 0 []
Copy 'a' [a,b,c] 1 3 1 [a]

Return 1.

Remaining internal buffer:

[b, c]

Second Call: read(buf, 2)

Step buffer4 pointer count total_read buf
Start [a,b,c] 1 3 0 []
Copy 'b' [a,b,c] 2 3 1 [b]
Copy 'c' [a,b,c] 3 3 2 [b,c]

Return 2.

Third Call: read(buf, 1)

Step buffer4 pointer count total_read buf
Exhausted [a,b,c] 3 3 0 []
read4() [] 0 0 0 []

Return 0.

Example 2

file = "abc"
queries = [4, 1]

First Call: read(buf, 4)

Step buffer4 pointer count total_read
read4() [a,b,c] 0 3 0
Copy all [a,b,c] 3 3 3
read4() [] 0 0 3

Return 3.

Second Call: read(buf, 1)

read4() immediately returns 0.

Return 0.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is copied at most once
Space O(1) Internal buffer size is fixed at 4

The time complexity is O(n) because we copy at most n characters into the destination buffer during each call. Although read4() may occasionally read extra characters, each file character is processed only once overall.

The space complexity is O(1) because the auxiliary buffer is always size 4, regardless of input size.

Test Cases

# Mock implementation for local testing

class MockSolution:
    def __init__(self, content):
        self.file = content
        self.file_pointer = 0
        self.buffer4 = [""] * 4
        self.buffer_pointer = 0
        self.buffer_count = 0

    def read4(self, buf4):
        count = 0
        while count < 4 and self.file_pointer < len(self.file):
            buf4[count] = self.file[self.file_pointer]
            self.file_pointer += 1
            count += 1
        return count

    def read(self, buf, n):
        total_read = 0

        while total_read < n:
            if self.buffer_pointer == self.buffer_count:
                self.buffer_count = self.read4(self.buffer4)
                self.buffer_pointer = 0

                if self.buffer_count == 0:
                    break

            while (
                total_read < n
                and self.buffer_pointer < self.buffer_count
            ):
                buf[total_read] = self.buffer4[self.buffer_pointer]
                total_read += 1
                self.buffer_pointer += 1

        return total_read

# Example 1
sol = MockSolution("abc")
buf = [""] * 10
assert sol.read(buf, 1) == 1  # Read first character
assert buf[:1] == ["a"]

assert sol.read(buf, 2) == 2  # Read remaining characters
assert buf[:2] == ["b", "c"]

assert sol.read(buf, 1) == 0  # EOF reached

# Example 2
sol = MockSolution("abc")
buf = [""] * 10
assert sol.read(buf, 4) == 3  # Request exceeds file length
assert buf[:3] == ["a", "b", "c"]

assert sol.read(buf, 1) == 0  # EOF on subsequent call

# Exact multiple of 4
sol = MockSolution("abcdefgh")
buf = [""] * 10
assert sol.read(buf, 4) == 4  # First chunk
assert sol.read(buf, 4) == 4  # Second chunk

# Less than 4 with leftovers
sol = MockSolution("abcdef")
buf = [""] * 10
assert sol.read(buf, 3) == 3  # Leaves one buffered char
assert buf[:3] == ["a", "b", "c"]

assert sol.read(buf, 2) == 2  # Uses leftover + new char
assert buf[:2] == ["d", "e"]

# Single character file
sol = MockSolution("x")
buf = [""] * 10
assert sol.read(buf, 1) == 1  # Single character
assert sol.read(buf, 1) == 0  # EOF

# Multiple EOF calls
sol = MockSolution("ab")
buf = [""] * 10
assert sol.read(buf, 5) == 2  # Entire file consumed
assert sol.read(buf, 5) == 0  # EOF again
assert sol.read(buf, 1) == 0  # Still EOF
Test Why
"abc", [1,2,1] Validates persistent buffering across calls
"abc", [4,1] Ensures EOF handling when request exceeds length
"abcdefgh", [4,4] Tests exact multiples of four
"abcdef", [3,2] Verifies leftover buffer logic
"x", [1,1] Smallest file size
Multiple EOF calls Ensures repeated reads after EOF return 0

Edge Cases

Reading Fewer Than Four Characters

A common source of bugs occurs when read4() retrieves more characters than the current request actually needs. For example, if the file is "abcdef" and we call read(3), read4() may fetch "abcd". A naive implementation might accidentally discard "d".

This implementation avoids the issue by maintaining a persistent internal buffer and pointer. Unused characters remain stored for future calls.

Requesting More Characters Than Exist

If n exceeds the remaining file length, we should return only the available characters. For example, reading 4 characters from "abc" should return 3.

The implementation correctly handles this by checking if read4() returns 0, which signals end-of-file. At that point, reading stops immediately.

Multiple Calls After End-of-File

After the file is exhausted, future calls should consistently return 0 instead of attempting invalid reads.

The implementation handles this naturally. Once read4() returns 0, no additional characters can ever be loaded into the internal buffer, so subsequent calls terminate immediately.

Leftover Characters Across Calls

This is the defining challenge of the problem. Characters fetched but not consumed in one call must remain available later.

For example:

file = "abcdef"

read(1) -> "a"
read(2) -> "bc"
read(3) -> "def"

Without persistent state, this sequence would fail because extra characters fetched during the first call would be lost. The internal buffer mechanism guarantees correctness by preserving unread data exactly as required.