LeetCode 3037 - Find Pattern in Infinite Stream II

The problem requires us to detect the first occurrence of a binary pattern within a theoretically infinite stream of bits. The stream is accessed sequentially using the next() method, which returns one bit at a time.

LeetCode Problem 3037

Difficulty: 🔴 Hard
Topics: Array, Sliding Window, Rolling Hash, String Matching, Interactive, Hash Function

Solution

Problem Understanding

The problem requires us to detect the first occurrence of a binary pattern within a theoretically infinite stream of bits. The stream is accessed sequentially using the next() method, which returns one bit at a time. Our task is to determine the index in the stream where the given pattern first fully matches a contiguous sequence of bits.

The input consists of two components: pattern, which is a finite array of 0s and 1s, and stream, which is an InfiniteStream object. The output is an integer representing the 0-based starting index of the first complete match of pattern in the stream.

The constraints tell us that pattern has a length up to 10^4, and we are guaranteed that the pattern occurs within the first 10^5 bits of the stream. This is significant because it bounds the number of next() calls we need to make and ensures our algorithm does not need to handle truly infinite streams in practice.

Important edge cases include a pattern of length 1, streams starting with the pattern, repeated patterns in the stream, and scenarios where the pattern occurs near the upper bound of the first 10^5 bits. The guarantee that the pattern exists prevents the need to handle a "not found" result.

Approaches

The brute-force approach would involve reading each bit from the stream sequentially and checking for a match with the pattern from that position. This is correct but inefficient, especially if the pattern is long. For each starting index i, we would compare pattern.length bits, which results in an O(N * M) time complexity for N bits read and M the pattern length. While this is acceptable given the 10^5 bound, it is suboptimal for larger streams or longer patterns.

A better approach uses rolling hash or Rabin-Karp string matching. The idea is to maintain a hash of the last pattern.length bits seen in the stream and compare it to the precomputed hash of the pattern. This allows constant-time updates per bit using the properties of rolling hash, avoiding repeated array comparisons and yielding a linear O(N) time complexity relative to the number of bits read.

The key insight is that we do not need to store the entire stream, only the last pattern.length bits, and hash comparisons are sufficient to detect matches reliably.

Approach Time Complexity Space Complexity Notes
Brute Force O(N * M) O(M) Compare pattern with each window of size M explicitly
Rolling Hash O(N) O(M) Use a hash of last M bits and update incrementally per new bit

Algorithm Walkthrough

  1. Compute the integer hash of the pattern by treating it as a binary number.
  2. Initialize a rolling hash for the stream, starting at 0. Also, maintain a queue or buffer of the last pattern.length bits to update the hash correctly.
  3. Iterate over the stream using next() to read one bit at a time.
  4. For each bit, shift the rolling hash left by one and add the new bit. Apply modulo 2^pattern_length to remove the oldest bit and keep the hash within pattern.length bits.
  5. When the buffer reaches the length of the pattern, compare the rolling hash to the pattern hash.
  6. If the hashes match, return the current start index, which is the current index minus pattern.length plus one.
  7. Otherwise, continue reading the next bit until a match is found.

Why it works: The rolling hash maintains a constant-time representation of the last pattern.length bits in the stream. Because the pattern's occurrence is guaranteed, the algorithm will eventually find a hash match corresponding to the exact sequence of bits. The modulo operation ensures only the last pattern.length bits influence the hash, correctly sliding the window along the stream.

Python Solution

from typing import List, Optional

# Definition for an infinite stream.
# class InfiniteStream:
#     def next(self) -> int:
#         pass

class Solution:
    def findPattern(self, stream: Optional['InfiniteStream'], pattern: List[int]) -> int:
        pattern_len = len(pattern)
        pattern_hash = 0
        for bit in pattern:
            pattern_hash = (pattern_hash << 1) | bit
        
        rolling_hash = 0
        buffer = []
        index = 0
        
        while True:
            bit = stream.next()
            rolling_hash = ((rolling_hash << 1) | bit) & ((1 << pattern_len) - 1)
            buffer.append(bit)
            if len(buffer) > pattern_len:
                buffer.pop(0)
            if len(buffer) == pattern_len and rolling_hash == pattern_hash:
                return index - pattern_len + 1
            index += 1

In this Python implementation, we first compute the hash of the pattern by treating it as a binary number. We then initialize the rolling hash and read each bit from the stream. The buffer ensures we maintain only the last pattern.length bits. By shifting and masking the rolling hash, we efficiently slide the window along the stream. Once the rolling hash matches the pattern hash, we return the correct starting index.

Go Solution

/**
 * Definition for an infinite stream.
 * type InfiniteStream interface {
 *     Next() int
 * }
 */
func findPattern(stream InfiniteStream, pattern []int) int {
    patternLen := len(pattern)
    patternHash := 0
    for _, bit := range pattern {
        patternHash = (patternHash << 1) | bit
    }

    rollingHash := 0
    buffer := make([]int, 0, patternLen)
    index := 0

    mask := (1 << patternLen) - 1

    for {
        bit := stream.Next()
        rollingHash = ((rollingHash << 1) | bit) & mask
        buffer = append(buffer, bit)
        if len(buffer) > patternLen {
            buffer = buffer[1:]
        }
        if len(buffer) == patternLen && rollingHash == patternHash {
            return index - patternLen + 1
        }
        index++
    }
}

In Go, we handle slices carefully to ensure efficient memory usage, pre-allocating the buffer capacity. Masking with (1 << patternLen) - 1 maintains only the last patternLen bits. The algorithm mirrors the Python version in logic but is adapted to Go syntax and slice semantics.

Worked Examples

Example 1: pattern = [0,1], stream = [1,1,1,0,1,...]

Step Bit read Rolling hash Buffer Index Match?
0 1 1 [1] 0 No
1 1 3 [1,1] 1 No
2 1 3 [1,1] 2 No
3 0 2 [1,0] 3 No
4 1 1 [0,1] 4 Yes

First occurrence index = 3.

Example 2: pattern = [0], stream = [0,0,0,...]

Step Bit read Rolling hash Buffer Index Match?
0 0 0 [0] 0 Yes

First occurrence index = 0.

Example 3: pattern = [1,1,0,1], stream = [1,0,1,1,0,1,...]

Step Bit read Rolling hash Buffer Index Match?
0 1 1 [1] 0 No
1 0 2 [1,0] 1 No
2 1 5 [1,0,1] 2 No
3 1 11 [1,0,1,1] 3 No
4 0 6 [0,1,1,0] 4 No
5 1 13 [1,1,0,1] 5 Yes

First occurrence index = 2.

Complexity Analysis

Measure Complexity Explanation
Time O(N) Each bit in the first 10^5 bits is processed once with constant-time hash updates
Space O(M) We store only the last pattern.length bits in a buffer

This complexity is efficient because it leverages constant-time updates per bit without storing the entire stream, and the guaranteed occurrence ensures termination.

Test Cases

# Example tests
assert Solution().findPattern(InfiniteStream([1,1,1,0,1]), [0,1]) == 3  # first example
assert Solution().findPattern(InfiniteStream([0,0,0,0]), [0]) == 0  # second example
assert Solution().findPattern(InfiniteStream([1,0,1,1,0,1,1,0]), [1,1,0,1]) == 2  # third example

# Edge and boundary cases
assert Solution().findPattern(In