LeetCode 1461 - Check If a String Contains All Binary Codes of Size K

The problem gives us a binary string s and an integer k. A binary string contains only the characters '0' and '1'. We mu

LeetCode Problem 1461

Difficulty: 🟡 Medium
Topics: Hash Table, String, Bit Manipulation, Rolling Hash, Hash Function

Solution

Problem Understanding

The problem gives us a binary string s and an integer k. A binary string contains only the characters '0' and '1'. We must determine whether every possible binary code of length k appears somewhere inside s as a substring.

A binary code of length k is simply any sequence of k bits. Since each position can independently be 0 or 1, there are exactly:

$2^k$

possible binary codes of length k.

For example, if k = 2, the possible codes are:

  • "00"
  • "01"
  • "10"
  • "11"

We need to check whether all of these appear somewhere in the string.

A substring means a contiguous section of the string. If s = "00110110" and k = 2, the length-2 substrings are:

  • "00"
  • "01"
  • "11"
  • "10"
  • "01"
  • "11"
  • "10"

All four possible binary codes appear, so the answer is true.

The constraints are important:

  • s.length can be as large as 500,000
  • k can be up to 20

These limits immediately rule out inefficient solutions. A naive approach that repeatedly searches the string for every possible binary code would become too slow when k is large.

Another important observation comes from the total number of possible substrings. A string of length n contains at most:

$n-k+1$

substrings of length k.

If this quantity is smaller than 2^k, then it is impossible for all codes to exist. This gives us an important early optimization.

Some important edge cases include:

  • Very small strings
  • k = 1
  • Strings that contain many repeated patterns
  • Cases where the number of available substrings is smaller than 2^k
  • Large values of k, where generating all binary strings explicitly becomes expensive

The problem guarantees that the string contains only '0' and '1', which simplifies the implementation.

Approaches

Brute Force Approach

The most direct approach is to generate every possible binary string of length k, then check whether each one exists as a substring inside s.

For example, if k = 3, we would generate:

  • "000"
  • "001"
  • "010"
  • "011"
  • "100"
  • "101"
  • "110"
  • "111"

For each generated string, we could use substring search to determine whether it appears in s.

This approach is correct because it explicitly verifies every required binary code. However, it is too slow for the given constraints.

There are 2^k possible codes, and substring searching costs up to O(n) each time. The total complexity becomes:

$O(n\cdot 2^k)$

When k = 20, there are more than one million possible binary codes, making this infeasible.

Optimal Approach

The key insight is that we do not need to generate all binary strings explicitly. Instead, we can scan through the input string once and record every distinct substring of length k.

If the number of distinct substrings we discover equals 2^k, then every binary code exists.

A rolling hash or bitmask technique makes this especially efficient.

Because the string contains only binary digits, each substring of length k can be represented as an integer between 0 and 2^k - 1.

For example:

Binary Integer
00 0
01 1
10 2
11 3

As we slide a window across the string, we continuously update this integer representation in constant time using bit operations.

This avoids repeated substring construction and gives an efficient linear-time solution.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n × 2^k) O(1) Generates every possible binary code and searches for each
Optimal O(n) O(2^k) Uses rolling bitmask to track observed substrings

Algorithm Walkthrough

  1. First, compute the total number of binary codes we must find.

There are:

$2^k$

possible binary strings of length k. 2. Check whether the string is even long enough to contain all codes.

A string of length n contains at most:

$n-k+1$

substrings of length k.

If this number is smaller than 2^k, immediately return false. 3. Create a set to store all distinct binary codes encountered.

We use a set because we only care whether a code has appeared, not how many times. 4. Maintain a rolling integer value representing the current window of size k.

As we move through the string:

  • Shift the current value left by one bit
  • Add the new bit from the string
  • Remove bits beyond length k using a mask
  1. The mask ensures that only the last k bits remain.

The mask value is:

$2^k-1$

This keeps the rolling value within exactly k bits. 6. Once we have processed at least k characters, add the current rolling value to the set. 7. If the set size reaches 2^k, return true immediately.

This means we have discovered every possible binary code. 8. If the loop finishes without finding all codes, return false.

Why it works

Every substring of length k corresponds uniquely to a k-bit integer. The rolling bitmask always represents the current substring window exactly. By scanning every possible window once and storing all distinct values, we collect precisely the set of binary codes present in the string. If the number of distinct codes equals 2^k, then every possible binary code has appeared.

Python Solution

class Solution:
    def hasAllCodes(self, s: str, k: int) -> bool:
        n = len(s)
        total_codes = 1 << k
        
        # Not enough substrings to contain all codes
        if n - k + 1 < total_codes:
            return False
        
        seen = set()
        current = 0
        mask = total_codes - 1
        
        for i, char in enumerate(s):
            current = ((current << 1) & mask) | int(char)
            
            # Start recording once window size reaches k
            if i >= k - 1:
                seen.add(current)
                
                # Early exit if all codes found
                if len(seen) == total_codes:
                    return True
        
        return False

The implementation begins by calculating the total number of possible binary codes using 1 << k, which computes 2^k.

The early impossibility check is extremely important. If the string does not contain enough length-k substrings, there is no need to continue.

The seen set stores integer representations of discovered binary substrings.

The variable current acts as a rolling hash. Each iteration shifts the previous bits left and inserts the new bit from the string.

The expression:

((current << 1) & mask)

removes any extra bits beyond length k.

Once the index reaches k - 1, we know the current rolling value represents a complete substring of size k, so we add it to the set.

The early return optimization avoids unnecessary work once all codes are discovered.

Go Solution

func hasAllCodes(s string, k int) bool {
	n := len(s)
	totalCodes := 1 << k

	// Not enough substrings to contain all codes
	if n-k+1 < totalCodes {
		return false
	}

	seen := make(map[int]bool)
	current := 0
	mask := totalCodes - 1

	for i := 0; i < n; i++ {
		bit := int(s[i] - '0')

		current = ((current << 1) & mask) | bit

		// Start recording once window size reaches k
		if i >= k-1 {
			seen[current] = true

			// Early exit if all codes found
			if len(seen) == totalCodes {
				return true
			}
		}
	}

	return false
}

The Go implementation follows the same logic as the Python version.

Instead of a Python set, Go uses map[int]bool to track observed binary codes.

Character conversion differs slightly because Go strings are byte slices. The expression:

int(s[i] - '0')

converts the character '0' or '1' into integers 0 or 1.

Go integers are large enough for the constraints because k <= 20, meaning the maximum stored value is less than 2^20.

Worked Examples

Example 1

Input:

s = "00110110"
k = 2

Total required codes:

2^2 = 4

Mask:

11 (binary) = 3
Index Char Current Binary Current Integer Seen Set
0 0 0 0 {}
1 0 00 0 {0}
2 1 01 1 {0,1}
3 1 11 3 {0,1,3}
4 0 10 2 {0,1,2,3}

At this point, all four codes are found, so return true.

Example 2

Input:

s = "0110"
k = 1

Possible codes:

  • 0
  • 1
Index Char Current Seen
0 0 0 {0}
1 1 1 {0,1}

All codes are found, so return true.

Example 3

Input:

s = "0110"
k = 2

Possible codes:

  • 00
  • 01
  • 10
  • 11
Index Window Integer Seen
1 01 1 {1}
2 11 3 {1,3}
3 10 2 {1,2,3}

The code 00 never appears, so return false.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is processed once
Space O(2^k) The set may store every possible binary code

The algorithm scans the string exactly once. Every operation inside the loop is constant time because bit manipulation and hash set insertion are both efficient.

The space complexity depends on how many unique binary codes are stored. In the worst case, we may store all 2^k possible codes.

Test Cases

solution = Solution()

# Provided examples
assert solution.hasAllCodes("00110110", 2) == True  # all 2-bit codes exist
assert solution.hasAllCodes("0110", 1) == True      # both 0 and 1 exist
assert solution.hasAllCodes("0110", 2) == False     # missing "00"

# Minimum size input
assert solution.hasAllCodes("0", 1) == False        # missing "1"
assert solution.hasAllCodes("1", 1) == False        # missing "0"

# Exact coverage
assert solution.hasAllCodes("00110", 2) == True     # contains all 2-bit patterns

# Impossible due to insufficient substrings
assert solution.hasAllCodes("000", 3) == False      # only one substring available

# Repeated patterns
assert solution.hasAllCodes("111111", 2) == False   # only "11" appears

# Large repeated block
assert solution.hasAllCodes("0000000000", 1) == False  # missing 1

# All 1-bit codes
assert solution.hasAllCodes("01", 1) == True

# Missing one pattern
assert solution.hasAllCodes("0011", 2) == False     # missing "10"

# Longer valid case
assert solution.hasAllCodes("0001011100", 3) == True  # contains all 3-bit codes

Test Summary

Test Why
"00110110", 2 Standard successful case
"0110", 1 Smallest meaningful k
"0110", 2 Missing one required code
"0", 1 Minimum input size
"1", 1 Another minimum boundary
"00110", 2 Exact coverage without extra substrings
"000", 3 Impossible because too few substrings
"111111", 2 Repeated identical pattern
"0000000000", 1 Only one binary digit present
"01", 1 Minimal successful input
"0011", 2 Near-complete but missing one pattern
"0001011100", 3 Larger valid example

Edge Cases

One important edge case occurs when the string is too short to possibly contain all binary codes. For example, if s = "000" and k = 3, there is only one substring of length three, but eight different binary codes are required. A naive implementation might still perform unnecessary work. The early impossibility check prevents this inefficiency and immediately returns false.

Another important case involves repeated patterns. Consider s = "111111" with k = 2. The only substring encountered is "11" repeatedly. Without proper handling of duplicates, an implementation might incorrectly count repeated appearances as new codes. Using a set guarantees that each binary code is counted only once.

A third edge case occurs when k = 1. In this scenario, the algorithm must simply verify that both '0' and '1' appear somewhere in the string. The rolling bitmask implementation still works correctly because the mask becomes 1, preserving exactly one bit during processing.

A final subtle edge case involves large values of k, especially near the maximum value of 20. Constructing substring strings repeatedly would become expensive in both time and memory. Representing each substring as an integer through bit manipulation avoids unnecessary string allocations and keeps the algorithm efficient even at maximum constraints.