LeetCode 567 - Permutation in String

This problem asks us to determine whether any permutation of s1 appears as a contiguous substring inside s2. A permutation means the characters are rearranged, but the frequency of each character remains the same. For example, the permutations of "ab" are "ab" and "ba".

LeetCode Problem 567

Difficulty: 🟡 Medium
Topics: Hash Table, Two Pointers, String, Sliding Window

Solution

Problem Understanding

This problem asks us to determine whether any permutation of s1 appears as a contiguous substring inside s2.

A permutation means the characters are rearranged, but the frequency of each character remains the same. For example, the permutations of "ab" are "ab" and "ba". If either of these appears somewhere inside s2, then the answer is true.

The important detail is that the substring in s2 must be contiguous. We are not looking for a subsequence where characters can be skipped.

The input consists of two lowercase English strings:

  • s1, the pattern whose permutation we want to find
  • s2, the larger string in which we search

The expected output is:

  • true if some substring of s2 has exactly the same character frequencies as s1
  • false otherwise

The constraints are important:

  • Both strings can have length up to 10^4
  • Only lowercase English letters are used

A brute-force solution that generates all permutations of s1 would be impossible for larger inputs because the number of permutations grows factorially. Even checking every substring inefficiently could become too slow.

The lowercase English letter constraint is very useful because it means there are only 26 possible characters. This allows us to use fixed-size frequency arrays instead of expensive dynamic hash maps.

Several edge cases are important:

  • If s1 is longer than s2, the answer must immediately be false
  • Repeated characters matter, for example "aab" is different from "ab"
  • A valid permutation can appear anywhere in s2
  • The matching substring must have the same length as s1

Approaches

Brute Force Approach

A straightforward approach is to examine every substring of s2 whose length equals len(s1).

For each substring, we can count character frequencies and compare them with the frequency count of s1. If the counts match, then the substring is a permutation of s1.

For example:

  • s1 = "ab"
  • s2 = "eidbaooo"

We would examine:

  • "ei"
  • "id"
  • "db"
  • "ba"
  • "ao"
  • "oo"
  • "oo"

When we reach "ba", its character frequencies match "ab", so we return true.

This approach is correct because every possible candidate substring is checked. However, recalculating frequencies for every substring is inefficient.

If:

  • n = len(s2)
  • m = len(s1)

then for each of roughly n - m + 1 substrings, we spend O(m) time counting characters. This leads to O(n * m) time complexity.

With strings up to 10^4, this can become expensive.

Optimal Approach, Sliding Window

The key observation is that all candidate substrings have the same length, namely len(s1).

Instead of recomputing character frequencies from scratch for every substring, we can maintain a sliding window of size len(s1) over s2.

As the window moves:

  • One character enters the window
  • One character leaves the window

We update the frequency counts incrementally rather than rebuilding them each time.

Since only lowercase English letters are involved, we can store frequencies in arrays of size 26. Comparing two frequency arrays takes constant time because the size is fixed.

This transforms the solution into an efficient sliding window algorithm.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m) O(1) Recomputes frequencies for every substring
Optimal O(n) O(1) Uses sliding window with fixed-size frequency arrays

Algorithm Walkthrough

  1. First, check whether s1 is longer than s2.

If it is, then no permutation can possibly fit inside s2, so we immediately return false. 2. Create two frequency arrays of size 26.

One array stores character counts for s1, and the other stores counts for the current sliding window in s2.

Since all characters are lowercase English letters, each index corresponds to a character:

  • index 0 represents 'a'
  • index 1 represents 'b'
  • and so on
  1. Populate both frequency arrays for the first window.

Count:

  • all characters in s1
  • the first len(s1) characters in s2
  1. Compare the two frequency arrays.

If they are equal, then the first window itself is a valid permutation, so return true. 5. Begin sliding the window across s2.

For each step:

  • add the new character entering the window
  • remove the old character leaving the window

This keeps the window size fixed. 6. After updating the window frequencies, compare the two arrays again.

If they match, then the current window contains a permutation of s1. 7. If the entire string is processed without finding a match, return false.

Why it works

The algorithm maintains the invariant that the sliding window always represents a substring of s2 with exactly the same length as s1.

A substring is a permutation of s1 if and only if both strings contain identical character frequencies.

Because the window frequencies are updated correctly whenever the window moves, every candidate substring is checked exactly once. Therefore, the algorithm correctly identifies whether any permutation exists.

Python Solution

class Solution:
    def checkInclusion(self, s1: str, s2: str) -> bool:
        if len(s1) > len(s2):
            return False

        target_count = [0] * 26
        window_count = [0] * 26

        window_size = len(s1)

        for i in range(window_size):
            target_count[ord(s1[i]) - ord('a')] += 1
            window_count[ord(s2[i]) - ord('a')] += 1

        if target_count == window_count:
            return True

        for right in range(window_size, len(s2)):
            add_index = ord(s2[right]) - ord('a')
            remove_index = ord(s2[right - window_size]) - ord('a')

            window_count[add_index] += 1
            window_count[remove_index] -= 1

            if target_count == window_count:
                return True

        return False

The implementation begins with an early rejection check. If s1 is longer than s2, no valid substring can exist.

Two arrays of length 26 are then created. These arrays store character frequencies for:

  • s1
  • the current sliding window in s2

The first loop initializes both arrays using the first window of s2.

After initialization, the algorithm compares the arrays. If they already match, the answer is immediately true.

The main sliding window loop then starts from index window_size and moves through s2.

For each iteration:

  • one character is added to the window
  • one character is removed from the left side

This keeps the window size constant while updating frequencies efficiently.

Finally, after each update, the algorithm checks whether the frequency arrays match. If they do, the current window is a valid permutation.

If no match is found after processing all windows, the method returns false.

Go Solution

func checkInclusion(s1 string, s2 string) bool {
    if len(s1) > len(s2) {
        return false
    }

    targetCount := make([]int, 26)
    windowCount := make([]int, 26)

    windowSize := len(s1)

    for i := 0; i < windowSize; i++ {
        targetCount[s1[i]-'a']++
        windowCount[s2[i]-'a']++
    }

    if equalCounts(targetCount, windowCount) {
        return true
    }

    for right := windowSize; right < len(s2); right++ {
        windowCount[s2[right]-'a']++
        windowCount[s2[right-windowSize]-'a']--

        if equalCounts(targetCount, windowCount) {
            return true
        }
    }

    return false
}

func equalCounts(a []int, b []int) bool {
    for i := 0; i < 26; i++ {
        if a[i] != b[i] {
            return false
        }
    }

    return true
}

The Go implementation follows the same logic as the Python version.

One notable difference is that Go does not support direct array equality checks for slices in the same convenient way Python compares lists. Because of this, a helper function equalCounts is used to compare the two frequency arrays element by element.

Character indexing is also slightly different. In Go, characters are accessed as bytes, so expressions like s1[i] - 'a' directly produce indices from 0 to 25.

No special handling for integer overflow is needed because the maximum frequency count is bounded by the string length, which is at most 10^4.

Worked Examples

Example 1

Input:

s1 = "ab"
s2 = "eidbaooo"

Target frequency array:

Character Count
a 1
b 1

Window size is 2.

Initial window is "ei".

Window Counts Match?
"ei" No

Slide the window one step at a time.

Step Window Added Removed Counts Match?
1 "id" d e No
2 "db" b i No
3 "ba" a d Yes

At window "ba", the frequency counts match "ab".

Result:

true

Example 2

Input:

s1 = "ab"
s2 = "eidboaoo"

Target frequency array:

Character Count
a 1
b 1

Window size is 2.

Step Window Counts Match?
Initial "ei" No
1 "id" No
2 "db" No
3 "bo" No
4 "oa" No
5 "ao" No
6 "oo" No

No window matches the target frequencies.

Result:

false

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is processed once while sliding the window
Space O(1) Two fixed-size arrays of length 26 are used

The time complexity is linear because the sliding window moves across s2 exactly once. Updating the frequency arrays takes constant time for each movement.

Although array comparisons examine 26 elements each time, 26 is a fixed constant, so this still counts as constant work.

The space complexity is constant because the algorithm always uses two arrays of fixed size 26, regardless of input size.

Test Cases

solution = Solution()

assert solution.checkInclusion("ab", "eidbaooo") == True   # basic positive case
assert solution.checkInclusion("ab", "eidboaoo") == False  # basic negative case

assert solution.checkInclusion("a", "a") == True           # single character exact match
assert solution.checkInclusion("a", "b") == False          # single character mismatch

assert solution.checkInclusion("abc", "bbbca") == True     # permutation appears near end
assert solution.checkInclusion("adc", "dcda") == True      # overlapping valid window

assert solution.checkInclusion("hello", "ooolleoooleh") == False  # repeated chars mismatch
assert solution.checkInclusion("aab", "eidbaaboo") == True        # repeated characters valid

assert solution.checkInclusion("xyz", "afdgzyxksldfm") == True    # permutation in middle
assert solution.checkInclusion("abcd", "abc") == False            # s1 longer than s2

assert solution.checkInclusion("aa", "aaa") == True               # repeated identical chars
assert solution.checkInclusion("abc", "ccccbbbbaaaa") == False    # no valid frequencies

assert solution.checkInclusion("abc", "cba") == True              # entire string is permutation
assert solution.checkInclusion("abc", "defghijkl") == False       # completely unrelated chars
Test Why
"ab", "eidbaooo" Standard positive example
"ab", "eidboaoo" Standard negative example
"a", "a" Smallest valid matching input
"a", "b" Smallest non-matching input
"abc", "bbbca" Valid permutation near the end
"adc", "dcda" Tests overlapping windows
"hello", "ooolleoooleh" Repeated characters with incorrect counts
"aab", "eidbaaboo" Handles duplicate characters correctly
"xyz", "afdgzyxksldfm" Permutation appears in middle
"abcd", "abc" Early rejection when s1 is longer
"aa", "aaa" Repeated identical characters
"abc", "ccccbbbbaaaa" No matching frequency window
"abc", "cba" Entire string is the permutation
"abc", "defghijkl" Completely unrelated characters

Edge Cases

One important edge case occurs when s1 is longer than s2. In this situation, no substring of s2 can possibly have the same length as s1, so the answer must be false. A naive implementation might still attempt sliding window operations and produce incorrect indexing behavior. The implementation handles this immediately with an early return.

Another important edge case involves repeated characters. For example, s1 = "aab" requires exactly two 'a' characters and one 'b'. A buggy implementation that only checks character presence rather than exact frequencies would incorrectly accept substrings like "abc". The frequency-array approach correctly tracks exact counts for every character.

A third edge case appears when the valid permutation occurs at the very beginning or very end of s2. Some implementations incorrectly initialize the first window or stop the sliding process too early. This solution explicitly checks the initial window before sliding and continues processing until the final possible window is examined.

Another subtle case is when all characters are identical, such as s1 = "aa" and s2 = "aaa". The algorithm still works correctly because the frequency arrays capture duplicate counts accurately, and the sliding window updates maintain the correct counts as characters enter and leave the window.