LeetCode 727 - Minimum Window Subsequence

This problem asks us to find the smallest contiguous substring of s1 such that s2 appears inside that substring as a subsequence. A subsequence does not require characters to be adjacent, but they must appear in the same relative order.

LeetCode Problem 727

Difficulty: 🔴 Hard
Topics: String, Dynamic Programming, Sliding Window

Solution

Problem Understanding

This problem asks us to find the smallest contiguous substring of s1 such that s2 appears inside that substring as a subsequence.

A subsequence does not require characters to be adjacent, but they must appear in the same relative order. For example, "bde" is a subsequence of "bcde" because we can pick 'b', then 'd', then 'e' while preserving order.

The important detail is that the answer itself must be a contiguous substring of s1, even though the matching characters for s2 inside that substring do not need to be contiguous.

We are given two strings:

  • s1, the larger source string
  • s2, the target subsequence

We must return:

  • The shortest substring of s1 that contains s2 as a subsequence
  • If multiple substrings have the same minimum length, we return the one with the smallest starting index
  • If no valid substring exists, we return ""

Consider the example:

s1 = "abcdebdde"
s2 = "bde"

The substring "bcde" works because:

  • 'b' matches
  • 'd' appears later
  • 'e' appears later

The substring "bdde" also works, but both have length 4, and "bcde" appears earlier.

The constraints are important:

  • s1.length <= 2 * 10^4
  • s2.length <= 100

This tells us that:

  • s1 can be fairly large
  • s2 is relatively small

A fully quadratic or cubic solution over s1 would likely be too slow. However, algorithms involving O(len(s1) * len(s2)) are reasonable because s2 is small.

Several edge cases are important:

  • s2 may contain characters not present in s1
  • Multiple minimum windows may exist
  • The optimal window may appear near the end
  • s2 may already equal an exact substring
  • s2 may require skipping many characters in s1
  • No valid subsequence may exist at all

A naive substring checking implementation can easily become too slow because there are O(n^2) substrings.

Approaches

Brute Force Approach

The brute force solution considers every possible substring of s1.

For each substring:

  1. Extract the substring
  2. Check whether s2 is a subsequence of it
  3. If valid, compare its length with the current best answer

To check whether s2 is a subsequence, we use two pointers:

  • One pointer scans the substring
  • One pointer scans s2

Whenever characters match, we advance the s2 pointer.

If we finish scanning s2, then the substring is valid.

This approach is correct because it explicitly checks every possible candidate window. Therefore, it cannot miss the optimal answer.

However, it is extremely inefficient.

There are O(n^2) substrings, and each subsequence check may take O(n) time. This leads to roughly O(n^3) complexity in the worst case, which is far too slow for n = 20,000.

Optimal Approach, Forward Scan + Backward Shrink

The key observation is that once we discover a valid subsequence ending at some position, we can shrink the window backward to find the smallest possible starting point for that ending index.

The algorithm works in two phases repeatedly:

  1. Forward scan:
  • Find a complete subsequence match for s2
  1. Backward scan:
  • Move backward to minimize the window while preserving the subsequence

This avoids checking every substring explicitly.

The important insight is that after locating a valid ending position, we can greedily shrink the left boundary without losing validity.

Because s2 is small, each forward and backward scan is manageable.

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(1) Checks every substring explicitly
Optimal O(n * m) O(1) Uses forward subsequence matching and backward shrinking

Here:

  • n = len(s1)
  • m = len(s2)

Algorithm Walkthrough

Optimal Algorithm

  1. Initialize variables to track the best window found so far.

We store:

  • best_start
  • best_length

Initially, no valid window has been found. 2. Start scanning s1 from left to right.

Use pointer i for s1. 3. Whenever s1[i] matches the first character of s2, attempt a forward subsequence match.

We use:

  • s1_idx to scan s1
  • s2_idx to scan s2
  1. Continue advancing through s1.

Whenever:

s1[s1_idx] == s2[s2_idx]

increment s2_idx. 5. If s2_idx reaches len(s2), we found a valid subsequence.

The current s1_idx - 1 becomes the ending position of a valid window. 6. Perform a backward shrinking step.

Now we minimize the window while preserving the subsequence.

Set:

  • end = s1_idx - 1
  • s2_idx = len(s2) - 1
  1. Move backward through s1.

Whenever:

s1[s1_idx] == s2[s2_idx]

decrement s2_idx.

Continue until all characters of s2 are matched backward. 8. The position after the final backward move becomes the minimal starting index.

Compute:

  • start
  • window_length
  1. Update the best answer if:
  • The current window is shorter
  • Or equal length but earlier
  1. Resume scanning after the start of the current window.

This avoids unnecessary repeated work.

Why it works

The forward scan guarantees that we locate a valid subsequence ending position. The backward scan then greedily removes unnecessary characters from the left side while still preserving the subsequence ordering.

Because the backward step always finds the earliest possible valid start for a fixed ending position, every discovered window is locally minimal. Comparing all such minimal windows guarantees the globally minimum answer.

Python Solution

class Solution:
    def minWindow(self, s1: str, s2: str) -> str:
        n = len(s1)
        m = len(s2)

        best_start = -1
        best_length = float("inf")

        i = 0

        while i < n:
            if s1[i] == s2[0]:
                s1_idx = i
                s2_idx = 0

                # Forward scan to match entire s2
                while s1_idx < n and s2_idx < m:
                    if s1[s1_idx] == s2[s2_idx]:
                        s2_idx += 1
                    s1_idx += 1

                # Entire subsequence matched
                if s2_idx == m:
                    end = s1_idx - 1

                    # Backward shrink
                    s2_idx = m - 1
                    s1_idx = end

                    while s2_idx >= 0:
                        if s1[s1_idx] == s2[s2_idx]:
                            s2_idx -= 1
                        s1_idx -= 1

                    start = s1_idx + 1
                    window_length = end - start + 1

                    if window_length < best_length:
                        best_length = window_length
                        best_start = start

                    # Restart from next position after start
                    i = start

            i += 1

        if best_start == -1:
            return ""

        return s1[best_start:best_start + best_length]

The implementation follows the algorithm directly.

The outer loop scans through s1. Whenever we encounter a character matching the beginning of s2, we attempt to build a full subsequence.

The forward scan searches for a complete match of s2. If the match succeeds, we know the current endpoint creates a valid window.

The backward shrinking phase minimizes the window. Starting from the endpoint, we walk backward through both strings to remove unnecessary characters from the left side.

Once the minimal valid start is identified, we compute the window size and update the best answer if needed.

Finally, we continue scanning for additional candidate windows.

Go Solution

func minWindow(s1 string, s2 string) string {
	n := len(s1)
	m := len(s2)

	bestStart := -1
	bestLength := n + 1

	i := 0

	for i < n {
		if s1[i] == s2[0] {
			s1Idx := i
			s2Idx := 0

			// Forward scan
			for s1Idx < n && s2Idx < m {
				if s1[s1Idx] == s2[s2Idx] {
					s2Idx++
				}
				s1Idx++
			}

			// Full subsequence matched
			if s2Idx == m {
				end := s1Idx - 1

				// Backward shrink
				s2Idx = m - 1
				s1Idx = end

				for s2Idx >= 0 {
					if s1[s1Idx] == s2[s2Idx] {
						s2Idx--
					}
					s1Idx--
				}

				start := s1Idx + 1
				windowLength := end - start + 1

				if windowLength < bestLength {
					bestLength = windowLength
					bestStart = start
				}

				i = start
			}
		}

		i++
	}

	if bestStart == -1 {
		return ""
	}

	return s1[bestStart : bestStart+bestLength]
}

The Go implementation closely mirrors the Python version.

Strings in Go are byte indexed, which works correctly here because the problem guarantees lowercase English letters.

Unlike Python, Go does not provide infinity values for integers conveniently, so we initialize bestLength using n + 1, which is larger than any possible valid window.

Returning an empty string in Go is simply "".

Worked Examples

Example 1

s1 = "abcdebdde"
s2 = "bde"

Forward Scan

Step s1 Index Character s2 Index Match
1 1 b 0 yes
2 2 c 1 no
3 3 d 1 yes
4 4 e 2 yes

Now the subsequence is fully matched.

Current window:

"abcde"

Ending index:

4

Backward Shrink

Move backward to minimize the window.

Step s1 Index Character s2 Index
1 4 e 2
2 3 d 1
3 2 c 0
4 1 b 0

Minimal window becomes:

"bcde"

Length:

4

Continue searching.

Later we find:

"bdde"

also length 4, but "bcde" starts earlier, so it remains the answer.

Example 2

s1 = "jmeqksfrsdcmsiwvaovztaqenprpvnbstl"
s2 = "u"

We scan all characters of s1.

Character 'u' never appears.

No valid subsequence window exists.

Return:

""

Complexity Analysis

Measure Complexity Explanation
Time O(n * m) Each scan involves matching at most m characters
Space O(1) Only pointer variables are used

Although the implementation contains nested loops, the important detail is that m is very small, at most 100. The forward and backward scans each process at most m meaningful matches per attempt.

No auxiliary arrays, hash maps, or dynamic programming tables are required, so the algorithm uses constant extra space.

Test Cases

sol = Solution()

assert sol.minWindow("abcdebdde", "bde") == "bcde"  # provided example
assert sol.minWindow("jmeqksfrsdcmsiwvaovztaqenprpvnbstl", "u") == ""  # no match

assert sol.minWindow("abc", "abc") == "abc"  # exact match
assert sol.minWindow("abc", "ac") == "abc"  # subsequence with skip
assert sol.minWindow("aaaaa", "aa") == "aa"  # repeated characters
assert sol.minWindow("abdbca", "abc") == "abdbc"  # minimal subsequence window
assert sol.minWindow("cnhczmccqouqadqtmjjzl", "mm") == "mccqouqadqtm"  # repeated target chars
assert sol.minWindow("abcdef", "f") == "f"  # single char at end
assert sol.minWindow("abcdef", "a") == "a"  # single char at beginning
assert sol.minWindow("abcdef", "z") == ""  # impossible target
assert sol.minWindow("bbbbbbbb", "bbb") == "bbb"  # many equivalent windows
assert sol.minWindow("abcde", "e") == "e"  # smallest possible window
assert sol.minWindow("abcde", "ace") == "abcde"  # full string needed
Test Why
"abcdebdde", "bde" Validates standard example
"u" missing from source Validates no-answer case
Exact full-string match Ensures direct equality works
"ac" inside "abc" Validates skipped characters
Repeated characters Tests duplicate handling
"abdbca", "abc" Tests shrinking logic
"mm" target Tests repeated target characters
Single char at end Boundary window
Single char at beginning Boundary window
Impossible target Ensures empty string returned
Many equivalent windows Validates leftmost tie-breaking
Window of length 1 Smallest valid substring
Entire string required Ensures no over-shrinking

Edge Cases

One important edge case occurs when no valid subsequence exists. For example, if s2 = "z" and s1 contains no 'z', the forward scan never completes successfully. A buggy implementation might accidentally return an uninitialized substring or produce an index error. This implementation safely tracks whether a valid window was ever found using best_start == -1.

Another important case involves repeated characters in s2. Consider:

s1 = "cnhczmccqouqadqtmjjzl"
s2 = "mm"

The algorithm must correctly distinguish between the two required 'm' characters. Incorrect pointer handling can accidentally reuse the same character twice. The forward scan guarantees that each match advances both pointers properly, preserving subsequence ordering.

A third tricky case is tie-breaking between windows of equal length. The problem requires returning the leftmost minimum-length window. Since the algorithm only updates the answer when a strictly smaller window is found, the first minimum-length window automatically remains the chosen answer. This correctly preserves leftmost priority.

Another subtle edge case occurs when the optimal window is the entire string itself. For example:

s1 = "abcde"
s2 = "ace"

The backward shrinking phase must stop correctly without overshooting the beginning of the valid subsequence. The implementation carefully reconstructs the minimal starting index using backward matching logic.