LeetCode 727 - Minimum Window Subsequence
This problem asks us to find the smallest contiguous substring of s1 such that s2 appears inside that substring as a subsequence. A subsequence does not require characters to be adjacent, but they must appear in the same relative order.
Difficulty: 🔴 Hard
Topics: String, Dynamic Programming, Sliding Window
Solution
Problem Understanding
This problem asks us to find the smallest contiguous substring of s1 such that s2 appears inside that substring as a subsequence.
A subsequence does not require characters to be adjacent, but they must appear in the same relative order. For example, "bde" is a subsequence of "bcde" because we can pick 'b', then 'd', then 'e' while preserving order.
The important detail is that the answer itself must be a contiguous substring of s1, even though the matching characters for s2 inside that substring do not need to be contiguous.
We are given two strings:
s1, the larger source strings2, the target subsequence
We must return:
- The shortest substring of
s1that containss2as a subsequence - If multiple substrings have the same minimum length, we return the one with the smallest starting index
- If no valid substring exists, we return
""
Consider the example:
s1 = "abcdebdde"
s2 = "bde"
The substring "bcde" works because:
'b'matches'd'appears later'e'appears later
The substring "bdde" also works, but both have length 4, and "bcde" appears earlier.
The constraints are important:
s1.length <= 2 * 10^4s2.length <= 100
This tells us that:
s1can be fairly larges2is relatively small
A fully quadratic or cubic solution over s1 would likely be too slow. However, algorithms involving O(len(s1) * len(s2)) are reasonable because s2 is small.
Several edge cases are important:
s2may contain characters not present ins1- Multiple minimum windows may exist
- The optimal window may appear near the end
s2may already equal an exact substrings2may require skipping many characters ins1- No valid subsequence may exist at all
A naive substring checking implementation can easily become too slow because there are O(n^2) substrings.
Approaches
Brute Force Approach
The brute force solution considers every possible substring of s1.
For each substring:
- Extract the substring
- Check whether
s2is a subsequence of it - If valid, compare its length with the current best answer
To check whether s2 is a subsequence, we use two pointers:
- One pointer scans the substring
- One pointer scans
s2
Whenever characters match, we advance the s2 pointer.
If we finish scanning s2, then the substring is valid.
This approach is correct because it explicitly checks every possible candidate window. Therefore, it cannot miss the optimal answer.
However, it is extremely inefficient.
There are O(n^2) substrings, and each subsequence check may take O(n) time. This leads to roughly O(n^3) complexity in the worst case, which is far too slow for n = 20,000.
Optimal Approach, Forward Scan + Backward Shrink
The key observation is that once we discover a valid subsequence ending at some position, we can shrink the window backward to find the smallest possible starting point for that ending index.
The algorithm works in two phases repeatedly:
- Forward scan:
- Find a complete subsequence match for
s2
- Backward scan:
- Move backward to minimize the window while preserving the subsequence
This avoids checking every substring explicitly.
The important insight is that after locating a valid ending position, we can greedily shrink the left boundary without losing validity.
Because s2 is small, each forward and backward scan is manageable.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n³) | O(1) | Checks every substring explicitly |
| Optimal | O(n * m) | O(1) | Uses forward subsequence matching and backward shrinking |
Here:
n = len(s1)m = len(s2)
Algorithm Walkthrough
Optimal Algorithm
- Initialize variables to track the best window found so far.
We store:
best_startbest_length
Initially, no valid window has been found.
2. Start scanning s1 from left to right.
Use pointer i for s1.
3. Whenever s1[i] matches the first character of s2, attempt a forward subsequence match.
We use:
s1_idxto scans1s2_idxto scans2
- Continue advancing through
s1.
Whenever:
s1[s1_idx] == s2[s2_idx]
increment s2_idx.
5. If s2_idx reaches len(s2), we found a valid subsequence.
The current s1_idx - 1 becomes the ending position of a valid window.
6. Perform a backward shrinking step.
Now we minimize the window while preserving the subsequence.
Set:
end = s1_idx - 1s2_idx = len(s2) - 1
- Move backward through
s1.
Whenever:
s1[s1_idx] == s2[s2_idx]
decrement s2_idx.
Continue until all characters of s2 are matched backward.
8. The position after the final backward move becomes the minimal starting index.
Compute:
startwindow_length
- Update the best answer if:
- The current window is shorter
- Or equal length but earlier
- Resume scanning after the start of the current window.
This avoids unnecessary repeated work.
Why it works
The forward scan guarantees that we locate a valid subsequence ending position. The backward scan then greedily removes unnecessary characters from the left side while still preserving the subsequence ordering.
Because the backward step always finds the earliest possible valid start for a fixed ending position, every discovered window is locally minimal. Comparing all such minimal windows guarantees the globally minimum answer.
Python Solution
class Solution:
def minWindow(self, s1: str, s2: str) -> str:
n = len(s1)
m = len(s2)
best_start = -1
best_length = float("inf")
i = 0
while i < n:
if s1[i] == s2[0]:
s1_idx = i
s2_idx = 0
# Forward scan to match entire s2
while s1_idx < n and s2_idx < m:
if s1[s1_idx] == s2[s2_idx]:
s2_idx += 1
s1_idx += 1
# Entire subsequence matched
if s2_idx == m:
end = s1_idx - 1
# Backward shrink
s2_idx = m - 1
s1_idx = end
while s2_idx >= 0:
if s1[s1_idx] == s2[s2_idx]:
s2_idx -= 1
s1_idx -= 1
start = s1_idx + 1
window_length = end - start + 1
if window_length < best_length:
best_length = window_length
best_start = start
# Restart from next position after start
i = start
i += 1
if best_start == -1:
return ""
return s1[best_start:best_start + best_length]
The implementation follows the algorithm directly.
The outer loop scans through s1. Whenever we encounter a character matching the beginning of s2, we attempt to build a full subsequence.
The forward scan searches for a complete match of s2. If the match succeeds, we know the current endpoint creates a valid window.
The backward shrinking phase minimizes the window. Starting from the endpoint, we walk backward through both strings to remove unnecessary characters from the left side.
Once the minimal valid start is identified, we compute the window size and update the best answer if needed.
Finally, we continue scanning for additional candidate windows.
Go Solution
func minWindow(s1 string, s2 string) string {
n := len(s1)
m := len(s2)
bestStart := -1
bestLength := n + 1
i := 0
for i < n {
if s1[i] == s2[0] {
s1Idx := i
s2Idx := 0
// Forward scan
for s1Idx < n && s2Idx < m {
if s1[s1Idx] == s2[s2Idx] {
s2Idx++
}
s1Idx++
}
// Full subsequence matched
if s2Idx == m {
end := s1Idx - 1
// Backward shrink
s2Idx = m - 1
s1Idx = end
for s2Idx >= 0 {
if s1[s1Idx] == s2[s2Idx] {
s2Idx--
}
s1Idx--
}
start := s1Idx + 1
windowLength := end - start + 1
if windowLength < bestLength {
bestLength = windowLength
bestStart = start
}
i = start
}
}
i++
}
if bestStart == -1 {
return ""
}
return s1[bestStart : bestStart+bestLength]
}
The Go implementation closely mirrors the Python version.
Strings in Go are byte indexed, which works correctly here because the problem guarantees lowercase English letters.
Unlike Python, Go does not provide infinity values for integers conveniently, so we initialize bestLength using n + 1, which is larger than any possible valid window.
Returning an empty string in Go is simply "".
Worked Examples
Example 1
s1 = "abcdebdde"
s2 = "bde"
Forward Scan
| Step | s1 Index | Character | s2 Index | Match |
|---|---|---|---|---|
| 1 | 1 | b | 0 | yes |
| 2 | 2 | c | 1 | no |
| 3 | 3 | d | 1 | yes |
| 4 | 4 | e | 2 | yes |
Now the subsequence is fully matched.
Current window:
"abcde"
Ending index:
4
Backward Shrink
Move backward to minimize the window.
| Step | s1 Index | Character | s2 Index |
|---|---|---|---|
| 1 | 4 | e | 2 |
| 2 | 3 | d | 1 |
| 3 | 2 | c | 0 |
| 4 | 1 | b | 0 |
Minimal window becomes:
"bcde"
Length:
4
Continue searching.
Later we find:
"bdde"
also length 4, but "bcde" starts earlier, so it remains the answer.
Example 2
s1 = "jmeqksfrsdcmsiwvaovztaqenprpvnbstl"
s2 = "u"
We scan all characters of s1.
Character 'u' never appears.
No valid subsequence window exists.
Return:
""
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | Each scan involves matching at most m characters |
| Space | O(1) | Only pointer variables are used |
Although the implementation contains nested loops, the important detail is that m is very small, at most 100. The forward and backward scans each process at most m meaningful matches per attempt.
No auxiliary arrays, hash maps, or dynamic programming tables are required, so the algorithm uses constant extra space.
Test Cases
sol = Solution()
assert sol.minWindow("abcdebdde", "bde") == "bcde" # provided example
assert sol.minWindow("jmeqksfrsdcmsiwvaovztaqenprpvnbstl", "u") == "" # no match
assert sol.minWindow("abc", "abc") == "abc" # exact match
assert sol.minWindow("abc", "ac") == "abc" # subsequence with skip
assert sol.minWindow("aaaaa", "aa") == "aa" # repeated characters
assert sol.minWindow("abdbca", "abc") == "abdbc" # minimal subsequence window
assert sol.minWindow("cnhczmccqouqadqtmjjzl", "mm") == "mccqouqadqtm" # repeated target chars
assert sol.minWindow("abcdef", "f") == "f" # single char at end
assert sol.minWindow("abcdef", "a") == "a" # single char at beginning
assert sol.minWindow("abcdef", "z") == "" # impossible target
assert sol.minWindow("bbbbbbbb", "bbb") == "bbb" # many equivalent windows
assert sol.minWindow("abcde", "e") == "e" # smallest possible window
assert sol.minWindow("abcde", "ace") == "abcde" # full string needed
| Test | Why |
|---|---|
"abcdebdde", "bde" |
Validates standard example |
"u" missing from source |
Validates no-answer case |
| Exact full-string match | Ensures direct equality works |
"ac" inside "abc" |
Validates skipped characters |
| Repeated characters | Tests duplicate handling |
"abdbca", "abc" |
Tests shrinking logic |
"mm" target |
Tests repeated target characters |
| Single char at end | Boundary window |
| Single char at beginning | Boundary window |
| Impossible target | Ensures empty string returned |
| Many equivalent windows | Validates leftmost tie-breaking |
| Window of length 1 | Smallest valid substring |
| Entire string required | Ensures no over-shrinking |
Edge Cases
One important edge case occurs when no valid subsequence exists. For example, if s2 = "z" and s1 contains no 'z', the forward scan never completes successfully. A buggy implementation might accidentally return an uninitialized substring or produce an index error. This implementation safely tracks whether a valid window was ever found using best_start == -1.
Another important case involves repeated characters in s2. Consider:
s1 = "cnhczmccqouqadqtmjjzl"
s2 = "mm"
The algorithm must correctly distinguish between the two required 'm' characters. Incorrect pointer handling can accidentally reuse the same character twice. The forward scan guarantees that each match advances both pointers properly, preserving subsequence ordering.
A third tricky case is tie-breaking between windows of equal length. The problem requires returning the leftmost minimum-length window. Since the algorithm only updates the answer when a strictly smaller window is found, the first minimum-length window automatically remains the chosen answer. This correctly preserves leftmost priority.
Another subtle edge case occurs when the optimal window is the entire string itself. For example:
s1 = "abcde"
s2 = "ace"
The backward shrinking phase must stop correctly without overshooting the beginning of the valid subsequence. The implementation carefully reconstructs the minimal starting index using backward matching logic.