LeetCode 76 - Minimum Window Substring
The problem asks us to find the smallest contiguous substring inside string s that contains every character from string t, including duplicate occurrences. A substring must consist of consecutive characters. We are not allowed to reorder characters or skip positions.
Difficulty: 🔴 Hard
Topics: Hash Table, String, Sliding Window
Solution
Problem Understanding
The problem asks us to find the smallest contiguous substring inside string s that contains every character from string t, including duplicate occurrences.
A substring must consist of consecutive characters. We are not allowed to reorder characters or skip positions. The window must contain every required character with at least the same frequency as it appears in t.
For example, if:
t = "AABC"
then a valid window must contain:
- two
'A' - one
'B' - one
'C'
A window like "ABC" would not be valid because it only contains one 'A'.
The input consists of two strings:
s, the source string where we search for the windowt, the target string that defines the required characters
The expected output is the shortest substring of s that satisfies all requirements. If no such substring exists, we return an empty string.
The constraints are important:
1 <= m, n <= 10^5- both strings can be very large
This immediately tells us that algorithms with quadratic or cubic complexity will not be fast enough. We need something close to linear time.
There are several important edge cases:
tmay contain duplicate characterssmay be shorter thant- no valid window may exist
- the minimum window may be the entire string
- there may be many valid windows, but the answer is guaranteed to be unique
- uppercase and lowercase letters are different characters
A naive implementation often fails when handling duplicate counts correctly. For example:
s = "AAABBC"
t = "AABC"
A valid window must contain two 'A', not just one.
Approaches
Brute Force Approach
The brute force approach is to generate every possible substring of s and check whether that substring contains all characters from t.
For every starting index i, we can try every ending index j and compute the substring:
s[i:j+1]
For each substring, we count character frequencies and compare them with the frequencies required by t.
This works because eventually every possible substring is examined, so the minimum valid one will be found.
However, the performance is extremely poor.
There are O(m^2) substrings, and validating each substring can take O(m) time if we count frequencies repeatedly. The overall complexity can become O(m^3) in the worst case.
Even with optimizations, it is still far too slow for m = 10^5.
Optimal Sliding Window Approach
The key observation is that we do not need to repeatedly recompute information for every substring.
Instead, we can maintain a moving window using two pointers:
- a left pointer
- a right pointer
As the right pointer expands the window, we add characters into a frequency map.
Once the current window becomes valid, meaning it contains all required characters, we try to shrink it from the left side while preserving validity.
This works because:
- expanding the window increases available characters
- shrinking the window helps minimize the answer
- every character enters and leaves the window at most once
This leads to an O(m + n) solution.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(m³) | O(1) to O(k) | Checks all substrings repeatedly |
| Optimal | O(m + n) | O(k) | Sliding window with frequency counting |
Here:
m = len(s)n = len(t)kis the number of distinct characters
Algorithm Walkthrough
- Create a frequency map for string
t.
This map stores how many times each character must appear in a valid window.
Example:
t = "AABC"
becomes:
{
'A': 2,
'B': 1,
'C': 1
}
- Initialize another frequency map for the current window.
This tracks how many times characters currently appear inside the active substring. 3. Maintain two pointers:
left, the start of the windowright, the end of the window
Initially both start at 0.
4. Expand the window by moving right.
At each step:
- add
s[right]into the window map - check whether this character now satisfies one of the required frequencies
- Track how many character requirements are currently satisfied.
Suppose:
required = number of distinct characters in t
We also maintain:
formed = number of distinct characters currently satisfied
A character is considered satisfied when:
window_count[char] == target_count[char]
- Once all requirements are satisfied, the current window is valid.
This happens when:
formed == required
- Try to shrink the window from the left side.
While the window remains valid:
- update the best answer if the current window is smaller
- remove
s[left]from the window - if removing it breaks a required frequency, decrement
formed - move
leftforward
- Continue expanding the right pointer until the entire string has been processed.
- Return the best window found.
If no valid window exists, return "".
Why it works
The algorithm maintains an invariant:
- whenever
formed == required, the current window contains all required characters with correct frequencies
The left pointer only moves when the window is already valid, which guarantees that every candidate considered is valid. By shrinking greedily, we ensure the smallest valid window for each right boundary is examined.
Because every character is processed at most twice, once when entering the window and once when leaving it, the algorithm runs in linear time.
Python Solution
from collections import Counter, defaultdict
class Solution:
def minWindow(self, s: str, t: str) -> str:
if len(t) > len(s):
return ""
target_count = Counter(t)
window_count = defaultdict(int)
required = len(target_count)
formed = 0
left = 0
best_length = float("inf")
best_start = 0
for right in range(len(s)):
char = s[right]
window_count[char] += 1
if (
char in target_count
and window_count[char] == target_count[char]
):
formed += 1
while formed == required:
current_length = right - left + 1
if current_length < best_length:
best_length = current_length
best_start = left
left_char = s[left]
window_count[left_char] -= 1
if (
left_char in target_count
and window_count[left_char] < target_count[left_char]
):
formed -= 1
left += 1
if best_length == float("inf"):
return ""
return s[best_start:best_start + best_length]
The implementation begins by handling the simplest impossible case. If t is longer than s, no valid substring can exist.
Counter(t) stores the required frequencies for every character. A defaultdict(int) is used for the current window so missing keys automatically default to zero.
The variable required stores how many distinct characters must be satisfied. The variable formed tracks how many are currently satisfied inside the active window.
The right pointer expands the window one character at a time. Whenever a character frequency becomes exactly equal to its required count, we increment formed.
Once the window becomes valid, the inner while loop tries to shrink it from the left side. During shrinking:
- we update the best answer
- we decrement the left character count
- we detect whether the window becomes invalid again
The smallest valid window encountered during processing is stored using:
best_lengthbest_start
Finally, we return the corresponding substring.
Go Solution
func minWindow(s string, t string) string {
if len(t) > len(s) {
return ""
}
targetCount := make(map[byte]int)
for i := 0; i < len(t); i++ {
targetCount[t[i]]++
}
windowCount := make(map[byte]int)
required := len(targetCount)
formed := 0
left := 0
bestLength := len(s) + 1
bestStart := 0
for right := 0; right < len(s); right++ {
char := s[right]
windowCount[char]++
if count, exists := targetCount[char]; exists &&
windowCount[char] == count {
formed++
}
for formed == required {
currentLength := right - left + 1
if currentLength < bestLength {
bestLength = currentLength
bestStart = left
}
leftChar := s[left]
windowCount[leftChar]--
if count, exists := targetCount[leftChar]; exists &&
windowCount[leftChar] < count {
formed--
}
left++
}
}
if bestLength == len(s)+1 {
return ""
}
return s[bestStart : bestStart+bestLength]
}
The Go implementation follows the same logic as the Python solution, but uses map[byte]int instead of dictionaries.
Since the input only contains English letters, indexing by byte is safe and efficient.
Go does not have built in Counter or defaultdict structures, so regular maps are used instead. Missing keys default to zero automatically when accessed.
The substring result is returned using slice syntax:
s[start:end]
No integer overflow concerns exist here because all indices remain within string bounds.
Worked Examples
Example 1
s = "ADOBECODEBANC"
t = "ABC"
Target frequencies:
A:1 B:1 C:1
| Step | Right Char | Window | Formed | Action |
|---|---|---|---|---|
| 0 | A | A | 1 | A satisfied |
| 1 | D | AD | 1 | continue |
| 2 | O | ADO | 1 | continue |
| 3 | B | ADOB | 2 | B satisfied |
| 4 | E | ADOBE | 2 | continue |
| 5 | C | ADOBEC | 3 | valid window found |
| shrink | remove A | DOBEC | 2 | no longer valid |
Current best:
"ADOBEC"
Continue expanding:
| Step | Right Char | Window | Formed | Action |
|---|---|---|---|---|
| 6 | O | DOBECO | 2 | continue |
| 7 | D | DOBECOD | 2 | continue |
| 8 | E | DOBECODE | 2 | continue |
| 9 | B | DOBECODEB | 2 | continue |
| 10 | A | DOBECODEBA | 3 | valid again |
Now shrink repeatedly:
| Window | Valid |
|---|---|
| DOBECODEBA | yes |
| OBECODEBA | yes |
| BECODEBA | yes |
| ECODEBA | no |
Best remains length 6.
Continue:
| Step | Right Char | Window | Formed |
|---|---|---|---|
| 11 | N | CODEBAN | 2 |
| 12 | C | CODEBANC | 3 |
Shrink:
| Window | Valid |
|---|---|
| CODEBANC | yes |
| ODEBANC | yes |
| DEBANC | yes |
| EBANC | no |
| BANC | yes |
Best becomes:
"BANC"
Final answer:
"BANC"
Example 2
s = "a"
t = "a"
| Step | Window | Formed | Result |
|---|---|---|---|
| add a | a | 1 | valid |
Minimum window:
"a"
Example 3
s = "a"
t = "aa"
Required:
A:2
Window never reaches frequency 2.
No valid substring exists.
Return:
""
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(m + n) | Each pointer moves across the string at most once |
| Space | O(k) | Hash maps store character frequencies |
The frequency map for t requires O(k) space where k is the number of distinct characters.
The sliding window ensures each character is processed at most twice:
- once when entering the window
- once when leaving the window
This gives true linear time complexity.
Test Cases
solution = Solution()
# Provided examples
assert solution.minWindow("ADOBECODEBANC", "ABC") == "BANC" # standard example
assert solution.minWindow("a", "a") == "a" # exact match
assert solution.minWindow("a", "aa") == "" # impossible case
# Single character cases
assert solution.minWindow("b", "b") == "b" # single valid character
assert solution.minWindow("b", "a") == "" # single invalid character
# Duplicate character requirements
assert solution.minWindow("AAABBC", "AABC") == "AABBC" # duplicates required
assert solution.minWindow("aaflslflsldkalskaaa", "aaa") == "aaa" # repeated duplicates
# Entire string is answer
assert solution.minWindow("ABC", "ABC") == "ABC" # whole string needed
# Window at beginning
assert solution.minWindow("ABCD", "AB") == "AB" # answer at start
# Window at end
assert solution.minWindow("XYZABC", "ABC") == "ABC" # answer at end
# Multiple valid windows
assert solution.minWindow("abdabca", "abc") == "abc" # smallest chosen
# No possible window
assert solution.minWindow("abcdef", "xyz") == "" # disjoint characters
# Case sensitivity
assert solution.minWindow("aA", "Aa") == "aA" # uppercase/lowercase distinct
# Large repeated patterns
assert solution.minWindow("aaaaaaaaabbbbbcdd", "abcdd") == "abbbbbcdd" # frequency tracking
print("All test cases passed.")
| Test | Why |
|---|---|
"ADOBECODEBANC", "ABC" |
Standard sliding window scenario |
"a", "a" |
Smallest valid input |
"a", "aa" |
Impossible frequency requirement |
"b", "b" |
Single character success |
"b", "a" |
Single character failure |
"AAABBC", "AABC" |
Duplicate character handling |
"aaflslflsldkalskaaa", "aaa" |
Repeated required characters |
"ABC", "ABC" |
Entire string is minimum window |
"ABCD", "AB" |
Minimum window at start |
"XYZABC", "ABC" |
Minimum window at end |
"abdabca", "abc" |
Multiple candidate windows |
"abcdef", "xyz" |
No overlapping characters |
"aA", "Aa" |
Case sensitivity correctness |
"aaaaaaaaabbbbbcdd", "abcdd" |
Frequency counting stress case |
Edge Cases
One important edge case occurs when t contains duplicate characters. Many incorrect solutions only check whether characters exist, instead of checking their frequencies. For example:
s = "ABC"
t = "AABC"
Even though all distinct letters appear, the window is invalid because two 'A' characters are required. The implementation correctly handles this by comparing exact frequency counts using hash maps.
Another important edge case occurs when no valid window exists. This can happen when s is shorter than t, or when required characters are missing entirely. The algorithm handles this naturally because formed never reaches required, so no answer is recorded. In that case, the function returns an empty string.
Case sensitivity is also important. Uppercase and lowercase English letters are treated as different characters. For example:
s = "aA"
t = "Aa"
The algorithm correctly distinguishes between 'a' and 'A' because hash maps use the exact character as the key.
A final subtle edge case occurs when shrinking the window. If we remove a character that was exactly satisfying a requirement, the window immediately becomes invalid. The implementation carefully detects this condition:
window_count[left_char] < target_count[left_char]
This ensures the algorithm never incorrectly treats an invalid window as valid.