LeetCode 76 - Minimum Window Substring

Difficulty: 🔴 Hard
Topics: Hash Table, String, Sliding Window

Solution

Problem Understanding

The problem asks us to find the smallest contiguous substring inside string s that contains every character from string t, including duplicate occurrences.

A substring must consist of consecutive characters. We are not allowed to reorder characters or skip positions. The window must contain every required character with at least the same frequency as it appears in t.

For example, if:

t = "AABC"

then a valid window must contain:

two 'A'
one 'B'
one 'C'

A window like "ABC" would not be valid because it only contains one 'A'.

The input consists of two strings:

s, the source string where we search for the window
t, the target string that defines the required characters

The expected output is the shortest substring of s that satisfies all requirements. If no such substring exists, we return an empty string.

The constraints are important:

1 <= m, n <= 10^5
both strings can be very large

This immediately tells us that algorithms with quadratic or cubic complexity will not be fast enough. We need something close to linear time.

There are several important edge cases:

t may contain duplicate characters
s may be shorter than t
no valid window may exist
the minimum window may be the entire string
there may be many valid windows, but the answer is guaranteed to be unique
uppercase and lowercase letters are different characters

A naive implementation often fails when handling duplicate counts correctly. For example:

s = "AAABBC"
t = "AABC"

A valid window must contain two 'A', not just one.

Approaches

Brute Force Approach

The brute force approach is to generate every possible substring of s and check whether that substring contains all characters from t.

For every starting index i, we can try every ending index j and compute the substring:

s[i:j+1]

For each substring, we count character frequencies and compare them with the frequencies required by t.

This works because eventually every possible substring is examined, so the minimum valid one will be found.

However, the performance is extremely poor.

There are O(m^2) substrings, and validating each substring can take O(m) time if we count frequencies repeatedly. The overall complexity can become O(m^3) in the worst case.

Even with optimizations, it is still far too slow for m = 10^5.

Optimal Sliding Window Approach

The key observation is that we do not need to repeatedly recompute information for every substring.

Instead, we can maintain a moving window using two pointers:

a left pointer
a right pointer

As the right pointer expands the window, we add characters into a frequency map.

Once the current window becomes valid, meaning it contains all required characters, we try to shrink it from the left side while preserving validity.

This works because:

expanding the window increases available characters
shrinking the window helps minimize the answer
every character enters and leaves the window at most once

This leads to an O(m + n) solution.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(m³)	O(1) to O(k)	Checks all substrings repeatedly
Optimal	O(m + n)	O(k)	Sliding window with frequency counting

Here:

m = len(s)
n = len(t)
k is the number of distinct characters

Algorithm Walkthrough

Create a frequency map for string t.

This map stores how many times each character must appear in a valid window.

Example:

t = "AABC"

becomes:

{
    'A': 2,
    'B': 1,
    'C': 1
}

Initialize another frequency map for the current window.

This tracks how many times characters currently appear inside the active substring. 3. Maintain two pointers:

left, the start of the window
right, the end of the window

Initially both start at 0. 4. Expand the window by moving right.

At each step:

add s[right] into the window map
check whether this character now satisfies one of the required frequencies

Track how many character requirements are currently satisfied.

Suppose:

required = number of distinct characters in t

We also maintain:

formed = number of distinct characters currently satisfied

A character is considered satisfied when:

window_count[char] == target_count[char]

Once all requirements are satisfied, the current window is valid.

This happens when:

formed == required

Try to shrink the window from the left side.

While the window remains valid:

update the best answer if the current window is smaller
remove s[left] from the window
if removing it breaks a required frequency, decrement formed
move left forward

Continue expanding the right pointer until the entire string has been processed.
Return the best window found.

If no valid window exists, return "".

Why it works

The algorithm maintains an invariant:

whenever formed == required, the current window contains all required characters with correct frequencies

The left pointer only moves when the window is already valid, which guarantees that every candidate considered is valid. By shrinking greedily, we ensure the smallest valid window for each right boundary is examined.

Because every character is processed at most twice, once when entering the window and once when leaving it, the algorithm runs in linear time.

Python Solution

from collections import Counter, defaultdict

class Solution:
    def minWindow(self, s: str, t: str) -> str:
        if len(t) > len(s):
            return ""

        target_count = Counter(t)
        window_count = defaultdict(int)

        required = len(target_count)
        formed = 0

        left = 0

        best_length = float("inf")
        best_start = 0

        for right in range(len(s)):
            char = s[right]
            window_count[char] += 1

            if (
                char in target_count
                and window_count[char] == target_count[char]
            ):
                formed += 1

            while formed == required:
                current_length = right - left + 1

                if current_length < best_length:
                    best_length = current_length
                    best_start = left

                left_char = s[left]
                window_count[left_char] -= 1

                if (
                    left_char in target_count
                    and window_count[left_char] < target_count[left_char]
                ):
                    formed -= 1

                left += 1

        if best_length == float("inf"):
            return ""

        return s[best_start:best_start + best_length]

The implementation begins by handling the simplest impossible case. If t is longer than s, no valid substring can exist.

Counter(t) stores the required frequencies for every character. A defaultdict(int) is used for the current window so missing keys automatically default to zero.

The variable required stores how many distinct characters must be satisfied. The variable formed tracks how many are currently satisfied inside the active window.

The right pointer expands the window one character at a time. Whenever a character frequency becomes exactly equal to its required count, we increment formed.

Once the window becomes valid, the inner while loop tries to shrink it from the left side. During shrinking:

we update the best answer
we decrement the left character count
we detect whether the window becomes invalid again

The smallest valid window encountered during processing is stored using:

best_length
best_start

Finally, we return the corresponding substring.

Go Solution

func minWindow(s string, t string) string {
	if len(t) > len(s) {
		return ""
	}

	targetCount := make(map[byte]int)
	for i := 0; i < len(t); i++ {
		targetCount[t[i]]++
	}

	windowCount := make(map[byte]int)

	required := len(targetCount)
	formed := 0

	left := 0

	bestLength := len(s) + 1
	bestStart := 0

	for right := 0; right < len(s); right++ {
		char := s[right]
		windowCount[char]++

		if count, exists := targetCount[char]; exists &&
			windowCount[char] == count {
			formed++
		}

		for formed == required {
			currentLength := right - left + 1

			if currentLength < bestLength {
				bestLength = currentLength
				bestStart = left
			}

			leftChar := s[left]
			windowCount[leftChar]--

			if count, exists := targetCount[leftChar]; exists &&
				windowCount[leftChar] < count {
				formed--
			}

			left++
		}
	}

	if bestLength == len(s)+1 {
		return ""
	}

	return s[bestStart : bestStart+bestLength]
}

The Go implementation follows the same logic as the Python solution, but uses map[byte]int instead of dictionaries.

Since the input only contains English letters, indexing by byte is safe and efficient.

Go does not have built in Counter or defaultdict structures, so regular maps are used instead. Missing keys default to zero automatically when accessed.

The substring result is returned using slice syntax:

s[start:end]

No integer overflow concerns exist here because all indices remain within string bounds.

Worked Examples

Example 1

s = "ADOBECODEBANC"
t = "ABC"

Target frequencies:

A:1 B:1 C:1

Step	Right Char	Window	Formed	Action
0	A	A	1	A satisfied
1	D	AD	1	continue
2	O	ADO	1	continue
3	B	ADOB	2	B satisfied
4	E	ADOBE	2	continue
5	C	ADOBEC	3	valid window found
shrink	remove A	DOBEC	2	no longer valid

Current best:

"ADOBEC"

Continue expanding:

Step	Right Char	Window	Formed	Action
6	O	DOBECO	2	continue
7	D	DOBECOD	2	continue
8	E	DOBECODE	2	continue
9	B	DOBECODEB	2	continue
10	A	DOBECODEBA	3	valid again

Now shrink repeatedly:

Window	Valid
DOBECODEBA	yes
OBECODEBA	yes
BECODEBA	yes
ECODEBA	no

Best remains length 6.

Continue:

Step	Right Char	Window	Formed
11	N	CODEBAN	2
12	C	CODEBANC	3

Shrink:

Window	Valid
CODEBANC	yes
ODEBANC	yes
DEBANC	yes
EBANC	no
BANC	yes

Best becomes:

"BANC"

Final answer:

"BANC"

Example 2

s = "a"
t = "a"

Step	Window	Formed	Result
add a	a	1	valid

Minimum window:

"a"

Example 3

s = "a"
t = "aa"

Required:

A:2

Window never reaches frequency 2.

No valid substring exists.

Return:

""

Complexity Analysis

Measure	Complexity	Explanation
Time	O(m + n)	Each pointer moves across the string at most once
Space	O(k)	Hash maps store character frequencies

The frequency map for t requires O(k) space where k is the number of distinct characters.

The sliding window ensures each character is processed at most twice:

once when entering the window
once when leaving the window

This gives true linear time complexity.

Test Cases

solution = Solution()

# Provided examples
assert solution.minWindow("ADOBECODEBANC", "ABC") == "BANC"  # standard example
assert solution.minWindow("a", "a") == "a"  # exact match
assert solution.minWindow("a", "aa") == ""  # impossible case

# Single character cases
assert solution.minWindow("b", "b") == "b"  # single valid character
assert solution.minWindow("b", "a") == ""  # single invalid character

# Duplicate character requirements
assert solution.minWindow("AAABBC", "AABC") == "AABBC"  # duplicates required
assert solution.minWindow("aaflslflsldkalskaaa", "aaa") == "aaa"  # repeated duplicates

# Entire string is answer
assert solution.minWindow("ABC", "ABC") == "ABC"  # whole string needed

# Window at beginning
assert solution.minWindow("ABCD", "AB") == "AB"  # answer at start

# Window at end
assert solution.minWindow("XYZABC", "ABC") == "ABC"  # answer at end

# Multiple valid windows
assert solution.minWindow("abdabca", "abc") == "abc"  # smallest chosen

# No possible window
assert solution.minWindow("abcdef", "xyz") == ""  # disjoint characters

# Case sensitivity
assert solution.minWindow("aA", "Aa") == "aA"  # uppercase/lowercase distinct

# Large repeated patterns
assert solution.minWindow("aaaaaaaaabbbbbcdd", "abcdd") == "abbbbbcdd"  # frequency tracking

print("All test cases passed.")

Test	Why
`"ADOBECODEBANC", "ABC"`	Standard sliding window scenario
`"a", "a"`	Smallest valid input
`"a", "aa"`	Impossible frequency requirement
`"b", "b"`	Single character success
`"b", "a"`	Single character failure
`"AAABBC", "AABC"`	Duplicate character handling
`"aaflslflsldkalskaaa", "aaa"`	Repeated required characters
`"ABC", "ABC"`	Entire string is minimum window
`"ABCD", "AB"`	Minimum window at start
`"XYZABC", "ABC"`	Minimum window at end
`"abdabca", "abc"`	Multiple candidate windows
`"abcdef", "xyz"`	No overlapping characters
`"aA", "Aa"`	Case sensitivity correctness
`"aaaaaaaaabbbbbcdd", "abcdd"`	Frequency counting stress case

Edge Cases

One important edge case occurs when t contains duplicate characters. Many incorrect solutions only check whether characters exist, instead of checking their frequencies. For example:

s = "ABC"
t = "AABC"

Even though all distinct letters appear, the window is invalid because two 'A' characters are required. The implementation correctly handles this by comparing exact frequency counts using hash maps.

Another important edge case occurs when no valid window exists. This can happen when s is shorter than t, or when required characters are missing entirely. The algorithm handles this naturally because formed never reaches required, so no answer is recorded. In that case, the function returns an empty string.

Case sensitivity is also important. Uppercase and lowercase English letters are treated as different characters. For example:

s = "aA"
t = "Aa"

The algorithm correctly distinguishes between 'a' and 'A' because hash maps use the exact character as the key.

A final subtle edge case occurs when shrinking the window. If we remove a character that was exactly satisfying a requirement, the window immediately becomes invalid. The implementation carefully detects this condition:

window_count[left_char] < target_count[left_char]

This ensures the algorithm never incorrectly treats an invalid window as valid.