LeetCode 44 - Wildcard Matching

This problem asks us to determine whether an entire input string s matches a wildcard pattern p. The pattern supports two special wildcard characters: - '?' matches exactly one character. - '' matches any sequence of characters, including an empty sequence.

LeetCode Problem 44

Difficulty: 🔴 Hard
Topics: String, Dynamic Programming, Greedy, Recursion

Solution

Problem Understanding

This problem asks us to determine whether an entire input string s matches a wildcard pattern p.

The pattern supports two special wildcard characters:

  • '?' matches exactly one character.
  • '*' matches any sequence of characters, including an empty sequence.

The important detail is that the match must cover the entire string. Partial matches are not allowed. Every character in s must be consumed by the pattern, and every character in the pattern must also be accounted for.

For example:

  • s = "aa" and p = "a" does not match because the pattern only covers one character while the string contains two.
  • s = "aa" and p = "*" matches because '*' can represent the entire string.
  • s = "cb" and p = "?a" does not match because '?' matches 'c', but 'a' does not match 'b'.

The constraints are important:

  • 0 <= s.length, p.length <= 2000

A brute-force recursive solution would explode exponentially because every '*' can represent many different substring lengths. Since both strings can be as large as 2000 characters, we need something much more efficient.

Several edge cases are especially important:

  • Empty string and empty pattern.
  • Pattern containing only '*'.
  • Multiple consecutive '*' characters.
  • Patterns longer than the string.
  • '*' matching zero characters.
  • '*' matching many characters.

A naive implementation often fails when backtracking is required, especially with combinations of normal characters, '?', and '*'.

Approaches

Brute Force Approach

The brute-force solution uses recursion to try every possible interpretation of '*'.

At each position:

  • If characters match directly, or the pattern character is '?', move both pointers forward.

  • If the pattern character is '*', try all possibilities:

  • Match zero characters.

  • Match one character.

  • Match two characters.

  • Continue until the end of the string.

This approach is correct because it explores every valid interpretation of the wildcard pattern. If any recursive path successfully consumes both the string and the pattern, the answer is true.

The problem is that '*' creates branching recursion. A pattern with many stars causes an enormous number of repeated subproblems. The complexity becomes exponential, which is far too slow for strings of length 2000.

Optimal Greedy Approach

The key insight is that we do not actually need to try every possible expansion of '*' immediately.

Instead, we can process the string greedily while remembering:

  • The most recent '*' position in the pattern.
  • The position in the string where that '*' started matching.

When characters match normally, both pointers advance.

When we encounter '*', we record its position and move forward in the pattern. Initially, we let the '*' match an empty sequence.

If we later hit a mismatch, we backtrack to the most recent '*' and extend its match by one more character.

This works because '*' is the only wildcard capable of absorbing arbitrary lengths. By greedily expanding it only when necessary, we avoid exponential branching.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^(n+m)) O(n+m) Tries every possible expansion of '*'
Optimal O(n + m) O(1) Greedy matching with controlled backtracking

Algorithm Walkthrough

Step 1: Initialize pointers

We maintain two indices:

  • s_index for the input string.
  • p_index for the pattern.

We also maintain:

  • star_index, the position of the most recent '*' in the pattern.
  • match_index, the position in the string corresponding to that '*'.

Initially:

  • star_index = -1
  • match_index = -1

A value of -1 means we have not seen a '*' yet.

Step 2: Process characters while the string remains

As long as s_index < len(s):

We compare the current pattern character with the current string character.

Step 3: Handle direct match or '?'

If:

  • p[p_index] == s[s_index]
  • or p[p_index] == '?'

then the current characters are compatible.

Advance both pointers because one pattern character consumes one string character.

Step 4: Handle '*'

If the current pattern character is '*':

  1. Save the star position:
  • star_index = p_index
  1. Save the current string position:
  • match_index = s_index
  1. Move the pattern pointer forward.

At this moment, we assume the '*' matches zero characters.

Step 5: Handle mismatch with previous '*'

If characters do not match, but we previously encountered a '*':

  1. Return to the character after the star:
  • p_index = star_index + 1
  1. Extend the star match by one character:
  • match_index += 1
  1. Restart matching from:
  • s_index = match_index

This effectively means:

"The previous '*' should absorb one more character."

Step 6: Handle impossible mismatch

If there is a mismatch and no earlier '*' exists, the match is impossible.

Return False.

Step 7: Consume remaining stars

After the string is fully processed, the remaining pattern characters must all be '*'.

Skip them.

Step 8: Final validation

If the pattern is also fully consumed, return True.

Otherwise return False.

Why it works

The algorithm maintains the invariant that all characters before s_index and p_index are already matched correctly.

Whenever a mismatch occurs, the only flexible construct capable of repairing the mismatch is the most recent '*'. By incrementally expanding that star's coverage, we systematically explore all necessary possibilities without rechecking earlier successful matches.

Since each pointer only moves forward overall, the algorithm remains linear.

Python Solution

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        s_index = 0
        p_index = 0

        star_index = -1
        match_index = -1

        while s_index < len(s):
            # Direct match or '?'
            if (
                p_index < len(p)
                and (p[p_index] == s[s_index] or p[p_index] == '?')
            ):
                s_index += 1
                p_index += 1

            # Found '*'
            elif p_index < len(p) and p[p_index] == '*':
                star_index = p_index
                match_index = s_index
                p_index += 1

            # Previous '*' can absorb one more character
            elif star_index != -1:
                p_index = star_index + 1
                match_index += 1
                s_index = match_index

            # No match possible
            else:
                return False

        # Remaining pattern characters must all be '*'
        while p_index < len(p) and p[p_index] == '*':
            p_index += 1

        return p_index == len(p)

The implementation follows the greedy strategy exactly.

The variables s_index and p_index track our current positions in the string and pattern.

When characters match directly, or the pattern contains '?', both pointers advance because one pattern character consumes one string character.

When a '*' appears, the code stores two pieces of information:

  • star_index, where the star occurred.
  • match_index, where that star began matching in the string.

Initially, the star is treated as matching zero characters.

If a later mismatch occurs, the algorithm revisits the most recent star and increases its coverage by one character. This avoids recursive branching while still exploring all valid possibilities.

Finally, after the string is exhausted, any remaining pattern characters must all be '*' because only stars can represent an empty suffix.

Go Solution

func isMatch(s string, p string) bool {
	sIndex := 0
	pIndex := 0

	starIndex := -1
	matchIndex := -1

	for sIndex < len(s) {

		// Direct match or '?'
		if pIndex < len(p) &&
			(p[pIndex] == s[sIndex] || p[pIndex] == '?') {

			sIndex++
			pIndex++

		// Found '*'
		} else if pIndex < len(p) && p[pIndex] == '*' {

			starIndex = pIndex
			matchIndex = sIndex
			pIndex++

		// Previous '*' absorbs one more character
		} else if starIndex != -1 {

			pIndex = starIndex + 1
			matchIndex++
			sIndex = matchIndex

		// No possible match
		} else {
			return false
		}
	}

	// Remaining pattern characters must be '*'
	for pIndex < len(p) && p[pIndex] == '*' {
		pIndex++
	}

	return pIndex == len(p)
}

The Go implementation mirrors the Python logic very closely.

Go strings are byte indexed, which is safe here because the problem guarantees lowercase English letters and wildcard symbols only. No Unicode handling is necessary.

The implementation uses integer indices directly rather than slices or recursion, which keeps memory usage constant.

Worked Examples

Example 1

s = "aa"
p = "a"
Step s_index p_index s char p char Action
1 0 0 a a Match, advance both
2 1 1 a end Mismatch, no *

Result: False

The pattern finishes before the string does.

Example 2

s = "aa"
p = "*"
Step s_index p_index Action
1 0 0 Found *, store positions
2 0 1 Pattern ended, backtrack to *
3 1 1 * absorbs one character
4 2 1 * absorbs second character

Remaining pattern is empty.

Result: True

Example 3

s = "cb"
p = "?a"
Step s_index p_index s char p char Action
1 0 0 c ? Match
2 1 1 b a Mismatch

No previous '*' exists.

Result: False

Additional Example

s = "adceb"
p = "*a*b"
Step s_index p_index Action
1 0 0 Store first *
2 0 1 Match a
3 1 2 Store second *
4 1 3 Mismatch with b, expand *
5 2 3 Mismatch, expand *
6 3 3 Mismatch, expand *
7 4 3 Match b
8 5 4 Both consumed

Result: True

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Each pointer moves forward at most linearly
Space O(1) Only a few integer variables are used

The greedy algorithm avoids recursive branching entirely.

Although backtracking occurs conceptually, the string pointer never resets arbitrarily. Each character is revisited only in relation to the most recent '*', which guarantees linear total work.

The implementation uses constant auxiliary space because no DP table or recursion stack is needed.

Test Cases

solution = Solution()

# Provided examples
assert solution.isMatch("aa", "a") is False  # Pattern too short
assert solution.isMatch("aa", "*") is True  # Star matches everything
assert solution.isMatch("cb", "?a") is False  # Final character mismatch

# Empty string cases
assert solution.isMatch("", "") is True  # Both empty
assert solution.isMatch("", "*") is True  # Star matches empty
assert solution.isMatch("", "?") is False  # '?' requires one character

# Exact matching
assert solution.isMatch("abc", "abc") is True  # Perfect match
assert solution.isMatch("abc", "abd") is False  # One character differs

# Question mark handling
assert solution.isMatch("abc", "a?c") is True  # '?' matches one character
assert solution.isMatch("abc", "???" ) is True  # All wildcard single matches
assert solution.isMatch("abc", "??") is False  # Pattern too short

# Star handling
assert solution.isMatch("abc", "*") is True  # Full wildcard
assert solution.isMatch("abc", "a*") is True  # Star suffix
assert solution.isMatch("abc", "*c") is True  # Star prefix
assert solution.isMatch("abc", "a*c") is True  # Star middle
assert solution.isMatch("abc", "a*d") is False  # Incorrect ending

# Multiple stars
assert solution.isMatch("abcdef", "**a**f**") is True  # Consecutive stars
assert solution.isMatch("abcdef", "*b*d*f") is True  # Multiple expansions

# Complex backtracking cases
assert solution.isMatch("adceb", "*a*b") is True  # Classic example
assert solution.isMatch("acdcb", "a*c?b") is False  # Requires careful backtracking

# Boundary values
assert solution.isMatch("a" * 2000, "*") is True  # Maximum string length
assert solution.isMatch("a" * 2000, "a" * 2000) is True  # Large exact match
assert solution.isMatch("a" * 1999 + "b", "a" * 2000) is False  # Large mismatch
Test Why
("", "") Validates empty input handling
("", "*") Ensures '*' can match empty
("", "?") Confirms '?' requires one character
("abc", "a?c") Tests single-character wildcard
("abc", "a*c") Tests middle star expansion
("abcdef", "**a**f**") Validates consecutive stars
("adceb", "*a*b") Tests greedy backtracking
("acdcb", "a*c?b") Complex mismatch scenario
Large 2000-character cases Verifies scalability

Edge Cases

One important edge case is an empty string with a non-empty pattern. Many implementations incorrectly assume that any wildcard pattern can match an empty string. In reality, only patterns consisting entirely of '*' can do so. The implementation handles this correctly by skipping remaining stars at the end and verifying that no non-star characters remain.

Another tricky case involves multiple consecutive '*' characters such as "**a***b*". Naive recursive solutions may repeatedly explore equivalent states. The greedy implementation naturally handles this because every '*' is processed identically, and redundant stars do not create extra complexity.

A third important edge case is when a mismatch occurs after a previous successful star expansion. For example, "adceb" with "*a*b" requires the algorithm to revisit the earlier star and extend its coverage several times. Many incorrect solutions fail because they greedily consume too many characters too early and cannot recover. This implementation explicitly stores the most recent star position and incrementally expands it only when needed, guaranteeing correctness.

A final subtle edge case occurs when the pattern finishes before the string. For example, "abc" and "ab". Some implementations mistakenly return true after partial matching. This algorithm avoids that bug because the main loop continues until the entire string is consumed, and any unmatched remaining characters force a failure unless a previous '*' can absorb them.