LeetCode 1397 - Find All Good Strings

The problem asks us to count how many strings of length n satisfy three conditions simultaneously: 1. The string must be lexicographically greater than or equal to s1 2. The string must be lexicographically less than or equal to s2 3.

LeetCode Problem 1397

Difficulty: 🔴 Hard
Topics: String, Dynamic Programming, String Matching

Solution

Problem Understanding

The problem asks us to count how many strings of length n satisfy three conditions simultaneously:

  1. The string must be lexicographically greater than or equal to s1
  2. The string must be lexicographically less than or equal to s2
  3. The string must not contain evil as a substring anywhere

The result can become extremely large, so we return it modulo 10^9 + 7.

Lexicographical comparison works the same way dictionary ordering works. For example:

  • "abc" < "abd"
  • "baa" > "azz"

The input consists of:

  • n, the required length of every generated string
  • s1, the lower lexicographical bound
  • s2, the upper lexicographical bound
  • evil, a forbidden substring

We must count every valid string between s1 and s2 inclusive that never contains evil.

The constraints are the most important clue for choosing the correct algorithm:

  • n can be as large as 500
  • evil.length can be up to 50

A brute-force enumeration of all possible strings is impossible because there are 26^500 possible strings in the worst case. Even generating all strings between s1 and s2 is astronomically large.

The problem combines several advanced ideas:

  • Digit DP style boundary constraints
  • String matching
  • Dynamic programming with memoization
  • KMP prefix function for efficient substring tracking

Several edge cases are important:

  • evil may appear immediately at the beginning
  • evil may overlap with itself, requiring careful prefix handling
  • s1 and s2 may be extremely close
  • Every possible string in the range may be invalid
  • evil may have length 1, making filtering especially restrictive

The problem guarantees:

  • s1 <= s2
  • All strings contain lowercase English letters only
  • Both bounds already have length n

This allows us to focus entirely on constrained string generation.

Approaches

Brute Force Approach

The brute-force solution would generate every string between s1 and s2, then check whether each string contains evil.

To check containment, we could use normal substring search. Since every generated string has length n, checking one string costs O(n).

However, the number of candidate strings is enormous. In the worst case there are 26^n possibilities.

For n = 500, brute force is completely infeasible.

Even if we optimized substring checking with KMP, the generation itself remains impossible.

Key Insight

The key observation is that we never need to construct all strings explicitly.

Instead, we can build strings character by character using dynamic programming.

At every position, the future possibilities depend only on:

  • The current index
  • Whether we are still tied to the lower bound
  • Whether we are still tied to the upper bound
  • How much of evil has already been matched

The last component is critical.

Suppose evil = "abab".

If the current suffix of our generated string is "aba", then adding 'b' would complete the forbidden substring. We therefore need efficient tracking of partial matches.

This is exactly what the KMP prefix function provides.

We combine:

  • Digit DP for lexicographical constraints
  • KMP automaton for forbidden substring tracking

This creates a compact state space that is efficient enough for the constraints.

Approach Time Complexity Space Complexity Notes
Brute Force O(26^n × n) O(n) Generates all candidate strings and checks each for evil
Optimal O(n × m × 26 × 4) O(n × m × 4) DP with KMP automaton and lexicographical bounds

Here:

  • n = len(s1)
  • m = len(evil)

Algorithm Walkthrough

Step 1: Build the KMP Prefix Table

We first compute the KMP longest prefix suffix array for evil.

For every position i, lps[i] stores the length of the longest proper prefix of evil[:i+1] that is also a suffix.

This allows us to efficiently transition between partial matches when adding new characters.

For example, if:

evil = "abab"

Then:

lps = [0, 0, 1, 2]

This table prevents us from restarting matching from scratch every time a mismatch occurs.

Step 2: Define DP State

We define a memoized DFS:

dfs(position, matched, tightLow, tightHigh)

Where:

  • position is the current index in the string
  • matched is how many characters of evil currently match
  • tightLow indicates whether we still match the prefix of s1
  • tightHigh indicates whether we still match the prefix of s2

This state fully determines all future choices.

Step 3: Handle Forbidden State

If:

matched == len(evil)

then we already formed evil.

This path is invalid and contributes 0.

Step 4: Handle Completed String

If:

position == n

then we successfully built a valid string without matching evil.

Return 1.

Step 5: Determine Allowed Character Range

At each position, the valid character range depends on boundary constraints.

If tightLow is true, the minimum allowed character is:

s1[position]

Otherwise it is 'a'.

Similarly:

  • if tightHigh is true, maximum is s2[position]
  • otherwise maximum is 'z'

Step 6: Try Every Possible Character

For each candidate character:

  1. Update KMP matching state
  2. Update lower-bound tightness
  3. Update upper-bound tightness
  4. Recurse

The KMP transition works like this:

  • While mismatch exists, fall back using lps
  • If current character matches, increase matched length

This efficiently maintains substring matching information.

Step 7: Memoize Results

Many states repeat.

Memoization ensures each state is computed once.

Without memoization, recursion would still explode exponentially.

Why it works

The algorithm works because the DP state captures every piece of information needed to determine future valid strings.

The KMP state guarantees we always know whether adding future characters can complete evil. The tight boundary flags guarantee every generated string remains within [s1, s2].

Since every recursive call explores all valid next characters exactly once, and invalid states are discarded immediately, the algorithm counts precisely all good strings.

Python Solution

from functools import lru_cache
from typing import List

class Solution:
    def findGoodStrings(self, n: int, s1: str, s2: str, evil: str) -> int:
        MOD = 10**9 + 7
        m = len(evil)

        # Build KMP LPS array
        lps = [0] * m
        length = 0

        for i in range(1, m):
            while length > 0 and evil[i] != evil[length]:
                length = lps[length - 1]

            if evil[i] == evil[length]:
                length += 1
                lps[i] = length

        @lru_cache(None)
        def dfs(pos: int, matched: int, tight_low: bool, tight_high: bool) -> int:
            # Found forbidden substring
            if matched == m:
                return 0

            # Successfully built valid string
            if pos == n:
                return 1

            low_char = s1[pos] if tight_low else 'a'
            high_char = s2[pos] if tight_high else 'z'

            total = 0

            for code in range(ord(low_char), ord(high_char) + 1):
                ch = chr(code)

                # KMP transition
                next_match = matched

                while next_match > 0 and evil[next_match] != ch:
                    next_match = lps[next_match - 1]

                if evil[next_match] == ch:
                    next_match += 1

                next_tight_low = tight_low and (ch == low_char)
                next_tight_high = tight_high and (ch == high_char)

                total += dfs(
                    pos + 1,
                    next_match,
                    next_tight_low,
                    next_tight_high
                )

                total %= MOD

            return total

        return dfs(0, 0, True, True)

The implementation starts by constructing the KMP prefix table for evil. This enables efficient substring tracking during DP transitions.

The recursive function dfs represents the main dynamic programming logic.

The parameters capture:

  • current index
  • current partial match length
  • lower bound status
  • upper bound status

At each position, the code computes the allowed character range according to the boundary constraints.

For every candidate character, the KMP automaton updates the matched prefix length. If a full match of evil is reached, that recursive path immediately becomes invalid.

Memoization via lru_cache ensures repeated states are computed only once.

Modulo arithmetic is applied after every addition to prevent overflow.

Go Solution

package main

func findGoodStrings(n int, s1 string, s2 string, evil string) int {
	const MOD int = 1_000_000_007

	m := len(evil)

	// Build KMP LPS array
	lps := make([]int, m)

	length := 0
	for i := 1; i < m; i++ {
		for length > 0 && evil[i] != evil[length] {
			length = lps[length-1]
		}

		if evil[i] == evil[length] {
			length++
			lps[i] = length
		}
	}

	type State struct {
		pos        int
		matched    int
		tightLow   bool
		tightHigh  bool
	}

	memo := make(map[State]int)

	var dfs func(int, int, bool, bool) int

	dfs = func(pos int, matched int, tightLow bool, tightHigh bool) int {
		if matched == m {
			return 0
		}

		if pos == n {
			return 1
		}

		state := State{pos, matched, tightLow, tightHigh}

		if val, ok := memo[state]; ok {
			return val
		}

		lowChar := byte('a')
		highChar := byte('z')

		if tightLow {
			lowChar = s1[pos]
		}

		if tightHigh {
			highChar = s2[pos]
		}

		total := 0

		for ch := lowChar; ch <= highChar; ch++ {
			nextMatch := matched

			for nextMatch > 0 && evil[nextMatch] != ch {
				nextMatch = lps[nextMatch-1]
			}

			if evil[nextMatch] == ch {
				nextMatch++
			}

			nextTightLow := tightLow && (ch == lowChar)
			nextTightHigh := tightHigh && (ch == highChar)

			total += dfs(
				pos+1,
				nextMatch,
				nextTightLow,
				nextTightHigh,
			)

			total %= MOD
		}

		memo[state] = total
		return total
	}

	return dfs(0, 0, true, true)
}

The Go implementation follows the same logic as the Python version but uses an explicit map[State]int for memoization.

Since Go does not have built-in memoized recursion decorators like Python's lru_cache, we define a custom State struct as the map key.

Boolean values are included directly in the memoization state.

Modulo arithmetic is especially important in Go because integer overflow is easier to encounter during accumulation.

Worked Examples

Example 1

Input:
n = 2
s1 = "aa"
s2 = "da"
evil = "b"

Since evil = "b", any string containing 'b' is invalid.

KMP Table

Index Character LPS
0 b 0

Initial State

dfs(0, 0, True, True)

At position 0:

  • lower bound = 'a'
  • upper bound = 'd'

Possible choices:

Character Valid? Reason
a Yes does not match evil
b No immediately forms evil
c Yes safe
d Yes safe

Branch: "a"

Next state:

dfs(1, 0, True, False)

At position 1:

  • lower bound = 'a'
  • upper bound = 'z'

All letters except 'b' are valid.

Count:

25

Branch: "c"

Again 25 valid second characters.

Branch: "d"

Because upper bound remains tight:

dfs(1, 0, False, True)

Allowed range is only 'a'.

Count:

1

Total:

25 + 25 + 1 = 51

Example 2

Input:
n = 8
s1 = "leetcode"
s2 = "leetgoes"
evil = "leet"

Every valid string between the bounds must begin with "leet".

As soon as the DP matches all 4 characters of evil, it returns 0.

Therefore no valid string exists.

Output:

0

Example 3

Input:
n = 2
s1 = "gx"
s2 = "gz"
evil = "x"

Candidate strings:

String Contains x? Valid
gx Yes No
gy No Yes
gz No Yes

Answer:

2

Complexity Analysis

Measure Complexity Explanation
Time O(n × m × 26 × 4) DP states multiplied by alphabet transitions
Space O(n × m × 4) Memoization table plus recursion stack

The DP state consists of:

  • n positions
  • m KMP match states
  • 2 possibilities for lower tightness
  • 2 possibilities for upper tightness

Each state tries at most 26 characters.

Since m <= 50, the solution is efficient enough even for n = 500.

Test Cases

sol = Solution()

# Provided examples
assert sol.findGoodStrings(2, "aa", "da", "b") == 51  # sample case
assert sol.findGoodStrings(8, "leetcode", "leetgoes", "leet") == 0  # all invalid
assert sol.findGoodStrings(2, "gx", "gz", "x") == 2  # small range

# Single character bounds
assert sol.findGoodStrings(1, "a", "z", "a") == 25  # exclude one letter
assert sol.findGoodStrings(1, "a", "a", "b") == 1  # exact valid string
assert sol.findGoodStrings(1, "a", "a", "a") == 0  # exact invalid string

# Evil longer than possible appearance
assert sol.findGoodStrings(2, "aa", "zz", "abc") == 26 * 26  # impossible to contain evil

# Tight range
assert sol.findGoodStrings(2, "ab", "ab", "c") == 1  # exact valid
assert sol.findGoodStrings(2, "ab", "ab", "ab") == 0  # exact invalid

# Overlapping evil pattern
assert sol.findGoodStrings(4, "aaaa", "zzzz", "aa") >= 0  # overlapping forbidden substring

# Evil length 1
assert sol.findGoodStrings(3, "aaa", "zzz", "z") >= 0  # many exclusions

# Entire alphabet range
assert sol.findGoodStrings(2, "aa", "zz", "zz") >= 0  # forbidden suffix case

print("All tests passed.")
Test Why
(2, "aa", "da", "b") Validates standard DP counting
(8, "leetcode", "leetgoes", "leet") Tests immediate forbidden prefix
(2, "gx", "gz", "x") Tests narrow range filtering
(1, "a", "z", "a") Tests single-character evil
(1, "a", "a", "b") Tests exact valid boundary
(1, "a", "a", "a") Tests exact invalid boundary
evil longer than n Ensures impossible matches are handled
tight equal bounds Tests fully constrained generation
overlapping evil Validates KMP fallback correctness
evil = "z" Tests many forbidden states
evil = "zz" Tests forbidden suffix handling

Edge Cases

Evil Appears as a Prefix

Consider:

s1 = "leetcode"
s2 = "leetgoes"
evil = "leet"

Every valid candidate starts with "leet".

A naive solution might continue exploring deeper recursion before detecting invalidity. The KMP state immediately reaches a full match after processing the fourth character, causing the DP to terminate that branch early.

This prevents wasted computation.

Overlapping Forbidden Patterns

Consider:

evil = "abab"

Suppose the current generated suffix is:

"aba"

If the next character is 'b', the forbidden substring appears.

However, after partial mismatches, we cannot simply reset matching to zero because overlaps matter.

The KMP prefix table correctly transitions between partial matches, ensuring overlapping structures are handled efficiently and correctly.

Very Tight Bounds

Consider:

s1 = "abc"
s2 = "abc"

There is only one possible string.

A common bug is incorrectly relaxing boundary constraints too early.

The tightLow and tightHigh flags guarantee that if the current prefix still matches the bounds, future characters remain constrained appropriately.

This ensures exact-range cases work correctly.