LeetCode 3292 - Minimum Number of Valid Strings to Form Target II
We are given a collection of strings words and a target string target. A string is considered valid if it is a prefix of at least one word in words. This means that for every word, all of its prefixes are available for use.
Difficulty: 🔴 Hard
Topics: Array, String, Binary Search, Dynamic Programming, Greedy, Segment Tree, Rolling Hash, String Matching, Hash Function
Solution
LeetCode 3292 - Minimum Number of Valid Strings to Form Target II
Problem Understanding
We are given a collection of strings words and a target string target.
A string is considered valid if it is a prefix of at least one word in words. This means that for every word, all of its prefixes are available for use. For example, if a word is "abcdef", then "a", "ab", "abc", "abcd", "abcde", and "abcdef" are all valid strings.
Our goal is to construct target by concatenating valid strings. Among all possible ways to do so, we must return the minimum number of valid strings used.
If no sequence of valid strings can produce the target exactly, we return -1.
The important observation is that we are not limited to using complete words. Any prefix of any word may be selected, and prefixes may be reused any number of times.
Input Interpretation
words defines a large set of valid strings:
Valid strings =
all prefixes of words[0]
∪ all prefixes of words[1]
∪ ...
For every position in target, we want to know how far we can extend using one valid string starting at that position.
After we know these maximal extensions, the problem becomes:
Cover the entire target interval [0, n)
using the minimum number of jumps.
Constraint Analysis
The constraints are large:
sum(words[i].length) <= 100000target.length <= 50000
A naive comparison of every prefix against every target position would be far too slow.
The solution must be close to linear or near-linear.
Important Edge Cases
A target may begin with a character that is not the first character of any word. In that case the answer is immediately -1.
Some positions may be reachable while others are not. Even if the beginning can be formed, there may be a gap later that makes the entire target impossible.
Many words may share long common prefixes. A solution that explicitly generates all prefixes would require too much memory.
The optimal solution must efficiently determine the longest valid prefix match at every target position.
Approaches
Brute Force
A straightforward approach is dynamic programming.
Let:
dp[i] = minimum number of valid strings needed
to build target[0:i]
For every position i, we try every valid string and check whether it matches starting at i.
If it matches and has length L, then:
dp[i + L] = min(dp[i + L], dp[i] + 1)
This is correct because it explores all possible decompositions.
The problem is that the number of valid strings is enormous. A word of length 50000 contributes 50000 prefixes. Explicitly generating and matching all prefixes is infeasible.
Key Insight
Instead of enumerating all valid strings, we only need to know:
At position i,
what is the maximum length of a valid string
that matches target[i:]?
Suppose this maximum length is:
reach[i]
Then from position i, we may jump to any position in:
(i + 1) ... (i + reach[i])
This transforms the problem into a classic interval covering problem.
To compute reach[i] efficiently:
- Use rolling hash.
- For every word length, store hashes of all prefixes.
- For each target position, binary search the longest matching prefix length.
- Convert the resulting intervals into a minimum-jump problem.
The minimum-jump problem is identical to the greedy solution for Jump Game II.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | Exponential or worse than O(n²) | O(n) | Tries huge numbers of prefixes |
| Optimal | O((n + Σ | words | ) log M) |
Where:
n = len(target)M = max word length
Algorithm Walkthrough
Step 1: Build rolling-hash powers
Precompute powers of a hash base.
This allows constant-time substring hash extraction later.
Step 2: Store hashes of every word prefix
For each word:
- Compute its prefix hashes.
- For every prefix length
k,
store the hash of that prefix in a set associated with length k.
Conceptually:
prefixHashes[k]
=
all hashes of prefixes of length k
from every word
Now we can quickly ask:
Does some word have a prefix of length k
equal to this target substring?
Step 3: Build rolling hash for target
Compute prefix hashes of the target string.
This enables:
hash(target[l:r])
in O(1).
Step 4: Compute maximum reach for every position
For each starting position i:
Binary search the largest length L such that:
target[i:i+L]
has a hash contained in:
prefixHashes[L]
Because every shorter prefix must also be valid whenever a longer prefix is valid, the predicate is monotonic and binary search works.
Store:
reach[i] = L
Step 5: Convert into intervals
Position i can reach:
[i + 1, i + reach[i]]
This becomes an interval:
[i, i + reach[i]]
Step 6: Greedy minimum cover
We now need the minimum number of intervals that cover:
[0, n]
Maintain:
currentEndfarthestanswer
Scan positions from left to right.
Whenever we reach currentEnd, we must choose another interval.
This is exactly the same greedy strategy used in Jump Game II.
Step 7: Detect impossible cases
If at some point:
farthest <= current position
then progress is impossible.
Return:
-1
Why it works
For every position, reach[i] gives the furthest endpoint obtainable with a single valid string beginning there. Therefore the problem becomes selecting the minimum number of intervals needed to advance from the start to the end of the target.
The greedy interval-covering algorithm is optimal because whenever a new segment must be chosen, selecting the interval that extends furthest can never increase the number of future segments required. This is the standard correctness argument for minimum interval cover and Jump Game II.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def minValidStrings(self, words: List[str], target: str) -> int:
BASE = 911382323
MASK = (1 << 64) - 1
max_len = max(len(w) for w in words)
n = len(target)
limit = max(max_len, n)
power = [0] * (limit + 1)
power[0] = 1
for i in range(limit):
power[i + 1] = (power[i] * BASE) & MASK
prefix_hashes = defaultdict(set)
for word in words:
h = 0
for i, ch in enumerate(word):
h = (h * BASE + (ord(ch) - 96)) & MASK
prefix_hashes[i + 1].add(h)
target_hash = [0] * (n + 1)
for i, ch in enumerate(target):
target_hash[i + 1] = (
target_hash[i] * BASE + (ord(ch) - 96)
) & MASK
def get_hash(left: int, right: int) -> int:
return (
target_hash[right]
- target_hash[left] * power[right - left]
) & MASK
reach = [0] * n
for start in range(n):
hi = min(max_len, n - start)
lo = 0
while lo < hi:
mid = (lo + hi + 1) // 2
if get_hash(start, start + mid) in prefix_hashes[mid]:
lo = mid
else:
hi = mid - 1
reach[start] = lo
if n == 0:
return 0
answer = 0
current_end = 0
farthest = 0
for i in range(n):
farthest = max(farthest, i + reach[i])
if i == current_end:
if farthest <= i:
return -1
answer += 1
current_end = farthest
if current_end >= n:
return answer
return -1
Implementation Explanation
The first section constructs rolling-hash powers and stores hashes for every possible prefix of every word. Instead of storing actual strings, only 64-bit hash values are stored, making membership tests very efficient.
The target string receives its own prefix-hash array. Using the standard rolling-hash formula, any substring hash can then be extracted in constant time.
For each starting position in the target, a binary search determines the largest matching valid prefix length. The search uses the fact that if a length L matches some word prefix, then every smaller length also matches because it is itself a prefix of the same word.
The resulting reach array describes the furthest location obtainable using one valid string starting at each position.
Finally, the problem becomes a minimum-jump problem. The greedy scan keeps track of the furthest reachable position among all intervals currently available. Whenever the current interval ends, a new valid string must be chosen. Selecting the furthest possible extension guarantees the minimum number of pieces.
Go Solution
package main
func minValidStrings(words []string, target string) int {
const BASE uint64 = 911382323
maxLen := 0
for _, w := range words {
if len(w) > maxLen {
maxLen = len(w)
}
}
n := len(target)
limit := maxLen
if n > limit {
limit = n
}
power := make([]uint64, limit+1)
power[0] = 1
for i := 0; i < limit; i++ {
power[i+1] = power[i] * BASE
}
prefixHashes := make(map[int]map[uint64]struct{})
for _, word := range words {
var h uint64 = 0
for i := 0; i < len(word); i++ {
h = h*BASE + uint64(word[i]-'a'+1)
if prefixHashes[i+1] == nil {
prefixHashes[i+1] = make(map[uint64]struct{})
}
prefixHashes[i+1][h] = struct{}{}
}
}
targetHash := make([]uint64, n+1)
for i := 0; i < n; i++ {
targetHash[i+1] =
targetHash[i]*BASE + uint64(target[i]-'a'+1)
}
getHash := func(l, r int) uint64 {
return targetHash[r] - targetHash[l]*power[r-l]
}
reach := make([]int, n)
for start := 0; start < n; start++ {
lo := 0
hi := maxLen
if n-start < hi {
hi = n - start
}
for lo < hi {
mid := (lo + hi + 1) / 2
hash := getHash(start, start+mid)
if _, ok := prefixHashes[mid][hash]; ok {
lo = mid
} else {
hi = mid - 1
}
}
reach[start] = lo
}
answer := 0
currentEnd := 0
farthest := 0
for i := 0; i < n; i++ {
if i+reach[i] > farthest {
farthest = i + reach[i]
}
if i == currentEnd {
if farthest <= i {
return -1
}
answer++
currentEnd = farthest
if currentEnd >= n {
return answer
}
}
}
return -1
}
Go-Specific Notes
The Go implementation uses uint64 arithmetic directly. Overflow is intentional and acts as modulo 2^64, which is a common rolling-hash technique.
Hash sets are represented as:
map[uint64]struct{}
which provides O(1) expected lookup time while minimizing memory overhead.
Slices are used throughout, and no special handling for nil is required because all slices are explicitly initialized.
Worked Examples
Example 1
words = ["abc","aaaaa","bcdef"]
target = "aabcdabc"
Valid prefixes include:
a
aa
aaa
aaaa
aaaaa
b
bc
bcd
bcde
bcdef
ab
abc
Computed reach values:
| Position | Suffix | Longest Valid Prefix | Reach |
|---|---|---|---|
| 0 | aabcdabc | aa | 2 |
| 1 | abcdabc | abc | 3 |
| 2 | bcdabc | bcd | 3 |
| 3 | cdabc | none | 0 |
| 4 | dabc | none | 0 |
| 5 | abc | abc | 3 |
| 6 | bc | bc | 2 |
| 7 | c | 0 | 0 |
Greedy process:
| Position | Farthest | Current End | Pieces |
|---|---|---|---|
| 0 | 2 | 0 | choose first piece |
| 1 | 4 | 2 | |
| 2 | 5 | 2 | choose second piece |
| 5 | 8 | 5 | choose third piece |
Result:
3
Example 2
words = ["abababab","ab"]
target = "ababaababa"
The longest valid prefix from position 0 is:
ababa
Length 5.
The longest valid prefix from position 5 is also:
ababa
Length 5.
Therefore:
ababa + ababa
covers the target.
Answer:
2
Example 3
words = ["abcdef"]
target = "xyz"
At position 0:
reach[0] = 0
No valid prefix starts with 'x'.
The greedy scan cannot advance.
Result:
-1
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O((n + Σ | words |
| Space | O(Σ | words |
Where:
n = len(target)M = max(len(word))
The preprocessing phase visits every character in every word once. Each target position performs a binary search over possible prefix lengths, and each hash lookup is O(1) expected time.
Test Cases
s = Solution()
assert s.minValidStrings(["abc", "aaaaa", "bcdef"], "aabcdabc") == 3 # example 1
assert s.minValidStrings(["abababab", "ab"], "ababaababa") == 2 # example 2
assert s.minValidStrings(["abcdef"], "xyz") == -1 # example 3
assert s.minValidStrings(["a"], "a") == 1 # smallest valid case
assert s.minValidStrings(["abc"], "abc") == 1 # entire word used
assert s.minValidStrings(["abc"], "ab") == 1 # prefix only
assert s.minValidStrings(["abc"], "abab") == 2 # repeated reuse
assert s.minValidStrings(["a"], "aaaaaa") == 6 # many small pieces
assert s.minValidStrings(["aaaaaa"], "aaaaaa") == 1 # one large piece
assert s.minValidStrings(["aaaaaa"], "aaaaaaa") == 2 # one full prefix plus one char
assert s.minValidStrings(["abc", "bcd"], "abcd") == 2 # overlap between words
assert s.minValidStrings(["abc"], "bc") == -1 # impossible start
assert s.minValidStrings(["ab", "cd"], "abcd") == 2 # exact concatenation
assert s.minValidStrings(["xyz", "xy"], "xyxyxy") == 3 # repeated prefix usage
assert s.minValidStrings(["a" * 50000], "a" * 50000) == 1 # maximum length style case
Test Summary
| Test | Why |
|---|---|
| Example 1 | Official sample |
| Example 2 | Official sample |
| Example 3 | Official impossible case |
["a"], "a" |
Minimum input size |
["abc"], "abc" |
Entire word used |
["abc"], "ab" |
Prefix usage |
["abc"], "abab" |
Reusing prefixes |
["a"], "aaaaaa" |
Many small segments |
["aaaaaa"], "aaaaaa" |
Single segment solution |
["aaaaaa"], "aaaaaaa" |
Multiple large jumps |
["abc","bcd"], "abcd" |
Cross-word composition |
["abc"], "bc" |
Unreachable start |
["ab","cd"], "abcd" |
Exact decomposition |
["xyz","xy"], "xyxyxy" |
Repeated valid prefixes |
| Large repeated string | Stress test near limits |
Edge Cases
Target Cannot Even Start
Consider:
words = ["abc"]
target = "xyz"
No valid string begins with 'x'. A buggy implementation might continue processing and produce an incorrect result. In this solution, reach[0] becomes 0, the greedy scan detects that no progress can be made, and returns -1.
A Gap Appears After Some Progress
Consider:
words = ["abc"]
target = "abx"
The prefix "ab" is valid, but once position 2 is reached there is no valid continuation. The greedy phase eventually encounters a position where the furthest reachable endpoint does not advance, correctly returning -1.
Extremely Long Words
The constraints allow total word length up to 100000. Explicitly generating and storing all prefix strings would consume significant memory and lead to expensive string comparisons. The implementation stores only rolling-hash values, reducing both memory usage and lookup time while still supporting efficient prefix matching.
Many Overlapping Prefixes
Consider:
words = ["aaaaaa"]
target = "aaaaaaaaaaaa"
Every position has many possible valid prefix lengths. A naive dynamic programming solution would explore many redundant transitions. The binary-search-plus-hash approach directly finds the longest reachable interval at each position, and the greedy cover computes the optimal answer efficiently.