LeetCode 3292 - Minimum Number of Valid Strings to Form Target II

We are given a collection of strings words and a target string target. A string is considered valid if it is a prefix of at least one word in words. This means that for every word, all of its prefixes are available for use.

LeetCode Problem 3292

Difficulty: 🔴 Hard
Topics: Array, String, Binary Search, Dynamic Programming, Greedy, Segment Tree, Rolling Hash, String Matching, Hash Function

Solution

LeetCode 3292 - Minimum Number of Valid Strings to Form Target II

Problem Understanding

We are given a collection of strings words and a target string target.

A string is considered valid if it is a prefix of at least one word in words. This means that for every word, all of its prefixes are available for use. For example, if a word is "abcdef", then "a", "ab", "abc", "abcd", "abcde", and "abcdef" are all valid strings.

Our goal is to construct target by concatenating valid strings. Among all possible ways to do so, we must return the minimum number of valid strings used.

If no sequence of valid strings can produce the target exactly, we return -1.

The important observation is that we are not limited to using complete words. Any prefix of any word may be selected, and prefixes may be reused any number of times.

Input Interpretation

words defines a large set of valid strings:

Valid strings =
all prefixes of words[0]
∪ all prefixes of words[1]
∪ ...

For every position in target, we want to know how far we can extend using one valid string starting at that position.

After we know these maximal extensions, the problem becomes:

Cover the entire target interval [0, n)
using the minimum number of jumps.

Constraint Analysis

The constraints are large:

  • sum(words[i].length) <= 100000
  • target.length <= 50000

A naive comparison of every prefix against every target position would be far too slow.

The solution must be close to linear or near-linear.

Important Edge Cases

A target may begin with a character that is not the first character of any word. In that case the answer is immediately -1.

Some positions may be reachable while others are not. Even if the beginning can be formed, there may be a gap later that makes the entire target impossible.

Many words may share long common prefixes. A solution that explicitly generates all prefixes would require too much memory.

The optimal solution must efficiently determine the longest valid prefix match at every target position.

Approaches

Brute Force

A straightforward approach is dynamic programming.

Let:

dp[i] = minimum number of valid strings needed
to build target[0:i]

For every position i, we try every valid string and check whether it matches starting at i.

If it matches and has length L, then:

dp[i + L] = min(dp[i + L], dp[i] + 1)

This is correct because it explores all possible decompositions.

The problem is that the number of valid strings is enormous. A word of length 50000 contributes 50000 prefixes. Explicitly generating and matching all prefixes is infeasible.

Key Insight

Instead of enumerating all valid strings, we only need to know:

At position i,
what is the maximum length of a valid string
that matches target[i:]?

Suppose this maximum length is:

reach[i]

Then from position i, we may jump to any position in:

(i + 1) ... (i + reach[i])

This transforms the problem into a classic interval covering problem.

To compute reach[i] efficiently:

  1. Use rolling hash.
  2. For every word length, store hashes of all prefixes.
  3. For each target position, binary search the longest matching prefix length.
  4. Convert the resulting intervals into a minimum-jump problem.

The minimum-jump problem is identical to the greedy solution for Jump Game II.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force Exponential or worse than O(n²) O(n) Tries huge numbers of prefixes
Optimal O((n + Σ words ) log M)

Where:

  • n = len(target)
  • M = max word length

Algorithm Walkthrough

Step 1: Build rolling-hash powers

Precompute powers of a hash base.

This allows constant-time substring hash extraction later.

Step 2: Store hashes of every word prefix

For each word:

  • Compute its prefix hashes.
  • For every prefix length k,

store the hash of that prefix in a set associated with length k.

Conceptually:

prefixHashes[k]
=
all hashes of prefixes of length k
from every word

Now we can quickly ask:

Does some word have a prefix of length k
equal to this target substring?

Step 3: Build rolling hash for target

Compute prefix hashes of the target string.

This enables:

hash(target[l:r])

in O(1).

Step 4: Compute maximum reach for every position

For each starting position i:

Binary search the largest length L such that:

target[i:i+L]

has a hash contained in:

prefixHashes[L]

Because every shorter prefix must also be valid whenever a longer prefix is valid, the predicate is monotonic and binary search works.

Store:

reach[i] = L

Step 5: Convert into intervals

Position i can reach:

[i + 1, i + reach[i]]

This becomes an interval:

[i, i + reach[i]]

Step 6: Greedy minimum cover

We now need the minimum number of intervals that cover:

[0, n]

Maintain:

  • currentEnd
  • farthest
  • answer

Scan positions from left to right.

Whenever we reach currentEnd, we must choose another interval.

This is exactly the same greedy strategy used in Jump Game II.

Step 7: Detect impossible cases

If at some point:

farthest <= current position

then progress is impossible.

Return:

-1

Why it works

For every position, reach[i] gives the furthest endpoint obtainable with a single valid string beginning there. Therefore the problem becomes selecting the minimum number of intervals needed to advance from the start to the end of the target.

The greedy interval-covering algorithm is optimal because whenever a new segment must be chosen, selecting the interval that extends furthest can never increase the number of future segments required. This is the standard correctness argument for minimum interval cover and Jump Game II.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def minValidStrings(self, words: List[str], target: str) -> int:
        BASE = 911382323
        MASK = (1 << 64) - 1

        max_len = max(len(w) for w in words)
        n = len(target)

        limit = max(max_len, n)

        power = [0] * (limit + 1)
        power[0] = 1

        for i in range(limit):
            power[i + 1] = (power[i] * BASE) & MASK

        prefix_hashes = defaultdict(set)

        for word in words:
            h = 0
            for i, ch in enumerate(word):
                h = (h * BASE + (ord(ch) - 96)) & MASK
                prefix_hashes[i + 1].add(h)

        target_hash = [0] * (n + 1)

        for i, ch in enumerate(target):
            target_hash[i + 1] = (
                target_hash[i] * BASE + (ord(ch) - 96)
            ) & MASK

        def get_hash(left: int, right: int) -> int:
            return (
                target_hash[right]
                - target_hash[left] * power[right - left]
            ) & MASK

        reach = [0] * n

        for start in range(n):
            hi = min(max_len, n - start)
            lo = 0

            while lo < hi:
                mid = (lo + hi + 1) // 2

                if get_hash(start, start + mid) in prefix_hashes[mid]:
                    lo = mid
                else:
                    hi = mid - 1

            reach[start] = lo

        if n == 0:
            return 0

        answer = 0
        current_end = 0
        farthest = 0

        for i in range(n):
            farthest = max(farthest, i + reach[i])

            if i == current_end:
                if farthest <= i:
                    return -1

                answer += 1
                current_end = farthest

                if current_end >= n:
                    return answer

        return -1

Implementation Explanation

The first section constructs rolling-hash powers and stores hashes for every possible prefix of every word. Instead of storing actual strings, only 64-bit hash values are stored, making membership tests very efficient.

The target string receives its own prefix-hash array. Using the standard rolling-hash formula, any substring hash can then be extracted in constant time.

For each starting position in the target, a binary search determines the largest matching valid prefix length. The search uses the fact that if a length L matches some word prefix, then every smaller length also matches because it is itself a prefix of the same word.

The resulting reach array describes the furthest location obtainable using one valid string starting at each position.

Finally, the problem becomes a minimum-jump problem. The greedy scan keeps track of the furthest reachable position among all intervals currently available. Whenever the current interval ends, a new valid string must be chosen. Selecting the furthest possible extension guarantees the minimum number of pieces.

Go Solution

package main

func minValidStrings(words []string, target string) int {
	const BASE uint64 = 911382323

	maxLen := 0
	for _, w := range words {
		if len(w) > maxLen {
			maxLen = len(w)
		}
	}

	n := len(target)

	limit := maxLen
	if n > limit {
		limit = n
	}

	power := make([]uint64, limit+1)
	power[0] = 1

	for i := 0; i < limit; i++ {
		power[i+1] = power[i] * BASE
	}

	prefixHashes := make(map[int]map[uint64]struct{})

	for _, word := range words {
		var h uint64 = 0

		for i := 0; i < len(word); i++ {
			h = h*BASE + uint64(word[i]-'a'+1)

			if prefixHashes[i+1] == nil {
				prefixHashes[i+1] = make(map[uint64]struct{})
			}

			prefixHashes[i+1][h] = struct{}{}
		}
	}

	targetHash := make([]uint64, n+1)

	for i := 0; i < n; i++ {
		targetHash[i+1] =
			targetHash[i]*BASE + uint64(target[i]-'a'+1)
	}

	getHash := func(l, r int) uint64 {
		return targetHash[r] - targetHash[l]*power[r-l]
	}

	reach := make([]int, n)

	for start := 0; start < n; start++ {
		lo := 0
		hi := maxLen

		if n-start < hi {
			hi = n - start
		}

		for lo < hi {
			mid := (lo + hi + 1) / 2

			hash := getHash(start, start+mid)

			if _, ok := prefixHashes[mid][hash]; ok {
				lo = mid
			} else {
				hi = mid - 1
			}
		}

		reach[start] = lo
	}

	answer := 0
	currentEnd := 0
	farthest := 0

	for i := 0; i < n; i++ {
		if i+reach[i] > farthest {
			farthest = i + reach[i]
		}

		if i == currentEnd {
			if farthest <= i {
				return -1
			}

			answer++
			currentEnd = farthest

			if currentEnd >= n {
				return answer
			}
		}
	}

	return -1
}

Go-Specific Notes

The Go implementation uses uint64 arithmetic directly. Overflow is intentional and acts as modulo 2^64, which is a common rolling-hash technique.

Hash sets are represented as:

map[uint64]struct{}

which provides O(1) expected lookup time while minimizing memory overhead.

Slices are used throughout, and no special handling for nil is required because all slices are explicitly initialized.

Worked Examples

Example 1

words = ["abc","aaaaa","bcdef"]
target = "aabcdabc"

Valid prefixes include:

a
aa
aaa
aaaa
aaaaa
b
bc
bcd
bcde
bcdef
ab
abc

Computed reach values:

Position Suffix Longest Valid Prefix Reach
0 aabcdabc aa 2
1 abcdabc abc 3
2 bcdabc bcd 3
3 cdabc none 0
4 dabc none 0
5 abc abc 3
6 bc bc 2
7 c 0 0

Greedy process:

Position Farthest Current End Pieces
0 2 0 choose first piece
1 4 2
2 5 2 choose second piece
5 8 5 choose third piece

Result:

3

Example 2

words = ["abababab","ab"]
target = "ababaababa"

The longest valid prefix from position 0 is:

ababa

Length 5.

The longest valid prefix from position 5 is also:

ababa

Length 5.

Therefore:

ababa + ababa

covers the target.

Answer:

2

Example 3

words = ["abcdef"]
target = "xyz"

At position 0:

reach[0] = 0

No valid prefix starts with 'x'.

The greedy scan cannot advance.

Result:

-1

Complexity Analysis

Measure Complexity Explanation
Time O((n + Σ words
Space O(Σ words

Where:

  • n = len(target)
  • M = max(len(word))

The preprocessing phase visits every character in every word once. Each target position performs a binary search over possible prefix lengths, and each hash lookup is O(1) expected time.

Test Cases

s = Solution()

assert s.minValidStrings(["abc", "aaaaa", "bcdef"], "aabcdabc") == 3  # example 1

assert s.minValidStrings(["abababab", "ab"], "ababaababa") == 2  # example 2

assert s.minValidStrings(["abcdef"], "xyz") == -1  # example 3

assert s.minValidStrings(["a"], "a") == 1  # smallest valid case

assert s.minValidStrings(["abc"], "abc") == 1  # entire word used

assert s.minValidStrings(["abc"], "ab") == 1  # prefix only

assert s.minValidStrings(["abc"], "abab") == 2  # repeated reuse

assert s.minValidStrings(["a"], "aaaaaa") == 6  # many small pieces

assert s.minValidStrings(["aaaaaa"], "aaaaaa") == 1  # one large piece

assert s.minValidStrings(["aaaaaa"], "aaaaaaa") == 2  # one full prefix plus one char

assert s.minValidStrings(["abc", "bcd"], "abcd") == 2  # overlap between words

assert s.minValidStrings(["abc"], "bc") == -1  # impossible start

assert s.minValidStrings(["ab", "cd"], "abcd") == 2  # exact concatenation

assert s.minValidStrings(["xyz", "xy"], "xyxyxy") == 3  # repeated prefix usage

assert s.minValidStrings(["a" * 50000], "a" * 50000) == 1  # maximum length style case

Test Summary

Test Why
Example 1 Official sample
Example 2 Official sample
Example 3 Official impossible case
["a"], "a" Minimum input size
["abc"], "abc" Entire word used
["abc"], "ab" Prefix usage
["abc"], "abab" Reusing prefixes
["a"], "aaaaaa" Many small segments
["aaaaaa"], "aaaaaa" Single segment solution
["aaaaaa"], "aaaaaaa" Multiple large jumps
["abc","bcd"], "abcd" Cross-word composition
["abc"], "bc" Unreachable start
["ab","cd"], "abcd" Exact decomposition
["xyz","xy"], "xyxyxy" Repeated valid prefixes
Large repeated string Stress test near limits

Edge Cases

Target Cannot Even Start

Consider:

words = ["abc"]
target = "xyz"

No valid string begins with 'x'. A buggy implementation might continue processing and produce an incorrect result. In this solution, reach[0] becomes 0, the greedy scan detects that no progress can be made, and returns -1.

A Gap Appears After Some Progress

Consider:

words = ["abc"]
target = "abx"

The prefix "ab" is valid, but once position 2 is reached there is no valid continuation. The greedy phase eventually encounters a position where the furthest reachable endpoint does not advance, correctly returning -1.

Extremely Long Words

The constraints allow total word length up to 100000. Explicitly generating and storing all prefix strings would consume significant memory and lead to expensive string comparisons. The implementation stores only rolling-hash values, reducing both memory usage and lookup time while still supporting efficient prefix matching.

Many Overlapping Prefixes

Consider:

words = ["aaaaaa"]
target = "aaaaaaaaaaaa"

Every position has many possible valid prefix lengths. A naive dynamic programming solution would explore many redundant transitions. The binary-search-plus-hash approach directly finds the longest reachable interval at each position, and the greedy cover computes the optimal answer efficiently.