LeetCode 3291 - Minimum Number of Valid Strings to Form Target I

LeetCode Problem 3291

Difficulty: 🟡 Medium
Topics: Array, String, Binary Search, Dynamic Programming, Greedy, Trie, Segment Tree, Rolling Hash, String Matching, Hash Function

Solution

Problem Understanding

We are given a list of strings called words and another string called target. The task is to construct target by concatenating several smaller strings, where each smaller string must be a prefix of at least one word in words.

A prefix means the beginning portion of a word. For example, if a word is "abcdef", then "a", "ab", "abc", and "abcdef" are all valid prefixes.

The important detail is that we are not restricted to using complete words. Any prefix of any word may be used as one piece in the concatenation.

For example:

If words = ["abcdef"]
Then valid strings include:
"a"
"ab"
"abc"
"abcd"
"abcde"
"abcdef"

We want the minimum number of such valid strings whose concatenation exactly equals target.

This is fundamentally a shortest segmentation problem. At every position in target, we want to know which valid prefixes can start there, and then choose the segmentation with the fewest pieces.

The constraints are important:

Total length of all words is at most 10^5
target.length <= 5000

These constraints are too large for naive substring generation and repeated prefix checking. A brute-force recursion would explode exponentially because there can be many overlapping ways to segment the target.

The problem strongly suggests using efficient string matching structures such as:

Trie
Dynamic Programming
Greedy optimization

Several edge cases immediately stand out:

The target may start with a character that no valid prefix can match.
A solution may require many short prefixes instead of a few long ones.
Multiple words may share prefixes.
A greedy choice of the longest current prefix is not always obviously safe unless carefully justified.

The core challenge is efficiently determining how far we can extend from every position in target.

Approaches

Brute Force

A straightforward solution is to try every possible valid prefix at every position.

Suppose we are currently at index i in target. We could:

Generate every substring target[i:j]
Check whether it is a valid prefix of some word
Recursively solve the remainder starting from j
Take the minimum number of pieces

This approach is correct because it explores every possible segmentation.

However, it is extremely inefficient:

There are exponentially many segmentations.
Repeated substring checks are expensive.
Many overlapping subproblems appear repeatedly.

Even with memoization, checking whether every substring is a valid prefix can still become costly.

Key Insight

The crucial observation is that every prefix of every word forms a path inside a trie.

If we build a trie from words, then while scanning target starting from position i, we can walk through the trie character by character.

Every node we visit corresponds to a valid prefix.

This allows us to efficiently discover all valid segments starting at position i.

Then we combine this with dynamic programming:

dp[i] = minimum number of valid strings needed to form target[i:]

For every reachable substring target[i:j] that exists in the trie:

dp[i] = min(dp[i], 1 + dp[j + 1])

This transforms the problem into an efficient shortest-path style DP.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	Exponential	O(n) recursion	Tries every segmentation
Optimal Trie + DP	O(n × L)	O(total word length + n)	Efficient prefix matching with memoization

Here:

n = len(target)
L = maximum word length

Algorithm Walkthrough

Step 1: Build a Trie

We insert every word into a trie.

Each node contains:

Children pointers
No explicit end marker is required

Why?

Because every prefix along the path is automatically valid.

For example, inserting "abcdef" means:

"a" is valid
"ab" is valid
"abc" is valid
and so on

Step 2: Define Dynamic Programming State

We define:

dp[i] = minimum number of valid strings needed to form target[i:]

If i == len(target), then:

dp[i] = 0

because no characters remain.

Step 3: Transition From Position i

Starting from target[i], we walk through the trie.

We continue character by character:

If the current character exists in the trie, we continue.
Otherwise, matching stops.

Every successful extension corresponds to a valid prefix.

If we successfully match up to index j, then:

candidate = 1 + dp[j + 1]

We take the minimum over all candidates.

Step 4: Memoization

Many positions in the target are revisited repeatedly.

We memoize results for each i.

This reduces the complexity from exponential to polynomial.

Step 5: Return Result

If the answer is infinity, return -1.

Otherwise return the computed minimum.

Why it works

The trie guarantees that every traversed path corresponds to a valid prefix of some word.

The DP guarantees optimality because:

every valid first segment is considered
the remainder is solved optimally through recursion
the minimum among all possibilities is selected

Thus, the algorithm explores all valid decompositions while avoiding repeated work.

Python Solution

from typing import List
from functools import lru_cache

class TrieNode:
    def __init__(self):
        self.children = {}

class Solution:
    def minValidStrings(self, words: List[str], target: str) -> int:
        root = TrieNode()

        # Build trie
        for word in words:
            node = root
            for ch in word:
                if ch not in node.children:
                    node.children[ch] = TrieNode()
                node = node.children[ch]

        n = len(target)
        INF = float('inf')

        @lru_cache(None)
        def dp(index: int) -> int:
            if index == n:
                return 0

            node = root
            answer = INF

            for j in range(index, n):
                ch = target[j]

                if ch not in node.children:
                    break

                node = node.children[ch]

                next_result = dp(j + 1)

                if next_result != INF:
                    answer = min(answer, 1 + next_result)

            return answer

        result = dp(0)
        return result if result != INF else -1

The implementation begins by constructing a trie from all words. Each character is inserted into nested trie nodes. Since every prefix is valid, we do not need a terminal marker.

The recursive dp(index) function computes the minimum number of valid strings needed to build the suffix starting at index.

At each position:

We start walking through the trie.
Every successful extension represents a valid prefix.
We recursively solve the remaining suffix.

The recursion uses memoization through lru_cache, ensuring that each target index is computed only once.

If traversal fails because a character is missing from the trie, we stop immediately since no longer substring can match either.

Finally, if the result remains infinite, the target cannot be formed and we return -1.

Go Solution

package main

import "math"

type TrieNode struct {
	children map[byte]*TrieNode
}

func newTrieNode() *TrieNode {
	return &TrieNode{
		children: make(map[byte]*TrieNode),
	}
}

func minValidStrings(words []string, target string) int {
	root := newTrieNode()

	// Build trie
	for _, word := range words {
		node := root

		for i := 0; i < len(word); i++ {
			ch := word[i]

			if _, exists := node.children[ch]; !exists {
				node.children[ch] = newTrieNode()
			}

			node = node.children[ch]
		}
	}

	n := len(target)
	const INF = math.MaxInt32

	memo := make([]int, n+1)

	for i := 0; i <= n; i++ {
		memo[i] = -2
	}

	var dp func(int) int

	dp = func(index int) int {
		if index == n {
			return 0
		}

		if memo[index] != -2 {
			return memo[index]
		}

		node := root
		answer := INF

		for j := index; j < n; j++ {
			ch := target[j]

			nextNode, exists := node.children[ch]
			if !exists {
				break
			}

			node = nextNode

			nextResult := dp(j + 1)

			if nextResult != INF {
				if 1+nextResult < answer {
					answer = 1 + nextResult
				}
			}
		}

		memo[index] = answer
		return answer
	}

	result := dp(0)

	if result == INF {
		return -1
	}

	return result
}

The Go version follows exactly the same logic as the Python implementation.

A few Go-specific details are worth noting:

Go does not provide built-in memoization decorators, so we use a slice manually.
-2 is used as the uncomputed sentinel value.
math.MaxInt32 acts as infinity.
Trie children are stored in map[byte]*TrieNode.

Because strings in this problem contain only lowercase English letters, indexing by byte is safe and efficient.

Worked Examples

Example 1

words = ["abc","aaaaa","bcdef"]
target = "aabcdabc"

Trie Contents

Valid prefixes include:

"a"
"aa"
"aaa"
"aaaa"
"aaaaa"
"b"
"bc"
"bcd"
"bcde"
"bcdef"
"ab"
"abc"

DP Traversal

Position	Possible Valid Prefixes	Best Result
0	"a", "aa"	choose "aa"
2	"b", "bc", "bcd"	choose "bcd"
5	"a", "ab", "abc"	choose "abc"
8	end	0

Final decomposition:

"aa" + "bcd" + "abc"

Answer:

Example 2

words = ["abababab","ab"]
target = "ababaababa"

Valid Prefixes

From "abababab":

"a"
"ab"
"aba"
"abab"
"ababa"
...

Optimal Construction

Segment	Value
First	"ababa"
Second	"ababa"

Total pieces:

Example 3

words = ["abcdef"]
target = "xyz"

At index 0:

'x' does not exist in trie root children.

No valid prefix can start the target.

Answer:

-1

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n × L)	Each DP state scans at most the maximum word length
Space	O(total word length + n)	Trie storage plus memoization

Let:

n = len(target)
L = maximum word length

Each target index is computed once because of memoization. From each index, trie traversal continues only while characters match.

The trie contains at most the total number of characters across all words, which is bounded by 10^5.

Test Cases

sol = Solution()

# Provided examples
assert sol.minValidStrings(["abc","aaaaa","bcdef"], "aabcdabc") == 3  # standard case
assert sol.minValidStrings(["abababab","ab"], "ababaababa") == 2  # overlapping prefixes
assert sol.minValidStrings(["abcdef"], "xyz") == -1  # impossible target

# Single character target
assert sol.minValidStrings(["a"], "a") == 1  # exact single prefix

# Multiple small prefixes
assert sol.minValidStrings(["abc"], "aaa") == 3  # repeated single-char prefix

# Exact full-word usage
assert sol.minValidStrings(["hello"], "hello") == 1  # full word used

# Prefer longer segmentation
assert sol.minValidStrings(["abcdef"], "abcdefabc") == 2  # long prefixes minimize count

# Impossible midway
assert sol.minValidStrings(["abc", "def"], "abx") == -1  # mismatch in middle

# Shared prefixes
assert sol.minValidStrings(["ab", "abc", "abcd"], "abcdabc") == 2  # trie overlap

# Large repeated structure
assert sol.minValidStrings(["aaaaa"], "aaaaaaaaaa") == 2  # repeated long prefix

# Minimum prefix length repeatedly
assert sol.minValidStrings(["xyz"], "xx") == 2  # only single-char prefix available

Test	Why
`["abc","aaaaa","bcdef"], "aabcdabc"`	Standard mixed-prefix example
`["abababab","ab"], "ababaababa"`	Overlapping prefix choices
`["abcdef"], "xyz"`	Impossible construction
`["a"], "a"`	Smallest valid input
`["abc"], "aaa"`	Repeated shortest prefix
`["hello"], "hello"`	Entire word used directly
`["abcdef"], "abcdefabc"`	Optimal long-prefix selection
`["abc","def"], "abx"`	Failure after partial progress
`["ab","abc","abcd"], "abcdabc"`	Shared trie paths
`["aaaaa"], "aaaaaaaaaa"`	Repeated large prefix reuse
`["xyz"], "xx"`	Only one-character prefix usable

Edge Cases

One important edge case occurs when the target begins with a character that does not exist in any word prefix. For example:

words = ["abc"]
target = "xabc"

A naive implementation might continue unnecessary recursion or substring generation. In the trie solution, failure happens immediately at the root because 'x' is not a child node. The algorithm correctly returns -1.

Another important edge case is when only very short prefixes are usable. Consider:

words = ["abc"]
target = "aaaa"

The only valid prefix available is "a". The algorithm must repeatedly use single-character prefixes. The trie traversal naturally supports this because every prefix node is considered valid during traversal.

A third tricky case involves heavily overlapping prefixes:

words = ["ab", "abc", "abcd"]

Many paths share the same initial characters. A naive solution might repeatedly recheck identical substrings. The trie efficiently compresses shared prefixes into common nodes, reducing redundant work and ensuring efficient traversal.

A final edge case is when choosing the longest current prefix is not always obviously sufficient. The DP guarantees correctness because every valid continuation is explored and memoized. Even if a shorter prefix eventually leads to a better global solution, the algorithm still finds it.