LeetCode 1055 - Shortest Way to Form String
The problem asks us to construct the string target using the fewest possible subsequences of the string source. A subsequence preserves relative order, but characters do not need to be contiguous.
Difficulty: 🟡 Medium
Topics: Two Pointers, String, Binary Search, Greedy
Solution
Problem Understanding
The problem asks us to construct the string target using the fewest possible subsequences of the string source.
A subsequence preserves relative order, but characters do not need to be contiguous. For example, "ace" is a subsequence of "abcde" because we can delete "b" and "d" while keeping the order of the remaining characters.
We are allowed to reuse source multiple times. Each time we reuse it, we may select any subsequence from it. The goal is to determine the minimum number of such subsequences whose concatenation forms the entire target.
For example:
source = "abc"target = "abcbc"
We can form:
"abc"from the first pass throughsource"bc"from the second pass
So the answer is 2.
The important detail is that each subsequence must respect the order of characters in source. We cannot rearrange characters.
The constraints are relatively small:
1 <= source.length, target.length <= 1000- lowercase English letters only
This means even an O(n * m) solution is perfectly acceptable, where:
n = len(source)m = len(target)
However, exponential or heavily recursive brute force solutions would still be impractical.
Several edge cases are important:
- A character exists in
targetbut not insource, in which case the answer is immediately-1. targetmay require many resets of thesourcepointer.sourcemay contain repeated characters.- The optimal solution is greedy, but only if implemented carefully.
The key challenge is determining when we must start a new subsequence and how to maximize the amount of target matched during each pass through source.
Approaches
Brute Force Approach
A brute force solution would attempt to generate all possible subsequences of source and repeatedly try combinations of them to form target.
For a string of length n, there are 2^n possible subsequences. Even with n = 1000, this is astronomically large and completely infeasible.
Another slightly better brute force idea is:
- Start from the beginning of
target - Scan through
source - Match as many characters as possible
- Restart scanning
source - Continue until
targetis fully matched
This already resembles the optimal greedy approach, and importantly, it avoids generating subsequences explicitly.
The reason this greedy method works is that during each pass through source, there is never a benefit to skipping a character that could help match the current target character. Since we want the minimum number of subsequences, each pass should consume as much of target as possible.
Key Insight
The core observation is:
During one traversal of
source, we should greedily match as many characters fromtargetas possible.
If we reach the end of source before fully matching target, we must start another subsequence.
This works because:
- Every subsequence corresponds to one pass through
source - Maximizing matches per pass minimizes the total number of passes
We can implement this efficiently with two pointers:
- One pointer for
source - One pointer for
target
Whenever a character matches, we advance the target pointer.
If an entire pass through source fails to match even one new target character, then the current target character does not exist in source, and the answer is -1.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n) | O(2^n) | Generates subsequences explicitly |
| Optimal Greedy Two Pointers | O(n * m) | O(1) | Greedily consumes target during each pass |
Algorithm Walkthrough
Optimal Greedy Two Pointer Algorithm
- Initialize a pointer
target_indexto0.
This pointer tracks how many characters of target we have already constructed.
2. Initialize a counter subsequence_count to 0.
Every full pass through source represents using one subsequence.
3. While target_index is still inside target, begin a new pass through source.
Increment subsequence_count because we are starting another subsequence.
4. Store the current value of target_index before scanning source.
This helps detect whether we made any progress during this pass.
5. Traverse source from left to right.
For each character:
- If it matches
target[target_index], advancetarget_index - Continue scanning to greedily consume as much of
targetas possible
- After finishing the pass through
source, check whethertarget_indexchanged.
If it did not change, then no character from the current target position exists in source.
In that case, return -1.
7. Repeat until target_index == len(target).
At this point, all target characters have been formed.
8. Return subsequence_count.
Why it works
The greedy strategy is optimal because every subsequence corresponds to one complete left-to-right traversal of source. During a traversal, matching fewer characters can never help reduce the total number of subsequences later. Therefore, greedily consuming as many target characters as possible during each pass always minimizes the number of passes required.
The algorithm also correctly detects impossible cases because if an entire scan through source fails to advance the target pointer, then the needed character does not exist anywhere in source.
Python Solution
class Solution:
def shortestWay(self, source: str, target: str) -> int:
target_index = 0
subsequence_count = 0
while target_index < len(target):
previous_index = target_index
for char in source:
if target_index < len(target) and char == target[target_index]:
target_index += 1
if previous_index == target_index:
return -1
subsequence_count += 1
return subsequence_count
The implementation directly follows the greedy two pointer strategy.
The variable target_index tracks the current character in target that still needs to be matched.
The outer while loop represents repeatedly using subsequences of source. Every iteration corresponds to one complete pass through source.
Inside the loop, we store previous_index before scanning source. This allows us to detect whether any progress was made.
The for char in source loop greedily matches characters in order. Whenever the current source character matches the current target character, we advance target_index.
After the scan:
- If
target_indexdid not change, the current target character does not exist insource, so we return-1 - Otherwise, we successfully formed part of the target, so we increment
subsequence_count
Once target_index reaches the end of target, the entire target string has been constructed.
Go Solution
func shortestWay(source string, target string) int {
targetIndex := 0
subsequenceCount := 0
for targetIndex < len(target) {
previousIndex := targetIndex
for i := 0; i < len(source); i++ {
if targetIndex < len(target) &&
source[i] == target[targetIndex] {
targetIndex++
}
}
if previousIndex == targetIndex {
return -1
}
subsequenceCount++
}
return subsequenceCount
}
The Go implementation mirrors the Python logic almost exactly.
Since Go strings are byte slices internally, indexing with source[i] and target[targetIndex] works efficiently because the input contains only lowercase English letters.
No special handling for Unicode is required.
Go also does not require explicit bounds checking inside the loop because we already guard access with:
targetIndex < len(target)
The space complexity remains constant because only a few integer variables are used.
Worked Examples
Example 1
source = "abc"
target = "abcbc"
Initial state:
| Variable | Value |
|---|---|
| target_index | 0 |
| subsequence_count | 0 |
First Pass Through Source
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| a | a | Yes | 1 |
| b | b | Yes | 2 |
| c | c | Yes | 3 |
After first pass:
| Variable | Value |
|---|---|
| target_index | 3 |
| subsequence_count | 1 |
Matched "abc".
Second Pass Through Source
Remaining target substring is "bc".
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| a | b | No | 3 |
| b | b | Yes | 4 |
| c | c | Yes | 5 |
Now target_index == len(target).
Answer:
2
Example 2
source = "abc"
target = "acdbc"
First Pass
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| a | a | Yes | 1 |
| b | c | No | 1 |
| c | c | Yes | 2 |
Remaining target substring is "dbc".
Second Pass
| Source Char | Target Needed | Match? |
|---|---|---|
| a | d | No |
| b | d | No |
| c | d | No |
No progress was made.
Therefore:
'd'does not exist insource- return
-1
Example 3
source = "xyz"
target = "xzyxz"
First Pass
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| x | x | Yes | 1 |
| y | z | No | 1 |
| z | z | Yes | 2 |
Matched "xz".
Second Pass
Remaining target substring is "yxz".
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| x | y | No | 2 |
| y | y | Yes | 3 |
| z | x | No | 3 |
Matched "y".
Third Pass
Remaining target substring is "xz".
| Source Char | Target Needed | Match? | target_index After |
|---|---|---|---|
| x | x | Yes | 4 |
| y | z | No | 4 |
| z | z | Yes | 5 |
Answer:
3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | Each pass scans source, and at most m target characters are matched |
| Space | O(1) | Only a few pointer variables are used |
Let:
n = len(source)m = len(target)
In the worst case, each pass through source may match only one target character. Since each pass costs O(n) and there can be up to m passes, the total complexity becomes O(n * m).
The algorithm uses constant extra space because it stores only integer counters and pointers.
Test Cases
solution = Solution()
assert solution.shortestWay("abc", "abcbc") == 2
# Basic example with two subsequences
assert solution.shortestWay("abc", "acdbc") == -1
# Impossible because 'd' does not exist in source
assert solution.shortestWay("xyz", "xzyxz") == 3
# Multiple resets required
assert solution.shortestWay("a", "aaaaa") == 5
# Single character repeated many times
assert solution.shortestWay("abc", "abc") == 1
# Entire target formed in one pass
assert solution.shortestWay("abc", "cba") == 3
# Order mismatch forces separate passes
assert solution.shortestWay("aaaa", "aaaaaaaa") == 2
# Repeated identical characters
assert solution.shortestWay("abc", "d") == -1
# Target character missing entirely
assert solution.shortestWay("abcde", "ace") == 1
# Standard subsequence case
assert solution.shortestWay("abab", "ababaab") == 2
# Repeated pattern matching
assert solution.shortestWay("z", "zzzzzz") == 6
# Smallest source size
assert solution.shortestWay("abc", "cccccc") == 6
# Only one usable character per pass
| Test | Why |
|---|---|
"abc", "abcbc" |
Standard example |
"abc", "acdbc" |
Impossible target |
"xyz", "xzyxz" |
Multiple subsequences required |
"a", "aaaaa" |
Single character repetition |
"abc", "abc" |
One-pass construction |
"abc", "cba" |
Reverse ordering challenge |
"aaaa", "aaaaaaaa" |
Duplicate character handling |
"abc", "d" |
Missing character detection |
"abcde", "ace" |
Simple subsequence |
"abab", "ababaab" |
Repeated pattern matching |
"z", "zzzzzz" |
Minimum source length |
"abc", "cccccc" |
Only one match per scan |
Edge Cases
One important edge case occurs when target contains a character that does not exist in source. For example:
source = "abc"
target = "abd"
A naive implementation might enter an infinite loop repeatedly scanning source without making progress. This implementation avoids that bug by storing previous_index before each pass and verifying that at least one target character was matched.
Another important case is when the order of characters forces many subsequences. For example:
source = "abc"
target = "cba"
Although all characters exist in source, their required order in target conflicts with the order in source. The algorithm correctly restarts scanning whenever necessary and produces 3.
A third tricky case involves repeated characters. Consider:
source = "a"
target = "aaaaaa"
Each pass through source can match only one character. Some implementations incorrectly assume one pass can reuse characters multiple times. This solution correctly models each subsequence as a single left-to-right traversal of source, so it returns 6.
Another subtle edge case occurs when the entire target can already be formed in one pass. For example:
source = "abcde"
target = "ace"
The algorithm greedily matches all possible characters during the first traversal and correctly returns 1 without unnecessary extra passes.