LeetCode 1055 - Shortest Way to Form String

The problem asks us to construct the string target using the fewest possible subsequences of the string source. A subsequence preserves relative order, but characters do not need to be contiguous.

LeetCode Problem 1055

Difficulty: 🟡 Medium
Topics: Two Pointers, String, Binary Search, Greedy

Solution

Problem Understanding

The problem asks us to construct the string target using the fewest possible subsequences of the string source.

A subsequence preserves relative order, but characters do not need to be contiguous. For example, "ace" is a subsequence of "abcde" because we can delete "b" and "d" while keeping the order of the remaining characters.

We are allowed to reuse source multiple times. Each time we reuse it, we may select any subsequence from it. The goal is to determine the minimum number of such subsequences whose concatenation forms the entire target.

For example:

  • source = "abc"
  • target = "abcbc"

We can form:

  • "abc" from the first pass through source
  • "bc" from the second pass

So the answer is 2.

The important detail is that each subsequence must respect the order of characters in source. We cannot rearrange characters.

The constraints are relatively small:

  • 1 <= source.length, target.length <= 1000
  • lowercase English letters only

This means even an O(n * m) solution is perfectly acceptable, where:

  • n = len(source)
  • m = len(target)

However, exponential or heavily recursive brute force solutions would still be impractical.

Several edge cases are important:

  • A character exists in target but not in source, in which case the answer is immediately -1.
  • target may require many resets of the source pointer.
  • source may contain repeated characters.
  • The optimal solution is greedy, but only if implemented carefully.

The key challenge is determining when we must start a new subsequence and how to maximize the amount of target matched during each pass through source.

Approaches

Brute Force Approach

A brute force solution would attempt to generate all possible subsequences of source and repeatedly try combinations of them to form target.

For a string of length n, there are 2^n possible subsequences. Even with n = 1000, this is astronomically large and completely infeasible.

Another slightly better brute force idea is:

  1. Start from the beginning of target
  2. Scan through source
  3. Match as many characters as possible
  4. Restart scanning source
  5. Continue until target is fully matched

This already resembles the optimal greedy approach, and importantly, it avoids generating subsequences explicitly.

The reason this greedy method works is that during each pass through source, there is never a benefit to skipping a character that could help match the current target character. Since we want the minimum number of subsequences, each pass should consume as much of target as possible.

Key Insight

The core observation is:

During one traversal of source, we should greedily match as many characters from target as possible.

If we reach the end of source before fully matching target, we must start another subsequence.

This works because:

  • Every subsequence corresponds to one pass through source
  • Maximizing matches per pass minimizes the total number of passes

We can implement this efficiently with two pointers:

  • One pointer for source
  • One pointer for target

Whenever a character matches, we advance the target pointer.

If an entire pass through source fails to match even one new target character, then the current target character does not exist in source, and the answer is -1.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n) O(2^n) Generates subsequences explicitly
Optimal Greedy Two Pointers O(n * m) O(1) Greedily consumes target during each pass

Algorithm Walkthrough

Optimal Greedy Two Pointer Algorithm

  1. Initialize a pointer target_index to 0.

This pointer tracks how many characters of target we have already constructed. 2. Initialize a counter subsequence_count to 0.

Every full pass through source represents using one subsequence. 3. While target_index is still inside target, begin a new pass through source.

Increment subsequence_count because we are starting another subsequence. 4. Store the current value of target_index before scanning source.

This helps detect whether we made any progress during this pass. 5. Traverse source from left to right.

For each character:

  • If it matches target[target_index], advance target_index
  • Continue scanning to greedily consume as much of target as possible
  1. After finishing the pass through source, check whether target_index changed.

If it did not change, then no character from the current target position exists in source.

In that case, return -1. 7. Repeat until target_index == len(target).

At this point, all target characters have been formed. 8. Return subsequence_count.

Why it works

The greedy strategy is optimal because every subsequence corresponds to one complete left-to-right traversal of source. During a traversal, matching fewer characters can never help reduce the total number of subsequences later. Therefore, greedily consuming as many target characters as possible during each pass always minimizes the number of passes required.

The algorithm also correctly detects impossible cases because if an entire scan through source fails to advance the target pointer, then the needed character does not exist anywhere in source.

Python Solution

class Solution:
    def shortestWay(self, source: str, target: str) -> int:
        target_index = 0
        subsequence_count = 0

        while target_index < len(target):
            previous_index = target_index

            for char in source:
                if target_index < len(target) and char == target[target_index]:
                    target_index += 1

            if previous_index == target_index:
                return -1

            subsequence_count += 1

        return subsequence_count

The implementation directly follows the greedy two pointer strategy.

The variable target_index tracks the current character in target that still needs to be matched.

The outer while loop represents repeatedly using subsequences of source. Every iteration corresponds to one complete pass through source.

Inside the loop, we store previous_index before scanning source. This allows us to detect whether any progress was made.

The for char in source loop greedily matches characters in order. Whenever the current source character matches the current target character, we advance target_index.

After the scan:

  • If target_index did not change, the current target character does not exist in source, so we return -1
  • Otherwise, we successfully formed part of the target, so we increment subsequence_count

Once target_index reaches the end of target, the entire target string has been constructed.

Go Solution

func shortestWay(source string, target string) int {
    targetIndex := 0
    subsequenceCount := 0

    for targetIndex < len(target) {
        previousIndex := targetIndex

        for i := 0; i < len(source); i++ {
            if targetIndex < len(target) &&
                source[i] == target[targetIndex] {
                targetIndex++
            }
        }

        if previousIndex == targetIndex {
            return -1
        }

        subsequenceCount++
    }

    return subsequenceCount
}

The Go implementation mirrors the Python logic almost exactly.

Since Go strings are byte slices internally, indexing with source[i] and target[targetIndex] works efficiently because the input contains only lowercase English letters.

No special handling for Unicode is required.

Go also does not require explicit bounds checking inside the loop because we already guard access with:

targetIndex < len(target)

The space complexity remains constant because only a few integer variables are used.

Worked Examples

Example 1

source = "abc"
target = "abcbc"

Initial state:

Variable Value
target_index 0
subsequence_count 0

First Pass Through Source

Source Char Target Needed Match? target_index After
a a Yes 1
b b Yes 2
c c Yes 3

After first pass:

Variable Value
target_index 3
subsequence_count 1

Matched "abc".

Second Pass Through Source

Remaining target substring is "bc".

Source Char Target Needed Match? target_index After
a b No 3
b b Yes 4
c c Yes 5

Now target_index == len(target).

Answer:

2

Example 2

source = "abc"
target = "acdbc"

First Pass

Source Char Target Needed Match? target_index After
a a Yes 1
b c No 1
c c Yes 2

Remaining target substring is "dbc".

Second Pass

Source Char Target Needed Match?
a d No
b d No
c d No

No progress was made.

Therefore:

  • 'd' does not exist in source
  • return -1

Example 3

source = "xyz"
target = "xzyxz"

First Pass

Source Char Target Needed Match? target_index After
x x Yes 1
y z No 1
z z Yes 2

Matched "xz".

Second Pass

Remaining target substring is "yxz".

Source Char Target Needed Match? target_index After
x y No 2
y y Yes 3
z x No 3

Matched "y".

Third Pass

Remaining target substring is "xz".

Source Char Target Needed Match? target_index After
x x Yes 4
y z No 4
z z Yes 5

Answer:

3

Complexity Analysis

Measure Complexity Explanation
Time O(n * m) Each pass scans source, and at most m target characters are matched
Space O(1) Only a few pointer variables are used

Let:

  • n = len(source)
  • m = len(target)

In the worst case, each pass through source may match only one target character. Since each pass costs O(n) and there can be up to m passes, the total complexity becomes O(n * m).

The algorithm uses constant extra space because it stores only integer counters and pointers.

Test Cases

solution = Solution()

assert solution.shortestWay("abc", "abcbc") == 2
# Basic example with two subsequences

assert solution.shortestWay("abc", "acdbc") == -1
# Impossible because 'd' does not exist in source

assert solution.shortestWay("xyz", "xzyxz") == 3
# Multiple resets required

assert solution.shortestWay("a", "aaaaa") == 5
# Single character repeated many times

assert solution.shortestWay("abc", "abc") == 1
# Entire target formed in one pass

assert solution.shortestWay("abc", "cba") == 3
# Order mismatch forces separate passes

assert solution.shortestWay("aaaa", "aaaaaaaa") == 2
# Repeated identical characters

assert solution.shortestWay("abc", "d") == -1
# Target character missing entirely

assert solution.shortestWay("abcde", "ace") == 1
# Standard subsequence case

assert solution.shortestWay("abab", "ababaab") == 2
# Repeated pattern matching

assert solution.shortestWay("z", "zzzzzz") == 6
# Smallest source size

assert solution.shortestWay("abc", "cccccc") == 6
# Only one usable character per pass
Test Why
"abc", "abcbc" Standard example
"abc", "acdbc" Impossible target
"xyz", "xzyxz" Multiple subsequences required
"a", "aaaaa" Single character repetition
"abc", "abc" One-pass construction
"abc", "cba" Reverse ordering challenge
"aaaa", "aaaaaaaa" Duplicate character handling
"abc", "d" Missing character detection
"abcde", "ace" Simple subsequence
"abab", "ababaab" Repeated pattern matching
"z", "zzzzzz" Minimum source length
"abc", "cccccc" Only one match per scan

Edge Cases

One important edge case occurs when target contains a character that does not exist in source. For example:

source = "abc"
target = "abd"

A naive implementation might enter an infinite loop repeatedly scanning source without making progress. This implementation avoids that bug by storing previous_index before each pass and verifying that at least one target character was matched.

Another important case is when the order of characters forces many subsequences. For example:

source = "abc"
target = "cba"

Although all characters exist in source, their required order in target conflicts with the order in source. The algorithm correctly restarts scanning whenever necessary and produces 3.

A third tricky case involves repeated characters. Consider:

source = "a"
target = "aaaaaa"

Each pass through source can match only one character. Some implementations incorrectly assume one pass can reuse characters multiple times. This solution correctly models each subsequence as a single left-to-right traversal of source, so it returns 6.

Another subtle edge case occurs when the entire target can already be formed in one pass. For example:

source = "abcde"
target = "ace"

The algorithm greedily matches all possible characters during the first traversal and correctly returns 1 without unnecessary extra passes.