LeetCode 2486 - Append Characters to String to Make Subsequence

The problem asks us to determine the minimum number of characters that need to be appended to the end of string s so that string t becomes a subsequence of s. A subsequence is a sequence that appears in the same order as in another string, but not necessarily consecutively.

LeetCode Problem 2486

Difficulty: 🟡 Medium
Topics: Two Pointers, String, Greedy

Solution

Problem Understanding

The problem asks us to determine the minimum number of characters that need to be appended to the end of string s so that string t becomes a subsequence of s. A subsequence is a sequence that appears in the same order as in another string, but not necessarily consecutively. Essentially, we are allowed to extend s by adding characters at its end, and our goal is to find the smallest number of characters to append such that every character in t can appear in order in the extended s.

The input consists of two strings s and t, both containing only lowercase English letters. The output is an integer representing the number of characters to append to s.

The constraints indicate that both s and t can be quite large, up to 10^5 characters, which means any solution with a time complexity worse than O(n + m) (where n and m are the lengths of s and t) is unlikely to be efficient enough.

Important edge cases include when t is already a subsequence of s, when s is empty or very short, and when t contains characters that do not appear at all in s.

Approaches

A brute-force approach would attempt to generate all possible extensions of s by appending characters from t and checking if t becomes a subsequence. This is clearly infeasible because the number of combinations grows exponentially with the length of t.

The optimal approach leverages the fact that we only need to check how many characters of t are already a subsequence in s. By iterating over both strings using a two-pointer technique, we can determine the longest prefix of t that is already a subsequence of s. The remaining characters in t after this prefix are precisely the characters that must be appended. This solution is linear in the length of both strings.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^m * n) O(n + m) Generate all extensions and check subsequence, infeasible
Optimal O(n + m) O(1) Two-pointer approach to find longest subsequence prefix

Algorithm Walkthrough

  1. Initialize two pointers, i for s and j for t, both starting at 0.
  2. Iterate over s using pointer i. For each character s[i], check if it matches the current character t[j].
  3. If s[i] == t[j], increment j to move to the next character in t. This tracks how many characters of t have been matched as a subsequence.
  4. Increment i in every iteration to move through s.
  5. After iterating through s, j will indicate the number of characters in t that are already matched in s.
  6. The minimum number of characters to append is the remaining length of t after j, i.e., len(t) - j.

Why it works: The two-pointer method ensures we check each character of s exactly once while maintaining the subsequence order for t. By the end of the traversal, any unmatched portion of t must be appended to s to make t a subsequence.

Python Solution

class Solution:
    def appendCharacters(self, s: str, t: str) -> int:
        i, j = 0, 0
        while i < len(s) and j < len(t):
            if s[i] == t[j]:
                j += 1
            i += 1
        return len(t) - j

The code initializes two pointers i and j to traverse s and t respectively. Whenever a matching character is found, j is incremented to track the number of matched characters in t. After the loop, len(t) - j gives the number of characters that must be appended to s to make t a subsequence.

Go Solution

func appendCharacters(s string, t string) int {
    i, j := 0, 0
    for i < len(s) && j < len(t) {
        if s[i] == t[j] {
            j++
        }
        i++
    }
    return len(t) - j
}

The Go implementation mirrors the Python version. Pointers i and j traverse s and t. The loop continues until the end of either string, incrementing j for every match. Finally, the remaining unmatched portion of t is returned as the answer.

Worked Examples

Example 1: s = "coaching", t = "coding"

i s[i] j t[j] Action
0 c 0 c Match, j = 1
1 o 1 o Match, j = 2
2 a 2 d No match
3 c 2 d No match
4 h 2 d No match
5 i 2 d No match
6 n 2 d No match
7 g 2 d No match

After s is fully traversed, j = 2. Remaining characters to append = len(t) - j = 6 - 2 = 4.

Example 2: s = "abcde", t = "a"

Pointer j moves immediately as s[0] == t[0]. j = 1, remaining characters = 0.

Example 3: s = "z", t = "abcde"

No matches occur. j = 0, remaining characters = 5.

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Each character of s and t is visited at most once with two pointers.
Space O(1) Only two integer pointers are used, no additional data structures.

The algorithm efficiently handles large strings due to its linear scan and constant space usage.

Test Cases

# Provided examples
assert Solution().appendCharacters("coaching", "coding") == 4  # Subsequence needs 4 characters appended
assert Solution().appendCharacters("abcde", "a") == 0          # Already a subsequence
assert Solution().appendCharacters("z", "abcde") == 5          # Entire t needs appending

# Edge and additional cases
assert Solution().appendCharacters("", "abc") == 3             # Empty s, append entire t
assert Solution().appendCharacters("abc", "") == 0             # Empty t, nothing to append
assert Solution().appendCharacters("abc", "abc") == 0          # t is exactly s
assert Solution().appendCharacters("abc", "abd") == 1          # Only last character missing
assert Solution().appendCharacters("xyz", "xyzabc") == 3       # Append 'abc' to match
Test Why
"coaching", "coding" Validates partial subsequence match
"abcde", "a" Validates when t is already subsequence
"z", "abcde" Validates when no match exists
"", "abc" Validates empty s
"abc", "" Validates empty t
"abc", "abc" Exact match scenario
"abc", "abd" Last character missing
"xyz", "xyzabc" Need to append multiple characters after full prefix match

Edge Cases

  1. Empty s: If s is empty, the solution must append the entire string t. The two-pointer approach correctly handles this because the loop never increments j, so len(t) - j = len(t).
  2. Empty t: If t is empty, no characters need to be appended. The loop immediately exits, and len(t) - j = 0.
  3. t already a subsequence of s: If t is fully present in s in order, the algorithm correctly identifies that j reaches len(t) during iteration, yielding len(t) - j = 0, so no appends are needed.
  4. No characters of t in s: If s shares no characters with t, j remains 0 throughout, and the algorithm correctly returns len(t) as the number of characters to append. This prevents incorrect assumptions about partial matches.

This approach is robust and covers all edge cases within the linear time and constant space constraints.