LeetCode 392 - Is Subsequence

The problem asks whether string s can be formed from string t by deleting some characters from t without changing the order of the remaining characters. A subsequence does not require characters to be adjacent. The only requirement is that the relative ordering stays the same.

LeetCode Problem 392

Difficulty: 🟢 Easy
Topics: Two Pointers, String, Dynamic Programming

Solution

Problem Understanding

The problem asks whether string s can be formed from string t by deleting some characters from t without changing the order of the remaining characters.

A subsequence does not require characters to be adjacent. The only requirement is that the relative ordering stays the same. For example, "abc" is a subsequence of "ahbgdc" because we can pick characters 'a', 'b', and 'c' in order. However, "axc" is not a subsequence because the character 'x' does not exist in the correct order within t.

The input consists of two strings:

  • s, the candidate subsequence
  • t, the source string we are checking against

The expected output is:

  • true if every character in s appears in t in the same relative order
  • false otherwise

The constraints are small:

  • s.length <= 100
  • t.length <= 10^4

These limits strongly suggest that a linear scan solution is sufficient. Since s is very short, we can efficiently attempt to match its characters one by one while traversing t.

Several edge cases are important:

  • s is empty, an empty string is always a subsequence of any string
  • t is empty while s is not, matching becomes impossible
  • Repeated characters, such as "aaa" inside "aaab"
  • Characters appearing out of order, such as "aec" in "abcde"
  • s being longer than t, which automatically makes matching impossible

Understanding these cases upfront helps avoid incorrect assumptions in the implementation.

Approaches

Brute Force Approach

A brute force solution would try every possible subsequence of t and check whether any of them equals s.

To generate subsequences, for each character in t, we either include it or exclude it. Since each position has two choices, the total number of subsequences is 2^n, where n is the length of t.

After generating each subsequence, we compare it against s.

This approach is correct because it exhaustively explores every possible subsequence. If s exists as a subsequence, it will eventually be found.

However, this approach is computationally infeasible. Even for moderate string sizes, the number of subsequences becomes enormous. Since t can have length 10^4, generating all subsequences is impossible within reasonable time or memory limits.

Optimal Approach, Two Pointers

The key observation is that we do not need to generate subsequences explicitly.

Instead, we can scan through t once while tracking how many characters from s we have successfully matched.

We use two pointers:

  • One pointer for s
  • One pointer for t

As we move through t, whenever we find a character matching the current character in s, we advance the pointer in s.

If we reach the end of s, then every character has been matched in order, so s is a subsequence.

This works because subsequences only require relative ordering, not contiguity. A single left to right scan is sufficient to verify that ordering.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n * n) O(2^n) Generates every subsequence of t
Optimal O(n + m) O(1) Single pass using two pointers

Here:

  • n = len(t)
  • m = len(s)

Algorithm Walkthrough

Optimal Two Pointer Algorithm

  1. Initialize two pointers, i for string s and j for string t.

The pointer i tracks the next character we need to match in s. The pointer j scans through t. 2. Start traversing t from left to right.

We examine each character in t exactly once because subsequences depend only on ordering. 3. Compare s[i] with t[j].

If the characters match, we advance i because we successfully found the next required character from s. 4. Always advance j.

Whether characters match or not, we continue scanning through t because future characters may still help complete the subsequence. 5. Continue until either:

  • i == len(s), meaning every character in s has been matched
  • j == len(t), meaning we exhausted t
  1. Return whether i == len(s).

If true, the entire subsequence was matched in order.

Why it works

The algorithm maintains the invariant that all characters before index i in s have already been matched in correct order within the portion of t scanned so far.

Whenever a matching character is found, advancing i preserves this invariant. Since t is scanned left to right, the relative ordering of matched characters is guaranteed.

If i reaches the end of s, then every character has been matched in sequence, proving that s is a subsequence of t.

Python Solution

class Solution:
    def isSubsequence(self, s: str, t: str) -> bool:
        s_index = 0
        t_index = 0

        while s_index < len(s) and t_index < len(t):
            if s[s_index] == t[t_index]:
                s_index += 1

            t_index += 1

        return s_index == len(s)

The implementation directly follows the two pointer algorithm.

The variable s_index tracks the current character we are trying to match in s. The variable t_index scans through t.

Inside the loop, we compare the current characters:

if s[s_index] == t[t_index]:

When they match, we advance s_index because we successfully matched one more character from s.

Regardless of whether a match occurs, we always move t_index forward because we continue searching through t.

The loop stops when either:

  • All characters in s are matched
  • We reach the end of t

Finally, the return statement checks whether the entire subsequence was matched.

Go Solution

func isSubsequence(s string, t string) bool {
    sIndex := 0
    tIndex := 0

    for sIndex < len(s) && tIndex < len(t) {
        if s[sIndex] == t[tIndex] {
            sIndex++
        }

        tIndex++
    }

    return sIndex == len(s)
}

The Go implementation is almost identical to the Python version.

Strings in Go are byte slices internally. Since the problem guarantees lowercase English letters only, indexing directly with s[i] and t[j] is safe because each character occupies one byte.

No special handling for Unicode characters is necessary here.

Go does not distinguish between nil and empty strings in this context, so an empty string naturally works correctly with the pointer logic.

Worked Examples

Example 1

s = "abc"
t = "ahbgdc"

We trace the pointers step by step.

Step s_index t_index s[s_index] t[t_index] Match? Action
1 0 0 a a Yes Move both pointers
2 1 1 b h No Move t_index
3 1 2 b b Yes Move both pointers
4 2 3 c g No Move t_index
5 2 4 c d No Move t_index
6 2 5 c c Yes Move both pointers

At this point:

s_index == len(s)

So the result is true.

Example 2

s = "axc"
t = "ahbgdc"
Step s_index t_index s[s_index] t[t_index] Match? Action
1 0 0 a a Yes Move both pointers
2 1 1 x h No Move t_index
3 1 2 x b No Move t_index
4 1 3 x g No Move t_index
5 1 4 x d No Move t_index
6 1 5 x c No Move t_index

The scan ends without matching 'x'.

Since:

s_index != len(s)

the result is false.

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Each pointer moves at most once through its string
Space O(1) Only constant extra variables are used

The algorithm performs a single pass through both strings. Each character is examined at most once, making the runtime linear.

No additional data structures proportional to input size are allocated, so the space complexity remains constant.

Follow Up Discussion

The follow up asks what happens if there are many incoming queries for s, potentially billions of them, while t remains fixed.

In that scenario, repeatedly scanning all of t becomes inefficient.

A better strategy is preprocessing t.

We can build a mapping from each character to the sorted list of indices where that character appears in t.

For example:

t = "ahbgdc"

a -> [0]
h -> [1]
b -> [2]
g -> [3]
d -> [4]
c -> [5]

For each query string s, we use binary search to find the next valid occurrence position for each character.

This reduces each query from:

O(len(t))

to roughly:

O(len(s) * log(len(t)))

which is significantly faster when processing many subsequence checks.

Test Cases

solution = Solution()

assert solution.isSubsequence("abc", "ahbgdc") == True   # standard valid subsequence
assert solution.isSubsequence("axc", "ahbgdc") == False  # missing character
assert solution.isSubsequence("", "ahbgdc") == True      # empty s is always valid
assert solution.isSubsequence("", "") == True            # both empty
assert solution.isSubsequence("a", "") == False          # non-empty s with empty t
assert solution.isSubsequence("abc", "abc") == True      # exact same string
assert solution.isSubsequence("abc", "acb") == False     # wrong order
assert solution.isSubsequence("aaa", "aaab") == True     # repeated characters
assert solution.isSubsequence("aaaa", "aaab") == False   # insufficient repetitions
assert solution.isSubsequence("b", "c") == False         # single character mismatch
assert solution.isSubsequence("ace", "abcde") == True    # non-contiguous subsequence
assert solution.isSubsequence("aec", "abcde") == False   # ordering violation
assert solution.isSubsequence("longstring", "short") == False  # s longer than t
Test Why
"abc", "ahbgdc" Standard successful match
"axc", "ahbgdc" Missing required character
"", "ahbgdc" Empty subsequence case
"", "" Both strings empty
"a", "" Impossible match with empty source
"abc", "abc" Exact equality
"abc", "acb" Characters exist but wrong order
"aaa", "aaab" Repeated character handling
"aaaa", "aaab" Not enough repeated characters
"b", "c" Single character mismatch
"ace", "abcde" Valid non-contiguous subsequence
"aec", "abcde" Relative ordering violation
"longstring", "short" Candidate longer than source

Edge Cases

Empty Subsequence

An empty string is always a subsequence of any string because we can delete all characters and keep nothing.

This case can easily be mishandled if the implementation assumes at least one character exists. In this solution, the loop condition immediately fails when s is empty, and the final check:

s_index == len(s)

correctly returns True.

Source String is Empty

If t is empty but s is not, matching is impossible because there are no characters available to form the subsequence.

The implementation handles this naturally because the traversal loop never starts, leaving s_index unchanged. Since not all characters were matched, the function returns False.

Characters Exist but in Wrong Order

A common mistake is checking only whether all characters exist somewhere in t.

For example:

s = "aec"
t = "abcde"

All characters appear, but not in the required order.

The two pointer approach avoids this bug because it scans left to right and never moves backward. Once the scan passes 'e', it cannot later match 'c', correctly returning False.