LeetCode 2207 - Maximize Number of Subsequences in a String

This problem asks us to maximize the number of times a two-character string pattern occurs as a subsequence in a given string text, after inserting exactly one character.

LeetCode Problem 2207

Difficulty: 🟡 Medium
Topics: String, Greedy, Prefix Sum

Solution

Problem Understanding

This problem asks us to maximize the number of times a two-character string pattern occurs as a subsequence in a given string text, after inserting exactly one character. The character we can insert is restricted to being either pattern[0] or pattern[1], and it can be inserted at any position, including the beginning or end of text.

A subsequence is formed by deleting zero or more characters without changing the order of the remaining characters. Thus, if pattern = "ac" and text = "abdcdbc", the subsequence "ac" occurs multiple times, and inserting one 'a' or 'c' can increase the total count.

The constraints indicate that text can be up to 10^5 characters long, so any solution with quadratic complexity is too slow. The pattern is always length 2, which simplifies the problem because we only need to consider two types of insertions.

Important edge cases include when text already has all identical characters, when pattern[0] equals pattern[1], and when text is very short (length 1), since insertion choices differ in these scenarios.

Approaches

The brute-force approach would be to try inserting pattern[0] or pattern[1] at every possible position in text and count subsequences in each modified string. Counting subsequences can be done via a nested loop or dynamic programming. While this would eventually give the correct answer, it is too slow, with complexity O(n^2) for insertion positions multiplied by O(n) for counting, resulting in O(n^3) - unacceptable for n = 10^5.

The key insight for the optimal approach is that to maximize subsequences pattern[0]pattern[1], we only need to know the number of pattern[0] before each position and the number of pattern[1] after each position. Since we can only insert one character, the best option is either inserting pattern[0] at the beginning (so it pairs with all pattern[1] after) or pattern[1] at the end (so it pairs with all pattern[0] before). We count all existing subsequences efficiently in one pass and then consider the optimal insertion.

Approach Time Complexity Space Complexity Notes
Brute Force O(n^3) O(n) Insert at all positions and count subsequences
Optimal O(n) O(1) One pass to count subsequences and use prefix/postfix counts to determine best insertion

Algorithm Walkthrough

  1. Initialize a counter count_pattern0 to 0 to track the number of pattern[0] seen so far.

  2. Initialize current_subseq_count to 0 to count subsequences pattern[0]pattern[1].

  3. Iterate through each character ch in text:

  4. If ch == pattern[0], increment count_pattern0.

  5. If ch == pattern[1], add count_pattern0 to current_subseq_count because each previous pattern[0] can pair with this pattern[1].

  6. After traversing text, the existing subsequence count is current_subseq_count.

  7. Count total occurrences of pattern[0] as total_pattern0 and pattern[1] as total_pattern1.

  8. Compute two scenarios for inserting one character:

  9. Insert pattern[0] at the start: total subsequences = current_subseq_count + total_pattern1.

  10. Insert pattern[1] at the end: total subsequences = current_subseq_count + total_pattern0.

  11. Return the maximum of the two values.

Why it works: The invariant is that for a given pattern[0]pattern[1], each pattern[0] can pair with all pattern[1] after it. Inserting one pattern[0] at the beginning pairs it with all existing pattern[1], and inserting one pattern[1] at the end pairs it with all existing pattern[0]. No other position can produce more subsequences than these two.

Python Solution

class Solution:
    def maximumSubsequenceCount(self, text: str, pattern: str) -> int:
        count_pattern0 = 0
        subseq_count = 0
        for ch in text:
            if ch == pattern[0]:
                count_pattern0 += 1
            elif ch == pattern[1]:
                subseq_count += count_pattern0
        total_pattern0 = text.count(pattern[0])
        total_pattern1 = text.count(pattern[1])
        return subseq_count + max(total_pattern0, total_pattern1)

In this implementation, we first count subsequences in a single pass. count_pattern0 tracks how many pattern[0] characters have appeared so far, and whenever we see pattern[1], it forms a subsequence with all pattern[0] counted. We then compute the effect of inserting pattern[0] at the start or pattern[1] at the end and return the maximum.

Go Solution

func maximumSubsequenceCount(text string, pattern string) int64 {
    var countPattern0, subseqCount int64
    for _, ch := range text {
        if ch == rune(pattern[0]) {
            countPattern0++
        } else if ch == rune(pattern[1]) {
            subseqCount += countPattern0
        }
    }
    var totalPattern0, totalPattern1 int64
    for _, ch := range text {
        if ch == rune(pattern[0]) {
            totalPattern0++
        } else if ch == rune(pattern[1]) {
            totalPattern1++
        }
    }
    if totalPattern0 > totalPattern1 {
        return subseqCount + totalPattern0
    }
    return subseqCount + totalPattern1
}

In Go, we explicitly use int64 to prevent integer overflow for large inputs. Characters are converted to rune for comparison. The logic mirrors the Python solution.

Worked Examples

Example 1: text = "abdcdbc", pattern = "ac"

Step Character count_pattern0 subseq_count
1 a 1 0
2 b 1 0
3 d 1 0
4 c 1 1
5 d 1 1
6 b 1 1
7 c 1 2

Total pattern[0] = a = 1, pattern[1] = c = 2. Max insertion adds 2, final = 4.

Example 2: text = "aabb", pattern = "ab"

Step Character count_pattern0 subseq_count
1 a 1 0
2 a 2 0
3 b 2 2
4 b 2 4

Total pattern[0] = 2, pattern[1] = 2. Max insertion adds 2, final = 6.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Single pass to count subsequences and two counts, n = length of text
Space O(1) Only counters used, no extra data structures

The solution is linear and efficient for the constraint n <= 10^5.

Test Cases

# Provided examples
assert Solution().maximumSubsequenceCount("abdcdbc", "ac") == 4  # Example 1
assert Solution().maximumSubsequenceCount("aabb", "ab") == 6     # Example 2

# Edge cases
assert Solution().maximumSubsequenceCount("a", "aa") == 1        # Single character text
assert Solution().maximumSubsequenceCount("b", "ab") == 1        # Single character, insert at start
assert Solution().maximumSubsequenceCount("aaaa", "aa") == 10    # pattern[0] == pattern[1], choose insertion to maximize pairs
assert Solution().maximumSubsequenceCount("ababab", "ab") == 10  # Multiple subsequences, insertion at end or start
assert Solution().maximumSubsequenceCount("cccc", "ac") == 4     # Only pattern[1] exists, insert pattern[0] at start
assert Solution().maximumSubsequenceCount("abcd", "da") == 1     # Reverse pattern, insertion at start or end
Test Why
"abdcdbc", "ac" Example demonstrating optimal insertion
"aabb", "ab" Example demonstrating multiple subsequences
"a", "aa" Smallest input, insertion required