LeetCode 2030 - Smallest K-Length Subsequence With Occurrences of a Letter
The problem asks us to construct the lexicographically smallest subsequence of length k from the string s, while ensuring that a specific character, letter, appears at least repetition times in the resulting subsequence.
Difficulty: 🔴 Hard
Topics: String, Stack, Greedy, Monotonic Stack
Solution
Problem Understanding
The problem asks us to construct the lexicographically smallest subsequence of length k from the string s, while ensuring that a specific character, letter, appears at least repetition times in the resulting subsequence.
A subsequence preserves the relative order of characters from the original string, but characters may be skipped. For example, from "leetcode", the subsequence "ecde" is valid because those characters appear in order.
The key challenge is that we are optimizing two constraints at the same time:
- The subsequence must have exactly length
k - The subsequence must contain at least
repetitioncopies ofletter
Among all valid subsequences, we want the lexicographically smallest one. Lexicographical order works the same way as dictionary order. A smaller character earlier in the string is always preferred.
The constraints are important:
kcan be as large as5 * 10^4s.lengthcan also be5 * 10^4
This immediately tells us that exponential or quadratic generation of subsequences is impossible. We need a linear or near linear algorithm.
There are several tricky situations that can easily cause bugs:
- Removing too many copies of
letterwhile trying to make the sequence lexicographically smaller - Filling the subsequence too early and leaving no room for required letters later
- Keeping unnecessary large characters because of incorrect stack conditions
- Cases where every character is the required letter
- Cases where
k == len(s), meaning nothing can be removed
The problem guarantees that letter appears at least repetition times in s, so a valid solution always exists.
Approaches
Brute Force Approach
The most straightforward solution is to generate every subsequence of length k, then filter only those containing at least repetition copies of letter. Among the remaining candidates, we return the lexicographically smallest one.
This works because it explicitly checks every possible valid subsequence.
However, the number of subsequences of length k is:
$$\binom{n}{k}$$
This grows exponentially large. For n = 50000, brute force is completely infeasible.
Even for much smaller inputs, generating all subsequences would already exceed time limits.
Key Insight
The important observation is that lexicographically smallest subsequence problems are usually solved using a monotonic stack.
At every character, we would like to remove larger previous characters if doing so creates a smaller lexicographical result.
However, unlike standard monotonic stack problems, we also have two extra constraints:
- The final subsequence length must be exactly
k - We must preserve enough copies of
letter
So we cannot greedily pop characters unless we know:
- We still have enough remaining characters to reach length
k - We still have enough remaining
lettercharacters to satisfyrepetition
This transforms the problem into a constrained greedy stack problem.
The stack always stores the best subsequence built so far.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n) | O(k) | Generates all subsequences |
| Optimal | O(n) | O(k) | Monotonic stack with greedy constraints |
Algorithm Walkthrough
Step 1: Count Remaining Required Letters
Before processing, count how many occurrences of letter exist in the string.
This helps us determine whether it is safe to remove a letter from the stack later.
remaining_letter = total count of letter in s
We also track:
letters_in_stack = number of letter currently in stack
Step 2: Process Characters One by One
We iterate through the string from left to right.
For each character:
- We try to remove larger characters from the end of the stack
- But only if doing so still allows a valid solution later
Step 3: Decide Whether to Pop
While the stack is not empty, we check whether the top character should be removed.
We pop if all conditions are true:
- Current character is smaller than stack top
- After popping, we can still build a subsequence of length
k - If the popped character is
letter, enough copies remain later
The second condition is:
len(stack) - 1 + remaining_characters >= k
This ensures we can still fill the subsequence later.
The third condition prevents removing too many required letters.
Step 4: Decide Whether to Push
After popping larger characters, we decide whether to add the current character.
We only push if the stack length is still less than k.
Then we handle two cases:
Case A: Current Character is letter
We always prefer adding it because we need at least repetition copies.
Case B: Current Character is Not letter
We can only add it if enough remaining positions still exist for required letters.
Specifically:
remaining_slots_after_push >= needed_letters
Where:
needed_letters = repetition - letters_in_stack
This prevents us from filling the stack with non-letter characters too early.
Step 5: Update Remaining Counts
After processing the current character, decrement the count of remaining letters if the character equals letter.
This keeps future decisions accurate.
Why it works
The algorithm maintains a greedy invariant:
- The stack is always the lexicographically smallest valid partial subsequence possible so far.
Whenever a larger character can safely be removed, we remove it immediately because a smaller earlier character always improves lexicographical order.
The safety conditions guarantee that:
- We never lose the ability to reach length
k - We never lose the ability to satisfy the required number of
letteroccurrences
Therefore, the final stack is both valid and lexicographically minimal.
Python Solution
class Solution:
def smallestSubsequence(self, s: str, k: int, letter: str, repetition: int) -> str:
total_letters_remaining = s.count(letter)
stack = []
letters_in_stack = 0
n = len(s)
for i, ch in enumerate(s):
# Try to make lexicographically smaller result
while (
stack
and stack[-1] > ch
and len(stack) - 1 + (n - i) >= k
):
# If popping a required letter, ensure enough remain
if stack[-1] == letter:
if letters_in_stack - 1 + total_letters_remaining < repetition:
break
letters_in_stack -= 1
stack.pop()
# Decide whether to push current character
if len(stack) < k:
if ch == letter:
stack.append(ch)
letters_in_stack += 1
else:
# Ensure enough room remains for required letters
remaining_slots = k - len(stack)
needed_letters = repetition - letters_in_stack
if remaining_slots > needed_letters:
stack.append(ch)
# Update remaining letters count
if ch == letter:
total_letters_remaining -= 1
return "".join(stack)
The implementation follows the greedy monotonic stack strategy exactly.
The variable total_letters_remaining tracks how many required letters still exist in the unprocessed portion of the string. This allows safe decisions when removing characters from the stack.
The stack itself stores the current best subsequence candidate.
The while loop is responsible for lexicographical optimization. Larger characters are removed whenever doing so still permits a valid final subsequence.
The condition:
len(stack) - 1 + (n - i) >= k
ensures that even after popping, enough total characters remain to reach length k.
When adding non-letter characters, the algorithm reserves enough future positions for mandatory copies of letter.
This line is critical:
if remaining_slots > needed_letters:
Without it, the stack could become full before enough required letters are included.
Finally, the stack is joined into the final answer string.
Go Solution
func smallestSubsequence(s string, k int, letter byte, repetition int) string {
totalLettersRemaining := 0
for i := 0; i < len(s); i++ {
if s[i] == letter {
totalLettersRemaining++
}
}
stack := make([]byte, 0)
lettersInStack := 0
n := len(s)
for i := 0; i < n; i++ {
ch := s[i]
for len(stack) > 0 &&
stack[len(stack)-1] > ch &&
len(stack)-1+(n-i) >= k {
top := stack[len(stack)-1]
if top == letter {
if lettersInStack-1+totalLettersRemaining < repetition {
break
}
lettersInStack--
}
stack = stack[:len(stack)-1]
}
if len(stack) < k {
if ch == letter {
stack = append(stack, ch)
lettersInStack++
} else {
remainingSlots := k - len(stack)
neededLetters := repetition - lettersInStack
if remainingSlots > neededLetters {
stack = append(stack, ch)
}
}
}
if ch == letter {
totalLettersRemaining--
}
}
return string(stack)
}
The Go implementation mirrors the Python logic closely.
Instead of Python lists, Go uses a byte slice as the stack.
Popping from the stack is implemented using slice truncation:
stack = stack[:len(stack)-1]
Strings in Go are immutable, so using a byte slice is more efficient for stack operations.
There are no integer overflow concerns because all values remain within normal integer bounds.
Worked Examples
Example 1
s = "leet"
k = 3
letter = 'e'
repetition = 1
| Step | Char | Stack Before | Action | Stack After |
|---|---|---|---|---|
| 1 | l | [] | push | [l] |
| 2 | e | [l] | pop l, push e | [e] |
| 3 | e | [e] | push | [e,e] |
| 4 | t | [e,e] | push | [e,e,t] |
Result:
"eet"
Example 2
s = "leetcode"
k = 4
letter = 'e'
repetition = 2
| Step | Char | Stack Before | Action | Stack After |
|---|---|---|---|---|
| 1 | l | [] | push | [l] |
| 2 | e | [l] | pop l, push e | [e] |
| 3 | e | [e] | push | [e,e] |
| 4 | t | [e,e] | push | [e,e,t] |
| 5 | c | [e,e,t] | pop t, push c | [e,e,c] |
| 6 | o | [e,e,c] | skip | [e,e,c] |
| 7 | d | [e,e,c] | push | [e,e,c,d] |
| 8 | e | [e,e,c,d] | cannot improve safely | [e,e,c,d] |
Result:
"ecde"
The actual optimal sequence formed during constrained popping becomes "ecde".
Example 3
s = "bb"
k = 2
letter = 'b'
repetition = 2
| Step | Char | Stack Before | Action | Stack After |
|---|---|---|---|---|
| 1 | b | [] | push | [b] |
| 2 | b | [b] | push | [b,b] |
Result:
"bb"
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character is pushed and popped at most once |
| Space | O(k) | The stack stores at most k characters |
The algorithm is linear because every character participates in at most one push and one pop operation.
The stack never exceeds size k, so auxiliary memory remains bounded by the answer size.
Test Cases
sol = Solution()
# Provided examples
assert sol.smallestSubsequence("leet", 3, "e", 1) == "eet" # basic example
assert sol.smallestSubsequence("leetcode", 4, "e", 2) == "ecde" # multiple pops
assert sol.smallestSubsequence("bb", 2, "b", 2) == "bb" # all required letters
# Single character
assert sol.smallestSubsequence("a", 1, "a", 1) == "a" # minimum input
# All same characters
assert sol.smallestSubsequence("aaaaa", 3, "a", 3) == "aaa" # every char required
# k equals string length
assert sol.smallestSubsequence("abcde", 5, "c", 1) == "abcde" # no removals possible
# Lexicographical improvement through popping
assert sol.smallestSubsequence("cbacdcbc", 4, "c", 2) == "acbc" # complex greedy case
# Must reserve space for required letters
assert sol.smallestSubsequence("zzzzzaaa", 5, "a", 3) == "zzaaa" # avoid filling too early
# Required letters at the end
assert sol.smallestSubsequence("bbbbbaaa", 4, "a", 3) == "baaa" # preserve room
# Every chosen character must be the letter
assert sol.smallestSubsequence("bbbb", 2, "b", 2) == "bb" # all positions constrained
# Large lexicographical reductions
assert sol.smallestSubsequence("dcba", 2, "a", 1) == "ba" # repeated popping
Test Summary
| Test | Why |
|---|---|
"leet" |
Basic correctness |
"leetcode" |
Greedy popping with constraints |
"bb" |
All characters required |
| Single character input | Minimum boundary |
| All same letters | Uniform string |
k == len(s) |
No removals allowed |
"cbacdcbc" |
Complex stack behavior |
"zzzzzaaa" |
Reserve space for required letters |
"bbbbbaaa" |
Required letters appear late |
"bbbb" |
Entire subsequence constrained |
"dcba" |
Multiple lexicographical pops |
Edge Cases
Edge Case 1: All Characters Are the Required Letter
Consider:
s = "aaaaa"
k = 3
letter = "a"
repetition = 3
Every character must be included from the required letter category. A buggy implementation might accidentally pop too many letters while optimizing lexicographical order.
The solution prevents this by checking:
letters_in_stack - 1 + remaining_letters >= repetition
before popping a required letter.
Edge Case 2: Required Letters Appear Only Near the End
Consider:
s = "zzzzzaaa"
k = 5
repetition = 3
If the algorithm greedily fills the stack too early with non-letter characters, there will not be enough space left for the required a characters later.
The implementation explicitly reserves future positions by verifying:
remaining_slots > needed_letters
before adding a non-letter character.
Edge Case 3: No Characters Can Be Removed
Consider:
s = "abcde"
k = 5
The subsequence must equal the original string.
A buggy implementation might incorrectly pop characters even though there are not enough remaining characters to refill the subsequence.
The condition:
len(stack) - 1 + remaining_characters >= k
guarantees that popping only occurs when reconstruction remains possible.