LeetCode 2014 - Longest Subsequence Repeated k Times
The problem asks us to find the longest subsequence of a string s that can be repeated k times while still being a subsequence of s. A subsequence is derived by deleting zero or more characters from a string without changing the order of the remaining characters.
Difficulty: 🔴 Hard
Topics: Hash Table, Two Pointers, String, Backtracking, Counting, Enumeration
Solution
Problem Understanding
The problem asks us to find the longest subsequence of a string s that can be repeated k times while still being a subsequence of s. A subsequence is derived by deleting zero or more characters from a string without changing the order of the remaining characters. A subsequence seq repeated k times means that seq concatenated k times (seq * k) should itself be a subsequence of s.
The input consists of a string s with length n and an integer k (between 2 and 2000). The constraints tell us that n is relatively small, less than min(2001, k * 8). This hints that a combinatorial or backtracking approach could be feasible if carefully pruned.
The expected output is a string that is the longest subsequence repeated k times, or the lexicographically largest one if multiple sequences of the same length exist. If no such subsequence exists, we return an empty string.
Edge cases include:
- Strings with all identical characters.
- Strings where
kis larger than the number of repetitions of any character. - Strings where no subsequence can appear
ktimes. - Lexicographical ordering matters when multiple longest subsequences exist.
Approaches
Brute Force
The brute-force approach would enumerate all possible subsequences of s and check which can be repeated k times as a subsequence. This guarantees correctness, but the number of subsequences is 2^n, which is exponentially large, and each check for k repetitions could take O(n) time. Therefore, this approach is not feasible for the upper limit of n = 2000.
Optimal Approach
The key insight is that we can use backtracking combined with pruning using character frequency counts.
- Only characters appearing at least
ktimes are eligible to form part of the repeated subsequence. - We can construct candidate sequences using these eligible characters, exploring longer sequences first to ensure we find the longest one.
- We can generate candidate sequences in lexicographical descending order, ensuring that the first valid sequence we find is also the lexicographically largest.
- To efficiently check if a candidate sequence repeated
ktimes is a subsequence ofs, we can simulate two pointers onsand the candidate repeatedktimes.
This reduces the search space significantly because we ignore characters that cannot possibly satisfy the k repetition requirement.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n * n) | O(n) | Enumerates all subsequences and checks each for k repetitions |
| Optimal | O(26^(len(seq)) * n * k) | O(n * k) | Backtracking with pruning based on character counts and lexicographical ordering |
Algorithm Walkthrough
- Count the frequency of each character in
s. - Filter characters whose frequency is at least
k. These characters are eligible to appear in the repeated subsequence. - Sort eligible characters in descending order to prioritize lexicographically larger sequences.
- Perform a DFS/backtracking search to build candidate sequences:
- At each step, try appending an eligible character to the current sequence.
- Before recursing, check if the new candidate repeated
ktimes is a subsequence ofs. - If yes, continue recursion; if no, prune that branch.
- Track the longest valid sequence found.
- Return the longest sequence at the end.
Why it works: The pruning step ensures that we only explore sequences that can potentially appear k times. Sorting characters in descending order guarantees that the first valid sequence of a given length is lexicographically the largest. This approach efficiently narrows the combinatorial explosion of subsequences.
Python Solution
class Solution:
def longestSubsequenceRepeatedK(self, s: str, k: int) -> str:
from collections import Counter
def is_k_subsequence(candidate: str) -> bool:
i = 0 # Pointer for s
count = 0 # How many times candidate has been found
for char in s:
if char == candidate[i]:
i += 1
if i == len(candidate):
i = 0
count += 1
if count == k:
return True
return False
freq = Counter(s)
# Only keep characters that appear at least k times
chars = [c for c in freq if freq[c] >= k]
chars.sort(reverse=True) # Lexicographically descending
result = ""
def dfs(path: str):
nonlocal result
if len(path) > len(result):
if is_k_subsequence(path):
result = path
for c in chars:
candidate = path + c
if is_k_subsequence(candidate):
dfs(candidate)
dfs("")
return result
Implementation Notes: We first filter out characters that cannot appear k times, reducing the search space. The is_k_subsequence function checks whether the candidate repeated k times is a subsequence of s. DFS ensures exploration of all viable sequences, and lexicographical sorting guarantees we prioritize larger sequences.
Go Solution
func longestSubsequenceRepeatedK(s string, k int) string {
freq := make(map[byte]int)
for i := 0; i < len(s); i++ {
freq[s[i]]++
}
chars := []byte{}
for c, f := range freq {
if f >= k {
chars = append(chars, c)
}
}
sort.Slice(chars, func(i, j int) bool { return chars[i] > chars[j] })
var result string
var isKSubsequence func(candidate string) bool
isKSubsequence = func(candidate string) bool {
i := 0
count := 0
for j := 0; j < len(s); j++ {
if s[j] == candidate[i] {
i++
if i == len(candidate) {
i = 0
count++
if count == k {
return true
}
}
}
}
return false
}
var dfs func(path string)
dfs = func(path string) {
if len(path) > len(result) && isKSubsequence(path) {
result = path
}
for _, c := range chars {
candidate := path + string(c)
if isKSubsequence(candidate) {
dfs(candidate)
}
}
}
dfs("")
return result
}
Go-specific Notes: Go requires explicit conversion between byte and string when concatenating characters. The map structure handles frequency counting. Recursion works similarly, with careful handling of slice vs string concatenation.
Worked Examples
Example 1: s = "letsleetcode", k = 2
- Count frequencies:
{'l':2, 'e':4, 't':2, 's':1, 'c':1, 'o':1, 'd':1} - Eligible characters:
['l', 'e', 't'](frequency >= k) - DFS explores sequences like
l,e,t,le,lt,et,let,ete, ... letandeteare valid repeated subsequences.- Lexicographically largest:
"let"
Example 2: s = "bb", k = 2
- Frequencies:
{'b':2} - Eligible characters:
['b'] - DFS finds
"b"repeated twice is a subsequence. - Result:
"b"
Example 3: s = "ab", k = 2
- Frequencies:
{'a':1, 'b':1} - No character appears at least twice.
- Result:
""(empty string)
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(26^(len(seq)) * n * k) | DFS explores sequences of eligible characters. Each check takes O(n) to verify k repetitions. |
| Space | O(n * k) | Recursion stack and string copies for candidate sequences. |
The time complexity is reasonable because pruning and frequency filtering drastically reduce the search space, especially since n < 2000.
Test Cases
# Provided examples
assert Solution().longestSubsequenceRepeatedK("letsleetcode", 2) == "let" # longest repeated 2 times
assert Solution().longestSubsequenceRepeatedK("bb", 2) == "b" # repeated 2 times
assert Solution().longestSubsequenceRepeatedK("ab", 2) == "" # no subsequence
# Edge and boundary cases
assert Solution().longestSubsequenceRepeatedK("aaaaa", 2) == "aa" # repeated character
assert Solution().longestSubsequenceRepeatedK("abcabcabc", 3) == "abc" # whole sequence
assert Solution().longestSubsequenceRepeatedK("abcd", 2) == "" # no character repeated 2 times
assert Solution().longestSubsequenceRepeatedK("zxyxzyz", 2) == "zx" # lexicographically largest
| Test | Why |
|---|---|
| "lets |