LeetCode 2014 - Longest Subsequence Repeated k Times

LeetCode Problem 2014

Difficulty: 🔴 Hard
Topics: Hash Table, Two Pointers, String, Backtracking, Counting, Enumeration

Solution

Problem Understanding

The problem asks us to find the longest subsequence of a string s that can be repeated k times while still being a subsequence of s. A subsequence is derived by deleting zero or more characters from a string without changing the order of the remaining characters. A subsequence seq repeated k times means that seq concatenated k times (seq * k) should itself be a subsequence of s.

The input consists of a string s with length n and an integer k (between 2 and 2000). The constraints tell us that n is relatively small, less than min(2001, k * 8). This hints that a combinatorial or backtracking approach could be feasible if carefully pruned.

The expected output is a string that is the longest subsequence repeated k times, or the lexicographically largest one if multiple sequences of the same length exist. If no such subsequence exists, we return an empty string.

Edge cases include:

Strings with all identical characters.
Strings where k is larger than the number of repetitions of any character.
Strings where no subsequence can appear k times.
Lexicographical ordering matters when multiple longest subsequences exist.

Approaches

Brute Force

The brute-force approach would enumerate all possible subsequences of s and check which can be repeated k times as a subsequence. This guarantees correctness, but the number of subsequences is 2^n, which is exponentially large, and each check for k repetitions could take O(n) time. Therefore, this approach is not feasible for the upper limit of n = 2000.

Optimal Approach

The key insight is that we can use backtracking combined with pruning using character frequency counts.

Only characters appearing at least k times are eligible to form part of the repeated subsequence.
We can construct candidate sequences using these eligible characters, exploring longer sequences first to ensure we find the longest one.
We can generate candidate sequences in lexicographical descending order, ensuring that the first valid sequence we find is also the lexicographically largest.
To efficiently check if a candidate sequence repeated k times is a subsequence of s, we can simulate two pointers on s and the candidate repeated k times.

This reduces the search space significantly because we ignore characters that cannot possibly satisfy the k repetition requirement.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(2^n * n)	O(n)	Enumerates all subsequences and checks each for k repetitions
Optimal	O(26^(len(seq)) * n * k)	O(n * k)	Backtracking with pruning based on character counts and lexicographical ordering

Algorithm Walkthrough

Count the frequency of each character in s.
Filter characters whose frequency is at least k. These characters are eligible to appear in the repeated subsequence.
Sort eligible characters in descending order to prioritize lexicographically larger sequences.
Perform a DFS/backtracking search to build candidate sequences:

At each step, try appending an eligible character to the current sequence.
Before recursing, check if the new candidate repeated k times is a subsequence of s.
If yes, continue recursion; if no, prune that branch.

Track the longest valid sequence found.
Return the longest sequence at the end.

Why it works: The pruning step ensures that we only explore sequences that can potentially appear k times. Sorting characters in descending order guarantees that the first valid sequence of a given length is lexicographically the largest. This approach efficiently narrows the combinatorial explosion of subsequences.

Python Solution

class Solution:
    def longestSubsequenceRepeatedK(self, s: str, k: int) -> str:
        from collections import Counter

        def is_k_subsequence(candidate: str) -> bool:
            i = 0  # Pointer for s
            count = 0  # How many times candidate has been found
            for char in s:
                if char == candidate[i]:
                    i += 1
                    if i == len(candidate):
                        i = 0
                        count += 1
                        if count == k:
                            return True
            return False

        freq = Counter(s)
        # Only keep characters that appear at least k times
        chars = [c for c in freq if freq[c] >= k]
        chars.sort(reverse=True)  # Lexicographically descending

        result = ""

        def dfs(path: str):
            nonlocal result
            if len(path) > len(result):
                if is_k_subsequence(path):
                    result = path
            for c in chars:
                candidate = path + c
                if is_k_subsequence(candidate):
                    dfs(candidate)

        dfs("")
        return result

Implementation Notes: We first filter out characters that cannot appear k times, reducing the search space. The is_k_subsequence function checks whether the candidate repeated k times is a subsequence of s. DFS ensures exploration of all viable sequences, and lexicographical sorting guarantees we prioritize larger sequences.

Go Solution

func longestSubsequenceRepeatedK(s string, k int) string {
    freq := make(map[byte]int)
    for i := 0; i < len(s); i++ {
        freq[s[i]]++
    }

    chars := []byte{}
    for c, f := range freq {
        if f >= k {
            chars = append(chars, c)
        }
    }

    sort.Slice(chars, func(i, j int) bool { return chars[i] > chars[j] })

    var result string

    var isKSubsequence func(candidate string) bool
    isKSubsequence = func(candidate string) bool {
        i := 0
        count := 0
        for j := 0; j < len(s); j++ {
            if s[j] == candidate[i] {
                i++
                if i == len(candidate) {
                    i = 0
                    count++
                    if count == k {
                        return true
                    }
                }
            }
        }
        return false
    }

    var dfs func(path string)
    dfs = func(path string) {
        if len(path) > len(result) && isKSubsequence(path) {
            result = path
        }
        for _, c := range chars {
            candidate := path + string(c)
            if isKSubsequence(candidate) {
                dfs(candidate)
            }
        }
    }

    dfs("")
    return result
}

Go-specific Notes: Go requires explicit conversion between byte and string when concatenating characters. The map structure handles frequency counting. Recursion works similarly, with careful handling of slice vs string concatenation.

Worked Examples

Example 1: s = "letsleetcode", k = 2

Count frequencies: {'l':2, 'e':4, 't':2, 's':1, 'c':1, 'o':1, 'd':1}
Eligible characters: ['l', 'e', 't'] (frequency >= k)
DFS explores sequences like l, e, t, le, lt, et, let, ete, ...
let and ete are valid repeated subsequences.
Lexicographically largest: "let"

Example 2: s = "bb", k = 2

Frequencies: {'b':2}
Eligible characters: ['b']
DFS finds "b" repeated twice is a subsequence.
Result: "b"

Example 3: s = "ab", k = 2

Frequencies: {'a':1, 'b':1}
No character appears at least twice.
Result: "" (empty string)

Complexity Analysis

Measure	Complexity	Explanation
Time	O(26^(len(seq)) * n * k)	DFS explores sequences of eligible characters. Each check takes O(n) to verify k repetitions.
Space	O(n * k)	Recursion stack and string copies for candidate sequences.

The time complexity is reasonable because pruning and frequency filtering drastically reduce the search space, especially since n < 2000.

Test Cases

# Provided examples
assert Solution().longestSubsequenceRepeatedK("letsleetcode", 2) == "let"  # longest repeated 2 times
assert Solution().longestSubsequenceRepeatedK("bb", 2) == "b"  # repeated 2 times
assert Solution().longestSubsequenceRepeatedK("ab", 2) == ""  # no subsequence

# Edge and boundary cases
assert Solution().longestSubsequenceRepeatedK("aaaaa", 2) == "aa"  # repeated character
assert Solution().longestSubsequenceRepeatedK("abcabcabc", 3) == "abc"  # whole sequence
assert Solution().longestSubsequenceRepeatedK("abcd", 2) == ""  # no character repeated 2 times
assert Solution().longestSubsequenceRepeatedK("zxyxzyz", 2) == "zx"  # lexicographically largest

Test	Why
"lets