LeetCode 3458 - Select K Disjoint Special Substrings

The problem asks us to determine whether we can select k disjoint special substrings from a given string s. A special substring is defined as a substring where every character in it does not appear anywhere else in the string.

LeetCode Problem 3458

Difficulty: 🟡 Medium
Topics: Hash Table, String, Dynamic Programming, Greedy, Sorting

Solution

Problem Understanding

The problem asks us to determine whether we can select k disjoint special substrings from a given string s. A special substring is defined as a substring where every character in it does not appear anywhere else in the string. Additionally, the substring cannot be the entire string itself.

The input consists of a string s of length n and an integer k. The output is a boolean value: true if we can select exactly k disjoint special substrings and false otherwise.

Constraints tell us that the string can be up to 50,000 characters long and only contains lowercase letters. Since k can be at most 26 (number of lowercase English letters), we know that we cannot have more than 26 disjoint special substrings.

Key edge cases include: k = 0, strings where all characters are unique (allowing multiple special substrings), and strings where characters are heavily repeated, making it impossible to form even a single special substring.

Approaches

A brute-force approach would attempt to generate all possible substrings, check for each if it is "special" by verifying that none of its characters appear outside the substring, and then select k disjoint ones. This is clearly too slow because generating all substrings is O(n²), and checking each substring for uniqueness in the rest of the string is O(n), leading to an overall O(n³) complexity.

The key observation for an optimal approach is that the positions of the first and last occurrences of each character fully define potential special substrings. For each character, if we know its first and last index, any substring that contains that character but does not extend beyond these indices is guaranteed to be "special" with respect to that character. By iterating through characters and merging overlapping ranges, we can count the number of disjoint special substrings efficiently.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(n²) Generate all substrings and check uniqueness, too slow for n up to 50,000
Optimal O(n) O(26) ≈ O(1) Use first/last occurrence to define ranges and count disjoint segments

Algorithm Walkthrough

  1. Create two dictionaries, first_index and last_index, to record the first and last position of each character in the string. This lets us know the complete span of each character.
  2. Iterate through the string. For each character at index i, determine the rightmost index for the current potential special substring as the maximum last_index of all characters in the substring so far.
  3. Whenever i reaches this rightmost index, it means we have found a valid special substring. Increment a counter for special substrings and start a new substring range from i + 1.
  4. After traversing the string, check if the count of found special substrings is at least k. Return true if so, otherwise false.

This works because a special substring is fully determined by the characters it contains and their global spans. By greedily taking the leftmost valid segment, we ensure maximal count of disjoint special substrings.

Python Solution

class Solution:
    def maxSubstringLength(self, s: str, k: int) -> bool:
        if k == 0:
            return True
        
        first_index = {}
        last_index = {}
        n = len(s)
        
        for i, c in enumerate(s):
            if c not in first_index:
                first_index[c] = i
            last_index[c] = i
        
        count = 0
        i = 0
        
        while i < n:
            end = last_index[s[i]]
            j = i
            while j <= end:
                end = max(end, last_index[s[j]])
                j += 1
            count += 1
            i = j
        
        return count >= k

The code first handles the k = 0 case. Then it records first and last occurrences of each character. Using a greedy pointer, it extends the current substring to the farthest last_index of characters seen, ensuring that all characters in the substring are contained entirely within. Each time the substring completes, the count is incremented, and we start a new substring.

Go Solution

func maxSubstringLength(s string, k int) bool {
    if k == 0 {
        return true
    }
    
    firstIndex := make(map[rune]int)
    lastIndex := make(map[rune]int)
    n := len(s)
    
    for i, c := range s {
        if _, ok := firstIndex[c]; !ok {
            firstIndex[c] = i
        }
        lastIndex[c] = i
    }
    
    count := 0
    i := 0
    
    for i < n {
        end := lastIndex[rune(s[i])]
        j := i
        for j <= end {
            if lastIndex[rune(s[j])] > end {
                end = lastIndex[rune(s[j])]
            }
            j++
        }
        count++
        i = j
    }
    
    return count >= k
}

The Go solution follows the same logic as Python. Differences include explicit type conversions from byte to rune and using maps instead of dictionaries. The loop structure and greedy extension remain identical.

Worked Examples

Example 1: s = "abcdbaefab", k = 2

Step i end count Comment
Initial 0 3 0 first substring starts at 0
Extend 0..3 3 1 substring "cd" identified
Next 4 7 2 substring "ef" identified
Result 2 >= 2 return true

Example 2: s = "cdefdc", k = 3

Step i end count Comment
Initial 0 5 0 full string spans c..c
Extend 0..5 5 1 only 1 substring possible
Result 1 < 3 return false

Example 3: s = "abeabe", k = 0

Directly returns true as k=0.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character is visited at most twice while extending substring ranges
Space O(26) ≈ O(1) Maps store first and last indices for lowercase letters only

The linear time is guaranteed because each substring extension uses a greedy pointer without revisiting processed indices. Space is effectively constant due to the limited alphabet.

Test Cases

# Provided examples
assert Solution().maxSubstringLength("abcdbaefab", 2) == True  # standard case with multiple substrings
assert Solution().maxSubstringLength("cdefdc", 3) == False      # not enough disjoint substrings
assert Solution().maxSubstringLength("abeabe", 0) == True       # k = 0 edge case

# Boundary and edge cases
assert Solution().maxSubstringLength("a", 0) == True             # minimum length string
assert Solution().maxSubstringLength("a", 1) == False            # cannot pick entire string
assert Solution().maxSubstringLength("abcde", 5) == True         # each char unique, max substrings
assert Solution().maxSubstringLength("aaaaa", 1) == False        # all same char, cannot form special substring
assert Solution().maxSubstringLength("abac", 2) == True          # pick "b" and "c" as disjoint substrings
Test Why
"abcdbaefab", 2 verifies basic multi-substring selection
"cdefdc", 3 verifies insufficient special substrings
"abeabe", 0 verifies k=0 returns True
"a", 0 smallest string edge case
"a", 1 cannot pick entire string
"abcde", 5 max number of disjoint single-character substrings
"aaaaa", 1 repeated character cannot form special substring
"abac", 2 overlapping first/last occurrence handled correctly

Edge Cases

  1. k = 0: This is a special case because we do not need any substrings. A naive implementation might attempt to process the string unnecessarily, but here we handle it at the start and immediately return true.
  2. String with all identical characters: If s = "aaaaa" and k > 0, no substring can be special since any substring's character appears elsewhere. The implementation correctly counts zero and returns false.
  3. Substrings overlapping character ranges: For a string like "abac", the first and last occurrence of 'a' spans multiple positions. A naive approach might incorrectly split substrings, but our greedy expansion ensures each substring ends at the maximum last_index of its contained characters, maintaining disjointness and correctness. We arek = 3`

Intervals:

[2,2], [3,3]

Greedy selection gives maximum count = 2 < k = 3 → false

Example 3

Input: s = "abeabe", `k