LeetCode 2405 - Optimal Partition of String
The problem asks us to divide a given string s into the minimum number of substrings such that every substring contains only unique characters. In other words, within a single substring, no character may appear more than once.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Greedy
Solution
Problem Understanding
The problem asks us to divide a given string s into the minimum number of substrings such that every substring contains only unique characters. In other words, within a single substring, no character may appear more than once.
We are given a string of lowercase English letters and must split it into contiguous pieces. Every character in the original string must belong to exactly one substring, and the order of characters cannot change. The challenge is to find the partitioning that produces the smallest possible number of substrings.
Another way to think about the problem is this: we scan the string from left to right and keep building a substring as long as all characters remain unique. The moment adding a new character would create a duplicate inside the current substring, we must cut the partition and begin a new substring.
For example, in "abacaba":
- Start with
"a" - Add
"b"→"ab"is valid - Add
"a"→ duplicate appears, so we must split before thisa - Continue similarly until the full string is processed
The output is a single integer representing the minimum number of valid substrings required.
The constraints tell us important information about the expected efficiency:
1 <= s.length <= 10^5scontains only lowercase English letters
Since the input size can be as large as 100,000, any solution with quadratic complexity such as O(n²) will likely be too slow. We should aim for a linear time solution, ideally O(n).
The fact that the string contains only lowercase English letters is also significant. Since there are only 26 possible characters, we can track uniqueness efficiently using a hash set or even a bitmask.
Several edge cases are worth identifying upfront. A string with only one character should return 1 because no partitioning is necessary. A string where every character is unique, such as "abcdef", should also return 1. At the opposite extreme, a string of repeated characters like "ssssss" forces every character into its own substring, resulting in 6. Cases where duplicates appear frequently, such as "abac", can trip up naive implementations if partitions are not created at exactly the right moment.
Approaches
Brute Force Approach
A brute force solution would try to explore all possible ways to partition the string and then determine which valid partition uses the fewest substrings.
For a string of length n, there are 2^(n-1) possible partition points because every gap between characters can either contain a split or not. For each possible partitioning, we would validate whether every substring contains unique characters. If valid, we would count the number of substrings and keep track of the minimum.
This approach guarantees correctness because it examines every possible partition arrangement. However, it is computationally infeasible for n = 10^5. Even for moderately sized inputs, the exponential number of possibilities becomes impossible to process.
A slightly improved brute force version might use dynamic programming to try every ending position, but repeatedly checking uniqueness can still push complexity toward O(n²).
Optimal Greedy Approach
The key observation is that we should make each substring as large as possible before splitting.
Suppose we are building a substring and encounter a character that already exists in the current substring. At that exact moment, we have no choice but to split. Waiting longer is impossible because the current substring would already violate the uniqueness rule.
This naturally suggests a greedy strategy:
- Keep extending the current substring while characters remain unique.
- Maintain a set of characters already used in the current substring.
- If a duplicate character appears, start a new substring immediately.
- Reset tracking for the new substring.
This works because every split is forced. Splitting earlier would only create unnecessary extra substrings, while splitting later would violate the constraints.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n * n) |
O(n) |
Tries all possible partitions and validates each |
| Optimal Greedy | O(n) |
O(1) |
Greedily extends substrings until duplication occurs |
Algorithm Walkthrough
- Initialize a counter for the number of substrings.
Since at least one substring is required for any non-empty string, initialize the answer as 1.
2. Create a data structure to track characters in the current substring.
A hash set is a natural choice because it provides average O(1) insertion and lookup time.
3. Iterate through the string character by character.
For every character, check whether it already exists in the current substring. 4. If the character is not present in the set, add it.
This means the current substring still satisfies the uniqueness condition, so we continue expanding it. 5. If the character already exists, create a new substring.
Since duplicates are not allowed, we must split immediately before this character. Increase the substring count, clear the tracking structure, and start the new substring with the current character. 6. Continue until the entire string has been processed.
The substring counter will contain the minimum number of valid partitions.
Why it works
The algorithm relies on a greedy invariant: the current substring is always the longest valid substring we can build from the current position.
Whenever a duplicate character appears, splitting becomes mandatory because keeping the character would violate the uniqueness rule. Splitting earlier would never reduce the number of substrings, it would only increase them unnecessarily. Therefore, greedily maximizing each substring guarantees the minimum total number of partitions.
Python Solution
class Solution:
def partitionString(self, s: str) -> int:
partitions = 1
current_chars: set[str] = set()
for char in s:
if char in current_chars:
partitions += 1
current_chars.clear()
current_chars.add(char)
return partitions
The implementation directly follows the greedy algorithm.
We begin by initializing partitions = 1 because a non-empty string always requires at least one substring. We also create current_chars, a set that stores all characters currently present in the active substring.
As we iterate through the string, we check whether the current character already exists in current_chars.
If the character is already present, we have encountered a duplicate. Since duplicates are forbidden inside a substring, we increment the partition count and clear the set to begin tracking a fresh substring.
After handling the duplicate case, we add the current character into the set. This works correctly because even after a split, the current character belongs to the newly created substring.
At the end of the iteration, the partitions variable contains the minimum number of substrings required.
Go Solution
func partitionString(s string) int {
partitions := 1
currentChars := make(map[rune]bool)
for _, ch := range s {
if currentChars[ch] {
partitions++
currentChars = make(map[rune]bool)
}
currentChars[ch] = true
}
return partitions
}
The Go implementation follows the same greedy logic as the Python version.
Instead of a Python set, Go uses a map[rune]bool to track characters in the current substring. When a duplicate character is encountered, we increment the partition count and recreate the map to represent a fresh substring.
There are no concerns about integer overflow because the maximum answer is at most n = 10^5, which easily fits inside Go's int type. Since the input only contains lowercase English letters, iterating over runes works safely and correctly.
Worked Examples
Example 1
Input:
s = "abacaba"
| Index | Character | Current Set Before | Action | Partitions | Current Set After |
|---|---|---|---|---|---|
| 0 | a | {} |
Add a |
1 | {a} |
| 1 | b | {a} |
Add b |
1 | {a,b} |
| 2 | a | {a,b} |
Duplicate, split | 2 | {a} |
| 3 | c | {a} |
Add c |
2 | {a,c} |
| 4 | a | {a,c} |
Duplicate, split | 3 | {a} |
| 5 | b | {a} |
Add b |
3 | {a,b} |
| 6 | a | {a,b} |
Duplicate, split | 4 | {a} |
Final answer:
4
One optimal partition is:
("ab", "ac", "ab", "a")
Example 2
Input:
s = "ssssss"
| Index | Character | Current Set Before | Action | Partitions | Current Set After |
|---|---|---|---|---|---|
| 0 | s | {} |
Add s |
1 | {s} |
| 1 | s | {s} |
Duplicate, split | 2 | {s} |
| 2 | s | {s} |
Duplicate, split | 3 | {s} |
| 3 | s | {s} |
Duplicate, split | 4 | {s} |
| 4 | s | {s} |
Duplicate, split | 5 | {s} |
| 5 | s | {s} |
Duplicate, split | 6 | {s} |
Final answer:
6
Each character must form its own substring because every subsequent character repeats the previous one.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) |
Each character is processed exactly once |
| Space | O(1) |
At most 26 lowercase letters are stored |
The time complexity is linear because we scan the string exactly once and every set operation takes constant average time.
The space complexity is constant because the set can never contain more than 26 lowercase English letters. Even though a set is used, its maximum size is bounded by a constant.
Test Cases
solution = Solution()
assert solution.partitionString("abacaba") == 4 # provided example
assert solution.partitionString("ssssss") == 6 # repeated character only
assert solution.partitionString("a") == 1 # single character
assert solution.partitionString("abcdef") == 1 # all unique
assert solution.partitionString("aa") == 2 # smallest duplicate case
assert solution.partitionString("abac") == 2 # duplicate forces one split
assert solution.partitionString("abcabc") == 2 # repeated pattern
assert solution.partitionString("world") == 1 # already valid
assert solution.partitionString("abccba") == 2 # duplicate in middle
assert solution.partitionString("abcdefghijklmnopqrstuvwxyz") == 1 # all letters unique
assert solution.partitionString("abababab") == 4 # alternating duplicates
assert solution.partitionString("zzzzzzzzzz") == 10 # all identical
| Test | Why |
|---|---|
"abacaba" |
Validates provided example |
"ssssss" |
Tests repeated character case |
"a" |
Smallest valid input |
"abcdef" |
Entire string can remain one partition |
"aa" |
Immediate duplicate handling |
"abac" |
Duplicate appears after several characters |
"abcabc" |
Repeating pattern |
"world" |
No splits required |
"abccba" |
Duplicate in the middle |
"abcdefghijklmnopqrstuvwxyz" |
Maximum unique lowercase sequence |
"abababab" |
Frequent partitioning |
"zzzzzzzzzz" |
Worst case for partition count |
Edge Cases
Single Character String
A string like "a" is the smallest possible valid input. A buggy implementation might accidentally initialize the partition count to 0 and forget that at least one substring always exists. This implementation starts with partitions = 1, ensuring the correct answer is returned.
All Characters Unique
A string like "abcdef" never requires a split. Some implementations incorrectly create unnecessary partitions when checking duplicates. Since no character repeats, the set continues growing and the partition count remains 1.
All Characters Repeated
A string like "ssssss" represents the opposite extreme. Every character after the first immediately forces a split. A common mistake is forgetting to reinsert the duplicate character after clearing the set. This implementation correctly clears the set and then adds the current character into the newly formed substring.
Duplicate Appears After Long Prefix
Consider "abcdefghijklmnopqrstuvwxyza". The final a duplicates a character seen much earlier. A naive implementation that only checks recent characters would fail here. Since the algorithm maintains a complete set of characters in the current substring, it correctly detects the duplicate regardless of position.