LeetCode 1358 - Number of Substrings Containing All Three Characters
The problem asks us to count all substrings of a given string s that contain at least one occurrence of each character '
Difficulty: 🟡 Medium
Topics: Hash Table, String, Sliding Window
Solution
Problem Understanding
The problem asks us to count all substrings of a given string s that contain at least one occurrence of each character 'a', 'b', and 'c'. The string s is guaranteed to contain only these three characters. Essentially, we need to find the number of contiguous sequences in s where none of 'a', 'b', or 'c' is missing.
The input is a string of length up to 50,000, which rules out any algorithm that checks all substrings individually, as the number of substrings grows quadratically. The expected output is a single integer representing the total count of valid substrings. Edge cases include strings that are just the minimum length of 3, strings that are missing one character entirely, or strings that have repeated blocks of characters.
Approaches
The brute-force approach would involve generating all possible substrings of s, then checking each substring for the presence of 'a', 'b', and 'c'. This guarantees correctness because every substring is checked, but it is highly inefficient. For a string of length n, there are n * (n + 1) / 2 substrings, and checking each substring takes up to O(n) time. This results in an overall complexity of O(n^3), which is impractical for n up to 50,000.
The optimal approach leverages a sliding window with two pointers (left and right) and a hash map to track counts of 'a', 'b', and 'c'. The key observation is that once a window contains at least one of each character, all substrings starting at left and ending at any position from right to the end of the string are valid. By moving left to the right while maintaining the counts, we can efficiently count all valid substrings without enumerating them explicitly. This reduces the time complexity to O(n).
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^3) | O(1) | Generate all substrings and check each for all three characters |
| Optimal | O(n) | O(1) | Sliding window with hash map for character counts, counting valid substrings efficiently |
Algorithm Walkthrough
-
Initialize a hash map
countwith keys'a','b', and'c'set to 0. This will track the number of each character in the current window. -
Set two pointers
leftandrightto 0.rightwill expand the window,leftwill shrink it. -
Initialize a variable
resultto 0 to store the total number of valid substrings. -
Iterate
rightover the string: -
Increment
count[s[right]]since the character atrightis now part of the window. -
While all three characters have counts greater than 0:
-
Increment
resultbylen(s) - right, because all substrings starting atleftand ending fromrightto the end are valid. -
Decrement
count[s[left]]and moveleftforward to try smaller windows. -
After iterating through the string,
resultcontains the total number of substrings containing all three characters.
Why it works: The invariant is that the current window always contains at least one of each character when we are counting. By counting len(s) - right substrings from each valid left position, we efficiently account for all valid substrings without redundancy.
Python Solution
class Solution:
def numberOfSubstrings(self, s: str) -> int:
count = {'a': 0, 'b': 0, 'c': 0}
left = 0
result = 0
for right in range(len(s)):
count[s[right]] += 1
while all(count[c] > 0 for c in 'abc'):
result += len(s) - right
count[s[left]] -= 1
left += 1
return result
This implementation maintains a sliding window that expands with right and contracts with left. The while loop ensures that as soon as the window contains all three characters, we count all possible substrings starting at left and then shrink the window to check for additional possibilities. The hash map efficiently tracks the counts of each character.
Go Solution
func numberOfSubstrings(s string) int {
count := map[byte]int{'a': 0, 'b': 0, 'c': 0}
left := 0
result := 0
for right := 0; right < len(s); right++ {
count[s[right]]++
for count['a'] > 0 && count['b'] > 0 && count['c'] > 0 {
result += len(s) - right
count[s[left]]--
left++
}
}
return result
}
In Go, we use a map[byte]int to track counts of 'a', 'b', and 'c'. The approach is identical to Python, but we use byte indexing for the string and a standard for loop. The for loop inside ensures we shrink the window while maintaining all characters.
Worked Examples
Example 1: s = "abcabc"
| Step | left | right | count | result |
|---|---|---|---|---|
| 1 | 0 | 0 | {'a':1,'b':0,'c':0} | 0 |
| 2 | 0 | 1 | {'a':1,'b':1,'c':0} | 0 |
| 3 | 0 | 2 | {'a':1,'b':1,'c':1} | 4 (len(s)-2) |
| 4 | 1 | 2 | {'a':0,'b':1,'c':1} | 4 |
| 5 | 1 | 3 | {'a':1,'b':1,'c':1} | 7 (4 + len(s)-3=4+3) |
| 6 | 2 | 3 | {'a':0,'b':1,'c':1} | 7 |
| 7 | 2 | 4 | {'a':0,'b':2,'c':1} | 7 |
| 8 | 2 | 5 | {'a':1,'b':2,'c':2} | 10 (7 + len(s)-5=7+3) |
Final result = 10
Example 2: s = "aaacb"
| Step | left | right | count | result |
|---|---|---|---|---|
| 1 | 0 | 0 | {'a':1,'b':0,'c':0} | 0 |
| 2 | 0 | 1 | {'a':2,'b':0,'c':0} | 0 |
| 3 | 0 | 2 | {'a':3,'b':0,'c':0} | 0 |
| 4 | 0 | 3 | {'a':3,'b':0,'c':1} | 0 |
| 5 | 0 | 4 | {'a':3,'b':1,'c':1} | 3 (len(s)-4=1 but after moving left twice) |
Final result = 3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character is visited at most twice, once by right and once by left. |
| Space | O(1) | Only three characters are tracked in a hash map, independent of input size. |
This approach is optimal because it processes the string in linear time and uses constant extra space.
Test Cases
# Provided examples
assert Solution().numberOfSubstrings("abcabc") == 10 # multiple overlapping substrings
assert Solution().numberOfSubstrings("aaacb") == 3 # substrings with repeated a's
assert Solution().numberOfSubstrings("abc") == 1 # minimal valid substring
# Boundary cases
assert Solution().numberOfSubstrings("aaa") == 0 # no 'b' or 'c'
assert Solution().numberOfSubstrings("abc"*1000) == 4995000 # stress test
# Edge scenarios
assert Solution().numberOfSubstrings("abccba") == 10 # characters appear in reverse
assert Solution().numberOfSubstrings("aabcbc") == 6 # overlapping valid windows
| Test | Why |
|---|---|
| "abcabc" | Checks multiple overlapping valid substrings |
| "aaacb" | Checks repeated characters and minimal substrings |
| "abc" | Minimal length string containing all three |
| "aaa" | Missing characters edge case |
| "abc"*1000 | Performance on large input |
| "abccba" | Valid substrings in reversed order |
| "aabcbc" | Overlapping windows with repeated characters |
Edge Cases
1. Missing characters: If s lacks one of `'a