LeetCode 1180 - Count Substrings with Only One Distinct Letter
The problem asks us to count how many substrings of a given string contain only one distinct character. A substring is a contiguous section of the string. For example, in the string "aaaba", the substring "aaa" is valid because every character is 'a'.
Difficulty: 🟢 Easy
Topics: Math, String
Solution
Problem Understanding
The problem asks us to count how many substrings of a given string contain only one distinct character.
A substring is a contiguous section of the string. For example, in the string "aaaba", the substring "aaa" is valid because every character is 'a'. The substring "aab" is not valid because it contains both 'a' and 'b'.
The input is a single string s consisting only of lowercase English letters. The length of the string is between 1 and 1000.
The expected output is an integer representing the total number of substrings where all characters are identical.
The key observation is that consecutive groups of the same character generate many valid substrings. For example:
-
"a"contributes1 -
"aa"contributes3because the valid substrings are: -
"a"at index 0 -
"a"at index 1 -
"aa" -
"aaa"contributes6
In general, a consecutive block of length n contributes:
$1+2+3+\cdots+n=\frac{n(n+1)}{2}$
This problem is small enough that even quadratic solutions could work because n <= 1000, but the structure of the string allows a simple linear-time solution.
Some important edge cases include:
- A string with only one character, such as
"a" - A string where all characters are the same, such as
"aaaaa" - A string where every character is different, such as
"abcde" - Alternating characters, such as
"ababab"
These cases can expose incorrect counting logic or off-by-one errors.
Approaches
Brute Force Approach
The brute force solution considers every possible substring and checks whether it contains only one distinct character.
There are O(n^2) possible substrings in a string of length n. For each substring, we can scan its characters or place them into a set to determine whether all characters are identical.
For example, for "aaaba":
"a"is valid"aa"is valid"aaa"is valid"aaab"is invalid because it contains'a'and'b'
This approach is correct because it explicitly verifies every substring. However, it becomes inefficient because checking every substring requires repeated work.
If substring validation takes O(n) time, the overall complexity becomes O(n^3).
Optimal Approach
The key insight is that valid substrings come entirely from contiguous runs of identical characters.
For example:
-
"aaa"contains: -
3 substrings of length 1
-
2 substrings of length 2
-
1 substring of length 3
Total:
$\frac{3(3+1)}{2}=6$
Instead of examining every substring individually, we process the string once and identify consecutive character groups.
For each group of length k, we add:
$\frac{k(k+1)}{2}$
This avoids repeated work and produces a linear-time solution.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n³) | O(1) or O(n) | Enumerates all substrings and checks each one |
| Optimal | O(n) | O(1) | Counts contributions from consecutive character runs |
Algorithm Walkthrough
- Initialize a variable
total_substringsto store the final answer. - Initialize a variable
current_run_lengthto1because the first character always starts a run of length one. - Traverse the string starting from index
1. - For each character, compare it with the previous character.
- If they are the same, extend the current run by increasing
current_run_length. - If they are different, the previous run has ended.
- When a run ends, compute how many valid substrings that run contributes using:
$\frac{k(k+1)}{2}$
where k is the run length.
- Add this value to the answer.
- Reset
current_run_lengthto1for the new character run. - After the loop finishes, process the final run because it has not yet been added to the answer.
- Return the accumulated total.
Why it works
Every valid substring must consist entirely of identical characters. Therefore, every valid substring belongs completely inside one contiguous run of equal characters.
A run of length k contains:
ksubstrings of length 1k - 1substrings of length 2k - 2substrings of length 3- ...
1substring of lengthk
The total number is:
$k+(k-1)+(k-2)+\cdots+1=\frac{k(k+1)}{2}$
Since the algorithm processes every run exactly once and counts all substrings inside that run, it counts every valid substring exactly once.
Python Solution
class Solution:
def countLetters(self, s: str) -> int:
total_substrings = 0
current_run_length = 1
for index in range(1, len(s)):
if s[index] == s[index - 1]:
current_run_length += 1
else:
total_substrings += (
current_run_length * (current_run_length + 1)
) // 2
current_run_length = 1
total_substrings += (
current_run_length * (current_run_length + 1)
) // 2
return total_substrings
The implementation begins by initializing the answer accumulator and the length of the current character run.
The loop starts from index 1 because each character is compared with the previous character. If the current character matches the previous one, the run continues and the run length increases.
When the characters differ, the current run ends. The algorithm computes the number of valid substrings contributed by that run using the arithmetic series formula and adds it to the total.
After processing the ended run, the algorithm resets the run length to 1 because the current character starts a new run.
Once the traversal finishes, the final run still needs to be processed. This is why the formula is applied one final time after the loop.
The method then returns the total count.
Go Solution
func countLetters(s string) int {
totalSubstrings := 0
currentRunLength := 1
for i := 1; i < len(s); i++ {
if s[i] == s[i-1] {
currentRunLength++
} else {
totalSubstrings += currentRunLength * (currentRunLength + 1) / 2
currentRunLength = 1
}
}
totalSubstrings += currentRunLength * (currentRunLength + 1) / 2
return totalSubstrings
}
The Go implementation follows the same logic as the Python version.
Since the input length is at most 1000, integer overflow is not a concern because the maximum answer is:
$\frac{1000\times1001}{2}=500500$
which easily fits inside Go's int type.
Go strings are byte indexed, and the input contains only lowercase English letters, so direct indexing with s[i] is completely safe.
Worked Examples
Example 1
Input:
s = "aaaba"
The runs are:
"aaa"with length 3"b"with length 1"a"with length 1
| Step | Character | Current Run Length | Action | Total |
|---|---|---|---|---|
| Start | a | 1 | Initialize | 0 |
| i = 1 | a | 2 | Same as previous | 0 |
| i = 2 | a | 3 | Same as previous | 0 |
| i = 3 | b | 3 | Run ended, add 6 | 6 |
| Reset | b | 1 | Start new run | 6 |
| i = 4 | a | 1 | Run ended, add 1 | 7 |
| Reset | a | 1 | Start new run | 7 |
| End | a | 1 | Add final run | 8 |
Final answer:
8
Example 2
Input:
s = "aaaaaaaaaa"
There is only one run:
"aaaaaaaaaa"with length 10
Contribution:
$\frac{10(10+1)}{2}=55$
| Step | Character | Current Run Length | Total |
|---|---|---|---|
| Start | a | 1 | 0 |
| Continue | a | 2 | 0 |
| Continue | a | 3 | 0 |
| Continue | a | 4 | 0 |
| Continue | a | 5 | 0 |
| Continue | a | 6 | 0 |
| Continue | a | 7 | 0 |
| Continue | a | 8 | 0 |
| Continue | a | 9 | 0 |
| Continue | a | 10 | 0 |
| End | Finalize | 10 | 55 |
Final answer:
55
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | The string is traversed exactly once |
| Space | O(1) | Only a few integer variables are used |
The algorithm performs a single pass through the string, and every operation inside the loop takes constant time. No auxiliary data structures proportional to the input size are used.
Test Cases
solution = Solution()
assert solution.countLetters("aaaba") == 8 # Example case
assert solution.countLetters("aaaaaaaaaa") == 55 # All identical characters
assert solution.countLetters("a") == 1 # Single character
assert solution.countLetters("ab") == 2 # Two different characters
assert solution.countLetters("aa") == 3 # Simple repeated run
assert solution.countLetters("abc") == 3 # All unique characters
assert solution.countLetters("aba") == 3 # Alternating pattern
assert solution.countLetters("zzzzz") == 15 # Larger repeated run
assert solution.countLetters("aabbbcc") == 12 # Multiple runs
assert solution.countLetters("abababab") == 8 # No repeated adjacent chars
| Test | Why |
|---|---|
"aaaba" |
Validates the provided example |
"aaaaaaaaaa" |
Tests one large continuous run |
"a" |
Smallest possible input |
"ab" |
Ensures different adjacent characters reset correctly |
"aa" |
Verifies basic repeated run counting |
"abc" |
Every character forms only length-1 substrings |
"aba" |
Confirms separated identical characters are not merged |
"zzzzz" |
Tests arithmetic formula on larger runs |
"aabbbcc" |
Verifies multiple independent runs |
"abababab" |
Ensures no accidental cross-run counting |
Edge Cases
A single-character string is the smallest valid input. It is easy to accidentally mishandle initialization and return 0 instead of 1. This implementation starts the run length at 1, ensuring the single character contributes correctly.
A string where every character is identical can expose incorrect run-finalization logic. For example, "aaaa" contains many overlapping substrings. Some implementations forget to process the final run after the loop. This solution explicitly adds the last run contribution after traversal completes.
Alternating characters such as "ababab" can reveal bugs where non-contiguous equal characters are incorrectly grouped together. The algorithm only extends a run when adjacent characters are equal, so separated occurrences are treated independently.
Strings with multiple runs of varying sizes, such as "aabbbcc", test whether the algorithm properly resets the run length after each boundary. The implementation resets current_run_length to 1 immediately after processing a completed run, ensuring each group is counted independently.