LeetCode 3084 - Count Substrings Starting and Ending with Given Character
The problem gives us a string s and a character c. We must count how many substrings of s both start and end with the character c. A substring is any contiguous portion of the string. For every possible substring, we check two conditions: 1. The first character must equal c 2.
Difficulty: š” Medium
Topics: Math, String, Counting
Solution
LeetCode 3084 - Count Substrings Starting and Ending with Given Character
Problem Understanding
The problem gives us a string s and a character c. We must count how many substrings of s both start and end with the character c.
A substring is any contiguous portion of the string. For every possible substring, we check two conditions:
- The first character must equal
c - The last character must also equal
c
If both conditions are satisfied, that substring contributes to the answer.
For example, if s = "abada" and c = "a", the positions containing 'a' are:
- index
0 - index
2 - index
4
Any pair of these positions can define a valid substring:
(0,0)ā"a"(0,2)ā"aba"(0,4)ā"abada"(2,2)ā"a"(2,4)ā"ada"(4,4)ā"a"
That gives a total of 6.
The constraints are important:
1 <= s.length <= 10^5- only lowercase English letters are used
Since the string length can be as large as 100000, an algorithm that checks every substring individually would be too slow. There are O(n^2) substrings in total, which becomes impractical at this scale.
Several edge cases are worth noticing immediately:
- The character
cmay appear only once - The character
cmay appear in every position - The string may have no repeated occurrences of
c - The entire answer may fit only because we count single-character substrings as valid
The problem guarantees that c is a lowercase English letter and the string is non-empty, so we never need to handle null or empty input.
Approaches
Brute Force Approach
The most direct solution is to generate every possible substring and check whether it starts and ends with c.
We can use two nested loops:
- The outer loop chooses the starting index
- The inner loop chooses the ending index
For each substring s[i:j+1], we check:
s[i] == c and s[j] == c
If both conditions are true, we increment the answer.
This approach is correct because every possible substring is examined exactly once. However, there are O(n^2) substrings in a string of length n, so the solution becomes too slow for n = 100000.
Optimal Approach
The key observation is that we do not need to examine substrings explicitly.
Suppose the character c appears k times in the string.
Every valid substring is determined by choosing:
- one occurrence of
cas the starting position - one occurrence of
cas the ending position - with the ending position greater than or equal to the starting position
If there are k occurrences, then the number of ways to choose such pairs is:
$\frac{k(k+1)}{2}$
This works because:
- each occurrence alone forms a valid substring
- every pair of different occurrences also forms a valid substring
So the problem reduces to simply counting how many times c appears in the string.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Checks every possible substring |
| Optimal | O(n) | O(1) | Counts occurrences of c and applies a combinational formula |
Algorithm Walkthrough
- Initialize a variable
countto0.
This variable stores how many times the character c appears in the string.
2. Traverse the string character by character.
For each character:
- if it equals
c, incrementcount
- After the traversal finishes, compute the total number of valid substrings.
If count = k, then:
- every occurrence can pair with itself
- every occurrence can pair with every later occurrence
The total number of valid pairs is:
$\frac{k(k+1)}{2}$ 4. Return the computed value.
Why it works
Every valid substring must begin and end at positions containing c. Therefore, each valid substring corresponds exactly to choosing two occurrences of c where the second occurrence is at or after the first.
If there are k occurrences, the number of such ordered choices is:
ksingle-position substringsk-1substrings starting at the first occurrence and ending laterk-2substrings starting at the second occurrence and ending later- and so on
This sum equals:
$1+2+3+\cdots+k=\frac{k(k+1)}{2}$
Thus the formula always gives the exact number of valid substrings.
Python Solution
class Solution:
def countSubstrings(self, s: str, c: str) -> int:
count = 0
for char in s:
if char == c:
count += 1
return count * (count + 1) // 2
The implementation follows the optimal algorithm directly.
We first count how many times c appears in the string. The loop scans the string once, and each matching character increments count.
After counting all occurrences, we apply the mathematical formula:
count * (count + 1) // 2
Integer division // is used because the result is always an integer.
The implementation avoids generating substrings entirely, which keeps the runtime linear and the memory usage constant.
Go Solution
func countSubstrings(s string, c byte) int64 {
var count int64 = 0
for i := 0; i < len(s); i++ {
if s[i] == c {
count++
}
}
return count * (count + 1) / 2
}
The Go solution is almost identical to the Python version.
One important difference is that the function returns int64. Since the maximum string length is 100000, the maximum answer is:
$\frac{100000\cdot100001}{2}$
which exceeds the safe range for smaller integer types on some systems. Using int64 guarantees correctness.
Go strings are indexed by bytes, and since the problem uses only lowercase English letters, byte indexing is perfectly safe here.
Worked Examples
Example 1
Input:
s = "abada"
c = "a"
The algorithm scans the string and counts occurrences of 'a'.
| Index | Character | Matches c? |
Count |
|---|---|---|---|
| 0 | a | Yes | 1 |
| 1 | b | No | 1 |
| 2 | a | Yes | 2 |
| 3 | d | No | 2 |
| 4 | a | Yes | 3 |
After traversal:
count = 3
Apply the formula:
$\frac{3\cdot4}{2}=6$
Final answer:
6
The valid substrings are:
"a"at index 0"aba""abada""a"at index 2"ada""a"at index 4
Example 2
Input:
s = "zzz"
c = "z"
Traverse the string:
| Index | Character | Matches c? |
Count |
|---|---|---|---|
| 0 | z | Yes | 1 |
| 1 | z | Yes | 2 |
| 2 | z | Yes | 3 |
Apply the formula:
$\frac{3\cdot4}{2}=6$
Final answer:
6
All substrings are valid because every character is 'z'.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | The string is traversed exactly once |
| Space | O(1) | Only a few integer variables are used |
The algorithm performs a single pass through the string, so the runtime grows linearly with the input size. No additional data structures proportional to the input size are created, so the memory usage remains constant.
Test Cases
solution = Solution()
# Provided examples
assert solution.countSubstrings("abada", "a") == 6 # multiple separated matches
assert solution.countSubstrings("zzz", "z") == 6 # every substring is valid
# Single character string
assert solution.countSubstrings("a", "a") == 1 # smallest valid input
# Character appears once
assert solution.countSubstrings("abcde", "c") == 1 # only one valid substring
# Character appears at both ends
assert solution.countSubstrings("abcda", "a") == 3 # "a", "abcda", "a"
# Character does not repeat
assert solution.countSubstrings("leetcode", "l") == 1 # only the first character
# All characters identical
assert solution.countSubstrings("aaaa", "a") == 10 # n*(n+1)/2
# Character appears many times non-consecutively
assert solution.countSubstrings("axaxaxa", "a") == 10 # four occurrences
# Character absent except one position
assert solution.countSubstrings("bbbbabbbb", "a") == 1 # single occurrence
# Long repeated pattern
assert solution.countSubstrings("abababab", "a") == 10 # four occurrences
# Another repeated pattern
assert solution.countSubstrings("cccccc", "c") == 21 # six occurrences
Test Case Summary
| Test | Why |
|---|---|
"abada", "a" |
Standard mixed-character example |
"zzz", "z" |
Every substring is valid |
"a", "a" |
Minimum input size |
"abcde", "c" |
Single occurrence of target character |
"abcda", "a" |
Matching characters at both ends |
"leetcode", "l" |
Only one valid substring |
"aaaa", "a" |
All characters identical |
"axaxaxa", "a" |
Non-consecutive occurrences |
"bbbbabbbb", "a" |
One isolated occurrence |
"abababab", "a" |
Repeated alternating pattern |
"cccccc", "c" |
Larger triangular-number result |
Edge Cases
One important edge case occurs when the target character appears exactly once. In this situation, the only valid substring is the single-character substring containing that occurrence itself. A naive implementation might incorrectly expect at least two matching characters. The formula handles this naturally because:
$\frac{1\cdot2}{2}=1$
Another important edge case is when every character in the string equals c. In this case, every possible substring is valid. The total number of substrings in a string of length n is:
$\frac{n(n+1)}{2}$
Since the algorithm counts all occurrences and applies the same formula, it automatically produces the correct result.
A third important edge case is a very large input string near the maximum constraint size. Any quadratic solution would time out because the number of substrings becomes enormous. The optimal solution remains efficient because it performs only one pass through the string and stores only a single counter variable.