LeetCode 3084 - Count Substrings Starting and Ending with Given Character

The problem gives us a string s and a character c. We must count how many substrings of s both start and end with the character c. A substring is any contiguous portion of the string. For every possible substring, we check two conditions: 1. The first character must equal c 2.

LeetCode Problem 3084

Difficulty: 🟔 Medium
Topics: Math, String, Counting

Solution

LeetCode 3084 - Count Substrings Starting and Ending with Given Character

Problem Understanding

The problem gives us a string s and a character c. We must count how many substrings of s both start and end with the character c.

A substring is any contiguous portion of the string. For every possible substring, we check two conditions:

  1. The first character must equal c
  2. The last character must also equal c

If both conditions are satisfied, that substring contributes to the answer.

For example, if s = "abada" and c = "a", the positions containing 'a' are:

  • index 0
  • index 2
  • index 4

Any pair of these positions can define a valid substring:

  • (0,0) → "a"
  • (0,2) → "aba"
  • (0,4) → "abada"
  • (2,2) → "a"
  • (2,4) → "ada"
  • (4,4) → "a"

That gives a total of 6.

The constraints are important:

  • 1 <= s.length <= 10^5
  • only lowercase English letters are used

Since the string length can be as large as 100000, an algorithm that checks every substring individually would be too slow. There are O(n^2) substrings in total, which becomes impractical at this scale.

Several edge cases are worth noticing immediately:

  • The character c may appear only once
  • The character c may appear in every position
  • The string may have no repeated occurrences of c
  • The entire answer may fit only because we count single-character substrings as valid

The problem guarantees that c is a lowercase English letter and the string is non-empty, so we never need to handle null or empty input.

Approaches

Brute Force Approach

The most direct solution is to generate every possible substring and check whether it starts and ends with c.

We can use two nested loops:

  • The outer loop chooses the starting index
  • The inner loop chooses the ending index

For each substring s[i:j+1], we check:

s[i] == c and s[j] == c

If both conditions are true, we increment the answer.

This approach is correct because every possible substring is examined exactly once. However, there are O(n^2) substrings in a string of length n, so the solution becomes too slow for n = 100000.

Optimal Approach

The key observation is that we do not need to examine substrings explicitly.

Suppose the character c appears k times in the string.

Every valid substring is determined by choosing:

  • one occurrence of c as the starting position
  • one occurrence of c as the ending position
  • with the ending position greater than or equal to the starting position

If there are k occurrences, then the number of ways to choose such pairs is:

$\frac{k(k+1)}{2}$

This works because:

  • each occurrence alone forms a valid substring
  • every pair of different occurrences also forms a valid substring

So the problem reduces to simply counting how many times c appears in the string.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Checks every possible substring
Optimal O(n) O(1) Counts occurrences of c and applies a combinational formula

Algorithm Walkthrough

  1. Initialize a variable count to 0.

This variable stores how many times the character c appears in the string. 2. Traverse the string character by character.

For each character:

  • if it equals c, increment count
  1. After the traversal finishes, compute the total number of valid substrings.

If count = k, then:

  • every occurrence can pair with itself
  • every occurrence can pair with every later occurrence

The total number of valid pairs is:

$\frac{k(k+1)}{2}$ 4. Return the computed value.

Why it works

Every valid substring must begin and end at positions containing c. Therefore, each valid substring corresponds exactly to choosing two occurrences of c where the second occurrence is at or after the first.

If there are k occurrences, the number of such ordered choices is:

  • k single-position substrings
  • k-1 substrings starting at the first occurrence and ending later
  • k-2 substrings starting at the second occurrence and ending later
  • and so on

This sum equals:

$1+2+3+\cdots+k=\frac{k(k+1)}{2}$

Thus the formula always gives the exact number of valid substrings.

Python Solution

class Solution:
    def countSubstrings(self, s: str, c: str) -> int:
        count = 0
        
        for char in s:
            if char == c:
                count += 1
        
        return count * (count + 1) // 2

The implementation follows the optimal algorithm directly.

We first count how many times c appears in the string. The loop scans the string once, and each matching character increments count.

After counting all occurrences, we apply the mathematical formula:

count * (count + 1) // 2

Integer division // is used because the result is always an integer.

The implementation avoids generating substrings entirely, which keeps the runtime linear and the memory usage constant.

Go Solution

func countSubstrings(s string, c byte) int64 {
    var count int64 = 0

    for i := 0; i < len(s); i++ {
        if s[i] == c {
            count++
        }
    }

    return count * (count + 1) / 2
}

The Go solution is almost identical to the Python version.

One important difference is that the function returns int64. Since the maximum string length is 100000, the maximum answer is:

$\frac{100000\cdot100001}{2}$

which exceeds the safe range for smaller integer types on some systems. Using int64 guarantees correctness.

Go strings are indexed by bytes, and since the problem uses only lowercase English letters, byte indexing is perfectly safe here.

Worked Examples

Example 1

Input:
s = "abada"
c = "a"

The algorithm scans the string and counts occurrences of 'a'.

Index Character Matches c? Count
0 a Yes 1
1 b No 1
2 a Yes 2
3 d No 2
4 a Yes 3

After traversal:

count = 3

Apply the formula:

$\frac{3\cdot4}{2}=6$

Final answer:

6

The valid substrings are:

  • "a" at index 0
  • "aba"
  • "abada"
  • "a" at index 2
  • "ada"
  • "a" at index 4

Example 2

Input:
s = "zzz"
c = "z"

Traverse the string:

Index Character Matches c? Count
0 z Yes 1
1 z Yes 2
2 z Yes 3

Apply the formula:

$\frac{3\cdot4}{2}=6$

Final answer:

6

All substrings are valid because every character is 'z'.

Complexity Analysis

Measure Complexity Explanation
Time O(n) The string is traversed exactly once
Space O(1) Only a few integer variables are used

The algorithm performs a single pass through the string, so the runtime grows linearly with the input size. No additional data structures proportional to the input size are created, so the memory usage remains constant.

Test Cases

solution = Solution()

# Provided examples
assert solution.countSubstrings("abada", "a") == 6  # multiple separated matches
assert solution.countSubstrings("zzz", "z") == 6  # every substring is valid

# Single character string
assert solution.countSubstrings("a", "a") == 1  # smallest valid input

# Character appears once
assert solution.countSubstrings("abcde", "c") == 1  # only one valid substring

# Character appears at both ends
assert solution.countSubstrings("abcda", "a") == 3  # "a", "abcda", "a"

# Character does not repeat
assert solution.countSubstrings("leetcode", "l") == 1  # only the first character

# All characters identical
assert solution.countSubstrings("aaaa", "a") == 10  # n*(n+1)/2

# Character appears many times non-consecutively
assert solution.countSubstrings("axaxaxa", "a") == 10  # four occurrences

# Character absent except one position
assert solution.countSubstrings("bbbbabbbb", "a") == 1  # single occurrence

# Long repeated pattern
assert solution.countSubstrings("abababab", "a") == 10  # four occurrences

# Another repeated pattern
assert solution.countSubstrings("cccccc", "c") == 21  # six occurrences

Test Case Summary

Test Why
"abada", "a" Standard mixed-character example
"zzz", "z" Every substring is valid
"a", "a" Minimum input size
"abcde", "c" Single occurrence of target character
"abcda", "a" Matching characters at both ends
"leetcode", "l" Only one valid substring
"aaaa", "a" All characters identical
"axaxaxa", "a" Non-consecutive occurrences
"bbbbabbbb", "a" One isolated occurrence
"abababab", "a" Repeated alternating pattern
"cccccc", "c" Larger triangular-number result

Edge Cases

One important edge case occurs when the target character appears exactly once. In this situation, the only valid substring is the single-character substring containing that occurrence itself. A naive implementation might incorrectly expect at least two matching characters. The formula handles this naturally because:

$\frac{1\cdot2}{2}=1$

Another important edge case is when every character in the string equals c. In this case, every possible substring is valid. The total number of substrings in a string of length n is:

$\frac{n(n+1)}{2}$

Since the algorithm counts all occurrences and applies the same formula, it automatically produces the correct result.

A third important edge case is a very large input string near the maximum constraint size. Any quadratic solution would time out because the number of substrings becomes enormous. The optimal solution remains efficient because it performs only one pass through the string and stores only a single counter variable.