LeetCode 2955 - Number of Same-End Substrings
The problem asks us to compute the number of same-end substrings within specified subranges of a given string s. A substring is same-end if its first and last character are identical. We are given multiple queries in the form [li, ri], each representing a substring s[li..ri].
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting, Prefix Sum
Solution
Problem Understanding
The problem asks us to compute the number of same-end substrings within specified subranges of a given string s. A substring is same-end if its first and last character are identical. We are given multiple queries in the form [li, ri], each representing a substring s[li..ri]. For each query, we must determine how many non-empty substrings of that range are same-end.
The input string can be up to 30,000 characters long, and there can be up to 30,000 queries. This means a naive solution that examines every possible substring would be too slow, as the number of substrings in a length-n string is roughly n*(n+1)/2. We need a method that scales efficiently with both the string length and the number of queries.
Edge cases include substrings of length 1, where every single character is trivially same-end, and substrings where all characters are identical. The problem guarantees valid indices for all queries.
Approaches
Brute Force
The brute-force approach would iterate over each query [li, ri], generate every possible substring within s[li..ri], and count how many have the same first and last character. While this approach is correct, it is computationally expensive. For a substring of length n, there are n*(n+1)/2 substrings. With multiple queries on a large string, the time complexity can reach O(q * n^2), which is infeasible for n and q up to 30,000.
Optimal Approach
The key insight is that for a substring to be same-end, we only need to know the frequency of each character in the range and how many times that character can pair up with itself to form substrings. Specifically, for a character c, if it occurs count_c times in the range, the number of same-end substrings starting and ending with c is the sum of the first count_c positive integers: count_c * (count_c + 1) / 2.
Using prefix sums for character counts allows us to quickly calculate count_c for any query range in O(1) per character. By maintaining a 2D prefix sum array prefix[i][ch] representing the number of occurrences of character ch up to index i, we can answer each query in O(26) time (since there are 26 lowercase English letters), making the solution scalable.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(q * n^2) | O(1) | Check all substrings individually |
| Optimal | O(n + q * 26) | O(n * 26) | Use prefix sums to count occurrences efficiently |
Algorithm Walkthrough
- Initialize a 2D array
prefixof size(len(s) + 1) x 26, whereprefix[i][ch]stores the number of occurrences of characterchins[0..i-1]. - Iterate through the string
s. For each character at indexi, update the prefix sums for all characters.prefix[i+1][ch] = prefix[i][ch] + (1 if s[i] == ch else 0). - Initialize an empty list
ansto store the results for each query. - For each query
[li, ri], compute the count of each character ins[li..ri]ascount_c = prefix[ri+1][c] - prefix[li][c]. - For each character
c, addcount_c * (count_c + 1) // 2to the total for that query. - Append the total to
ans. - Return
ans.
Why it works: By using prefix sums, we can efficiently compute the number of times each character occurs in a given range. Counting combinations of start and end positions for each character gives the number of same-end substrings. This avoids iterating over every possible substring explicitly.
Python Solution
from typing import List
class Solution:
def sameEndSubstringCount(self, s: str, queries: List[List[int]]) -> List[int]:
n = len(s)
# Initialize prefix sum array
prefix = [[0] * 26 for _ in range(n + 1)]
for i in range(n):
for c in range(26):
prefix[i + 1][c] = prefix[i][c]
prefix[i + 1][ord(s[i]) - ord('a')] += 1
ans = []
for li, ri in queries:
total = 0
for c in range(26):
count_c = prefix[ri + 1][c] - prefix[li][c]
total += count_c * (count_c + 1) // 2
ans.append(total)
return ans
This Python implementation first builds the prefix sum table for all 26 letters. For each query, it computes the counts of each letter in the substring and calculates the number of same-end substrings using the combinatorial formula. This ensures that each query is answered in O(26) time.
Go Solution
func sameEndSubstringCount(s string, queries [][]int) []int {
n := len(s)
prefix := make([][]int, n+1)
for i := range prefix {
prefix[i] = make([]int, 26)
}
for i := 0; i < n; i++ {
for c := 0; c < 26; c++ {
prefix[i+1][c] = prefix[i][c]
}
prefix[i+1][s[i]-'a']++
}
ans := make([]int, len(queries))
for idx, q := range queries {
li, ri := q[0], q[1]
total := 0
for c := 0; c < 26; c++ {
countC := prefix[ri+1][c] - prefix[li][c]
total += countC * (countC + 1) / 2
}
ans[idx] = total
}
return ans
}
In Go, the main difference is explicit slice initialization. The prefix sum array is pre-allocated and updated similarly. Integer division in Go is safe here as all operations involve integers.
Worked Examples
Example 1: s = "abcaab", queries = [[0,0],[1,4],[2,5],[0,5]]
| Query | Substring | Character Counts | Same-End Count Calculation | Total |
|---|---|---|---|---|
| [0,0] | "a" | {'a':1} | 1*(1+1)/2 = 1 | 1 |
| [1,4] | "bcaa" | {'b':1,'c':1,'a':2} | 1,1,3 | 5 |
| [2,5] | "caab" | {'c':1,'a':2,'b':1} | 1,3,1 | 5 |
| [0,5] | "abcaab" | {'a':3,'b':2,'c':1} | 6,3,1 | 10 |
Example 2: s = "abcd", queries = [[0,3]]
| Query | Substring | Character Counts | Same-End Count Calculation | Total |
|---|---|---|---|---|
| [0,3] | "abcd" | {'a':1,'b':1,'c':1,'d':1} | 1+1+1+1 | 4 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + q * 26) | Building prefix sums takes O(n*26), each query O(26) |
| Space | O(n * 26) | Prefix sum array stores 26 integers per character position |
The approach efficiently balances precomputation with query evaluation, ensuring scalability to the problem limits.
Test Cases
# Basic examples
assert Solution().sameEndSubstringCount("abcaab", [[0,0],[1,4],[2,5],[0,5]]) == [1,5,5,10]
assert Solution().sameEndSubstringCount("abcd", [[0,3]]) == [4]
# Edge cases
assert Solution().sameEndSubstringCount("aa", [[0,0],[0,1]]) == [1,3] # single letters, double letters
assert Solution().sameEndSubstringCount("abc", [[0,2]]) == [3] # all distinct letters
assert Solution().sameEndSubstringCount("aaa", [[0,2]]) == [6] # all same letters
# Maximum input size stress test
s = "a" * 30000
queries = [[0, 29999]]
assert Solution().sameEndSubstringCount(s, queries) == [30000*30001//2]
| Test | Why |
|---|---|
| Basic examples | Validates problem statement examples |
| Single-letter and double-letter | Checks small ranges and trivial substrings |
| All distinct letters | Ensures counting works with no repeated letters |
| All same letters | Confirms combinatorial formula correctness |
| Maximum |