LeetCode 1316 - Distinct Echo Substrings
The problem asks us to find the number of distinct non-empty substrings of a given string text that can be expressed as
Difficulty: 🔴 Hard
Topics: String, Trie, Rolling Hash, Hash Function
Solution
Problem Understanding
The problem asks us to find the number of distinct non-empty substrings of a given string text that can be expressed as the concatenation of some string with itself. In other words, we need to count all substrings of the form a + a, where a is any non-empty string. For example, "abcabc" can be represented as "abc" + "abc", making it an echo substring. Substrings like "aa" and "abab" also qualify if they satisfy this repetition property.
The input is a string text of length between 1 and 2000, containing only lowercase English letters. The output is an integer representing the count of distinct echo substrings.
Important points include recognizing that substrings may overlap and that only distinct substrings should be counted. Edge cases involve very short strings (length 1, which cannot have echo substrings) or strings with repeated single characters (like "aaaaa") where multiple overlapping echo substrings exist.
The constraints tell us that a brute-force solution with O(n^3) complexity is likely too slow, and we need a more optimized method, potentially using hashing or tries to detect repeated patterns efficiently.
Approaches
The brute-force approach involves generating all possible substrings of text and checking for each whether it can be split into two equal halves. This works by iterating over every possible starting index i and every even length l (since only even-length strings can form a + a). For each candidate substring, we check if the first half equals the second half. While correct, this approach has a time complexity of O(n^3) because generating all substrings is O(n^2) and comparing halves is O(n), making it infeasible for n = 2000.
The key insight for an optimal approach is to use rolling hash or trie-based techniques to compare substrings efficiently. Rolling hash allows constant-time substring comparison after preprocessing the hash values of all prefixes. By iterating over possible lengths and using hashing, we can check whether the two halves of a substring are equal in O(1) time per substring. This reduces the time complexity significantly.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^3) | O(n^2) | Generate all substrings and compare halves directly |
| Rolling Hash / Trie | O(n^2) | O(n^2) | Use hashing to compare substrings efficiently |
Algorithm Walkthrough
- Precompute the rolling hash of all prefixes of
text. Choose a base (e.g., 26) and a large prime modulus to avoid collisions. - Initialize an empty set
seento store distinct echo substrings. - Iterate over all possible lengths
lengthof the echo substring. Since an echo substring isa + a, only consider even lengths. - For each starting index
i, calculate the hash of the first half (itoi + length // 2) and the second half (i + length // 2toi + length). - If the two halves have equal hash values, add the substring
text[i:i + length]toseen. - Continue until all even lengths and starting indices have been processed.
- Return the size of the
seenset, which gives the number of distinct echo substrings.
Why it works: The rolling hash ensures that substring equality checks are efficient and collision-resistant with a proper choice of base and modulus. By iterating over all starting indices and lengths, we guarantee that all potential echo substrings are considered. Using a set ensures that duplicates are counted only once.
Python Solution
class Solution:
def distinctEchoSubstrings(self, text: str) -> int:
n = len(text)
base = 26
mod = 10**9 + 7
# Precompute prefix hashes and powers of base
prefix_hash = [0] * (n + 1)
power = [1] * (n + 1)
for i in range(n):
prefix_hash[i + 1] = (prefix_hash[i] * base + ord(text[i]) - ord('a')) % mod
power[i + 1] = (power[i] * base) % mod
def get_hash(l, r):
return (prefix_hash[r] - prefix_hash[l] * power[r - l]) % mod
seen = set()
for length in range(2, n + 1, 2):
for i in range(n - length + 1):
mid = i + length // 2
if get_hash(i, mid) == get_hash(mid, i + length):
seen.add(text[i:i + length])
return len(seen)
Implementation Explanation: First, the code computes rolling hash values for all prefixes and precomputes powers of the base to enable O(1) substring hash queries. For each possible even length, it iterates over all starting positions, calculates the hashes of the two halves, and if they match, adds the substring to the set seen. Finally, it returns the number of distinct echo substrings.
Go Solution
func distinctEchoSubstrings(text string) int {
n := len(text)
base := 26
mod := int(1e9 + 7)
prefixHash := make([]int, n+1)
power := make([]int, n+1)
power[0] = 1
for i := 0; i < n; i++ {
prefixHash[i+1] = (prefixHash[i]*base + int(text[i]-'a')) % mod
power[i+1] = (power[i] * base) % mod
}
getHash := func(l, r int) int {
return (prefixHash[r] - prefixHash[l]*power[r-l]%mod + mod) % mod
}
seen := make(map[string]struct{})
for length := 2; length <= n; length += 2 {
for i := 0; i <= n-length; i++ {
mid := i + length/2
if getHash(i, mid) == getHash(mid, i+length) {
seen[text[i:i+length]] = struct{}{}
}
}
}
return len(seen)
}
Go Notes: The Go version mirrors the Python solution. Maps are used instead of sets, and modulo arithmetic ensures positive values. Slices are used to extract substrings when storing in the map.
Worked Examples
Example 1: "abcabcabc"
| i | length | first half | second half | Equal? | seen set |
|---|---|---|---|---|---|
| 0 | 6 | "abc" | "abc" | Yes | {"abcabc"} |
| 1 | 6 | "bca" | "bca" | Yes | {"abcabc","bcabca"} |
| 2 | 6 | "cab" | "cab" | Yes | {"abcabc","bcabca","cabcab"} |
| 3 | 6 | "abc" | "abc" | Yes | already in set |
| ... |
Output: 3
Example 2: "leetcodeleetcode"
| i | length | first half | second half | Equal? | seen set |
|---|---|---|---|---|---|
| 1 | 2 | "ee" | "ee" | Yes | {"ee"} |
| 0 | 16 | "leetcode" | "leetcode" | Yes | {"ee","leetcodeleetcode"} |
Output: 2
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n^2) | Iterate over O(n) starting positions for O(n) lengths, with O(1) hash comparisons |
| Space | O(n^2) | Storing prefix hashes O(n), set/map may contain up to O(n^2) substrings |
The O(n^2) time is feasible for n <= 2000. Rolling hash ensures constant-time substring equality checks, avoiding O(n) comparisons.
Test Cases
# Provided examples
assert Solution().distinctEchoSubstrings("abcabcabc") == 3 # multiple overlapping substrings
assert Solution().distinctEchoSubstrings("leetcodeleetcode") == 2 # includes long substring
# Edge cases
assert Solution().distinctEchoSubstrings("a") == 0 # length 1, no echo substrings
assert Solution().distinctEchoSubstrings("aa") == 1 # simplest echo substring
assert Solution().distinctEchoSubstrings("aaaa") == 3 # "aa", "aaaa" overlapping
# Random cases
assert Solution().distinctEchoSubstrings("abcd") == 0 # no repeated halves
assert Solution().distinctEchoSubstrings("ababab") == 2 # "abab", "ababab"
| Test | Why |
|---|---|
| "abcabcabc" | Multiple overlapping echo substrings |
| "leetcodeleetcode" | Includes long substring |
| "a" | Too short, should return 0 |
| "aa" | Minimal valid echo substring |
| "aaaa" | Multiple overlapping repeats |
| "abcd" | No echo substring present |
| "ababab" | Repeating pattern of odd length |