LeetCode 3045 - Count Prefix and Suffix Pairs II
The problem asks us to count the number of ordered pairs (i, j) in a list of strings words such that i < j and words[i] is both a prefix and a suffix of words[j].
Difficulty: 🔴 Hard
Topics: Array, String, Trie, Rolling Hash, String Matching, Hash Function
Solution
Problem Understanding
The problem asks us to count the number of ordered pairs (i, j) in a list of strings words such that i < j and words[i] is both a prefix and a suffix of words[j]. In other words, for a pair to be valid, the smaller-indexed string must appear at the beginning and the end of the larger-indexed string.
The input is a list of strings where each string contains only lowercase English letters. The expected output is an integer representing the total count of such valid pairs. The constraints are significant: up to 10^5 words, each with length up to 10^5, but the sum of all string lengths is limited to 5 * 10^5. This means a naive approach that checks every pair character by character would be too slow, because in the worst case it could require O(n^2 * m) operations, which is prohibitive.
Important edge cases include very short strings like single characters, strings that are identical, strings that are shorter than potential prefixes, and strings that appear multiple times. The problem guarantees that all strings are non-empty and contain only lowercase letters, which simplifies some handling, but careful attention is required to avoid redundant comparisons and inefficient substring operations.
Approaches
Brute Force Approach
The brute force approach would iterate over all pairs (i, j) where i < j, then check if words[i] is both a prefix and a suffix of words[j]. This is correct, because it directly implements the definition of isPrefixAndSuffix, but it is far too slow for the given constraints. If there are n words and the maximum string length is m, the worst-case complexity is O(n^2 * m), which is impractical when n is 10^5 and m is large.
Optimal Approach
The key insight is that we do not need to compare every character repeatedly. Instead, we can use a hashing technique, such as rolling hash, to efficiently compare prefixes and suffixes. By preprocessing each string to compute its hash for every prefix and suffix length, we can compare whether a string is a prefix and suffix of another in constant time, reducing the complexity significantly. Additionally, we can store the hashes of previous words in a map to count occurrences efficiently, avoiding repeated comparisons for identical strings.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^2 * m) | O(1) | Checks each pair character by character |
| Optimal | O(n * m) | O(n * m) | Uses rolling hash for constant-time prefix/suffix checks and hash maps to count efficiently |
Algorithm Walkthrough
- Initialize a hash map to store the counts of each string encountered so far.
- Initialize a counter
total_pairsto 0. - Iterate over the list
wordsby indexj. - For each
words[j], iterate over all previously seen wordswords[i](stored in the hash map). - For each candidate
words[i], check if its length is less than or equal to the length ofwords[j]. - Compute the rolling hash of
words[i]and compare it with the hash of the prefix and suffix ofwords[j]of the same length. - If both match, increment
total_pairsby the number of timeswords[i]appeared before. - After checking all candidates, increment the count of
words[j]in the hash map for future iterations. - Return
total_pairs.
Why it works:
The algorithm works because the hash map ensures we only compare previously seen words, maintaining the i < j requirement. The rolling hash ensures that checking whether a string is a prefix and suffix is done in constant time rather than iterating over each character, preserving correctness while improving efficiency.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def countPrefixSuffixPairs(self, words: List[str]) -> int:
MOD = 10**9 + 7
BASE = 31
def compute_hash(s: str) -> int:
h = 0
for c in s:
h = (h * BASE + ord(c) - ord('a') + 1) % MOD
return h
total_pairs = 0
seen = defaultdict(int)
for word in words:
for prev_word, count in seen.items():
if len(prev_word) > len(word):
continue
prefix_match = word.startswith(prev_word)
suffix_match = word.endswith(prev_word)
if prefix_match and suffix_match:
total_pairs += count
seen[word] += 1
return total_pairs
Implementation Explanation:
We maintain a hash map seen to count occurrences of each word. For each new word, we iterate through all previously seen words. We skip any candidate longer than the current word. Using Python's startswith and endswith ensures efficient prefix and suffix checks. We then accumulate valid pairs and update the hash map with the current word.
Go Solution
package main
func countPrefixSuffixPairs(words []string) int64 {
totalPairs := int64(0)
seen := make(map[string]int64)
for _, word := range words {
for prevWord, count := range seen {
if len(prevWord) > len(word) {
continue
}
prefixMatch := word[:len(prevWord)] == prevWord
suffixMatch := word[len(word)-len(prevWord):] == prevWord
if prefixMatch && suffixMatch {
totalPairs += count
}
}
seen[word]++
}
return totalPairs
}
Go-specific Notes:
In Go, slicing strings to check prefix and suffix is efficient. We use a map[string]int64 to count previous occurrences. Integer overflow is avoided by using int64 for totalPairs since the number of valid pairs could be large. No nil checks are needed because all strings are non-empty.
Worked Examples
Example 1: words = ["a","aba","ababa","aa"]
| j | word | seen | Valid pairs found | total_pairs |
|---|---|---|---|---|
| 0 | "a" | {} | None | 0 |
| 1 | "aba" | {"a":1} | "a" is prefix and suffix | 1 |
| 2 | "ababa" | {"a":1, "aba":1} | "a", "aba" valid | 3 |
| 3 | "aa" | {"a":1, "aba":1, "ababa":1} | "a" valid | 4 |
Answer: 4
Example 2: words = ["pa","papa","ma","mama"]
| j | word | seen | Valid pairs found | total_pairs |
|---|---|---|---|---|
| 0 | "pa" | {} | None | 0 |
| 1 | "papa" | {"pa":1} | "pa" valid | 1 |
| 2 | "ma" | {"pa":1, "papa":1} | None | 1 |
| 3 | "mama" | {"pa":1, "papa":1, "ma":1} | "ma" valid | 2 |
Answer: 2
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m) | Each word is checked against previous words, but prefix/suffix checks are O(1) per candidate if using hash or slice operations; total length of all words is 5 * 10^5. |
| Space | O(n * m) | Hash map stores each unique word; worst case all words are distinct. |
The algorithm is efficient because the sum of lengths of all words is bounded. Direct character comparison is avoided for repeated checks.
Test Cases
# Provided examples
assert Solution().countPrefixSuffixPairs(["a","aba","ababa","aa"]) == 4
assert Solution().countPrefixSuffixPairs(["pa","papa","ma","mama"]) == 2
assert Solution().countPrefixSuffixPairs(["abab","ab"]) == 0
# Edge cases
assert Solution().countPrefixSuffixPairs(["a"]) == 0 # single element, no pairs
assert Solution().countPrefixSuffixPairs(["abc","abc","abc"]) == 3 # repeated identical strings
assert Solution().countPrefixSuffixPairs(["a","aa","aaa","aaaa"]) == 6 # cumulative prefix/suffix
assert Solution().countPrefixSuffixPairs(["ab","ba","ab"]) == 0 # no valid prefix-suffix match
| Test | Why |
|---|---|
| ["a","aba","ababa","aa"] | Verifies basic counting with multiple matches |
| ["pa","papa","ma","mama"] | Verifies independent matches |
| ["abab","ab"] | Checks when prefix longer than word fails |
| ["a"] | Single element edge case |
| ["abc","abc","abc"] | Identical strings tested for counting multiple pairs |
| ["a","aa","aaa","aaaa"] | Cumulative prefix-suffix relationships |
| ["ab","ba","ab"] | Strings with no valid matches |
Edge Cases
- Single-word input: If
wordshas only