LeetCode 1065 - Index Pairs of a String
The problem gives us a string called text and a list of dictionary words called words. We must find every substring inside text that exactly matches one of the words in the dictionary.
Difficulty: 🟢 Easy
Topics: Array, String, Trie, Sorting
Solution
Problem Understanding
The problem gives us a string called text and a list of dictionary words called words. We must find every substring inside text that exactly matches one of the words in the dictionary.
For every valid match, we return the pair [i, j], where:
iis the starting index of the substringjis the ending index of the substring, inclusive
The output must contain all matching index pairs, sorted first by the starting index and then by the ending index.
For example, if:
text = "ababa"
words = ["aba", "ab"]
then:
"ab"appears from index0to1"aba"appears from index0to2"ab"appears again from index2to3"aba"appears again from index2to4
So the answer becomes:
[[0,1],[0,2],[2,3],[2,4]]
An important detail is that overlapping matches are allowed. The substring at [0,2] and the substring at [2,4] both use index 2, and both must be included.
The constraints are fairly small:
text.length <= 100words.length <= 20words[i].length <= 50
These limits tell us that even moderately inefficient solutions are acceptable. We do not need highly advanced optimization techniques like suffix automata or Aho-Corasick. A clean Trie-based solution or direct matching solution will work comfortably within limits.
There are several edge cases worth identifying early:
- Multiple matches can start at the same index.
- Matches may overlap.
- Some words may never appear in the text.
- Single-character words are valid.
- The entire text itself may be a valid word.
- Different words may share prefixes, which makes Trie structures especially useful.
Approaches
Brute Force Approach
The most straightforward solution is to check every possible substring of text.
For every starting index i, we can try every ending index j, extract the substring text[i:j+1], and check whether it exists in the dictionary.
To make lookup efficient, we place all words into a hash set.
The algorithm works because every possible substring is examined exactly once. If a substring belongs to words, we record its indices.
However, this approach repeatedly creates substrings, which can be inefficient. Even though the constraints are small enough for it to pass, it does unnecessary work because many substrings cannot possibly match any dictionary word.
Optimal Approach, Trie-Based Search
A better solution uses a Trie.
A Trie is useful because many words may share prefixes. Instead of generating every substring independently, we start from each index in text and walk character by character through the Trie.
At each step:
- If the current character does not exist in the Trie, we stop immediately.
- If we reach a Trie node marked as a complete word, we record a valid index pair.
This avoids exploring substrings that cannot possibly match any word.
The key insight is that the Trie lets us validate prefixes incrementally instead of building complete substrings repeatedly.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n³) | O(m) | Checks every substring and creates substring objects repeatedly |
| Optimal | O(n × L) | O(total characters in words) | Trie enables early stopping during matching |
Where:
nis the length oftextLis the maximum word lengthmis the number of words
Algorithm Walkthrough
Trie Construction
We first build a Trie from all words in the dictionary.
Each Trie node contains:
- A map of child characters
- A boolean flag indicating whether the path forms a complete word
For example, if the words are:
["ab", "aba"]
then the Trie stores:
a -> b (word)
-> a (word)
Search Process
We now iterate through every starting position in text.
Step-by-Step Algorithm
- Create an empty Trie root node.
- Insert every word into the Trie.
For each character:
- If the child node does not exist, create it.
- Move to the child node.
After the final character, mark the node as a complete word.
3. Initialize an empty result list.
4. For every starting index start in text:
- Begin traversal from the Trie root.
- Move forward character by character through
text.
- For every ending index
endstarting fromstart:
- Check whether
text[end]exists as a child in the Trie. - If it does not exist, stop searching from this starting index because no longer substring can match.
- Otherwise, move to the corresponding child node.
- If the current Trie node marks the end of a word:
- Add
[start, end]to the result list.
- Continue until all starting positions are processed.
- Return the result list.
Why it works
The Trie guarantees that every traversal corresponds to a valid prefix of at least one dictionary word. Whenever we reach a node marked as a complete word, the substring from start to end exactly matches a dictionary word.
Because we begin traversal from every starting index in text, every possible valid substring is considered. Because traversal stops immediately when a prefix becomes invalid, no unnecessary substrings are explored.
Therefore, the algorithm returns all valid index pairs exactly once and in sorted order.
Python Solution
from typing import List
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False
class Solution:
def indexPairs(self, text: str, words: List[str]) -> List[List[int]]:
root = TrieNode()
# Build Trie
for word in words:
node = root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_word = True
result = []
# Search from every starting index
for start in range(len(text)):
node = root
for end in range(start, len(text)):
char = text[end]
if char not in node.children:
break
node = node.children[char]
if node.is_word:
result.append([start, end])
return result
The implementation begins by defining a TrieNode class. Each node stores:
children, a dictionary mapping characters to child nodesis_word, which indicates whether the current path forms a complete word
The Trie is built by inserting every word character by character. Shared prefixes automatically reuse existing nodes.
After construction, the algorithm searches from every possible starting index in text.
For each starting index:
- Traversal begins from the Trie root.
- Characters are consumed one at a time.
- If the next character does not exist in the Trie, traversal stops immediately.
- If the current node represents a complete word, the corresponding index pair is added to the answer.
Because the outer loop processes starting indices in ascending order and the inner loop processes ending indices in ascending order, the output is naturally sorted as required.
Go Solution
package main
type TrieNode struct {
children map[byte]*TrieNode
isWord bool
}
func newTrieNode() *TrieNode {
return &TrieNode{
children: make(map[byte]*TrieNode),
}
}
func indexPairs(text string, words []string) [][]int {
root := newTrieNode()
// Build Trie
for _, word := range words {
node := root
for i := 0; i < len(word); i++ {
char := word[i]
if _, exists := node.children[char]; !exists {
node.children[char] = newTrieNode()
}
node = node.children[char]
}
node.isWord = true
}
result := [][]int{}
// Search from every starting index
for start := 0; start < len(text); start++ {
node := root
for end := start; end < len(text); end++ {
char := text[end]
nextNode, exists := node.children[char]
if !exists {
break
}
node = nextNode
if node.isWord {
result = append(result, []int{start, end})
}
}
}
return result
}
The Go implementation closely follows the Python version.
The Trie uses:
map[byte]*TrieNodefor child pointers- A boolean
isWordfield
Since the input contains only lowercase English letters, using byte is sufficient and efficient.
The result is stored as a slice of integer slices, [][]int.
Go slices naturally handle dynamic resizing during append operations, so no special memory management is needed.
Worked Examples
Example 1
text = "thestoryofleetcodeandme"
words = ["story","fleet","leetcode"]
Trie Structure
s -> t -> o -> r -> y
f -> l -> e -> e -> t
l -> e -> e -> t -> c -> o -> d -> e
Traversal
| Start Index | Traversal | Match Found |
|---|---|---|
| 0 | "t" not in Trie root | None |
| 1 | "h" not in Trie root | None |
| 2 | "e" not in Trie root | None |
| 3 | s -> t -> o -> r -> y | [3,7] |
| 9 | l -> e -> e -> t -> c -> o -> d -> e | [9,17] |
| 10 | e not in Trie root | None |
| 13 | f -> l -> e -> e -> t | [10,14] not possible because traversal started at 10 |
The valid output becomes:
[[3,7],[9,17]]
However, "fleet" also exists starting at index 9 inside "leetcode"? No. The substring beginning at index 9 is "leetcode...", so "fleet" does not match there.
The final output is:
[[3,7],[9,17]]
Actually, the official example output also includes:
[9,13]
because "fleet" appears starting at index 9 in "fleet" inside the text. Let us verify carefully:
thestoryofleetcodeandme
^
10
The substring "fleet" does not occur. The official example is:
[[3,7],[9,13],[10,17]]
This corresponds to:
"story"at[3,7]"fleet"at[9,13]"leetcode"at[10,17]
because "fleet" overlaps with "leetcode" in the sequence "fleetcode".
Example 2
text = "ababa"
words = ["aba","ab"]
Trie Structure
a -> b (word)
-> a (word)
Detailed Traversal
| Start | End | Current Substring | Trie Valid | Word Match | Result |
|---|---|---|---|---|---|
| 0 | 0 | "a" | Yes | No | [] |
| 0 | 1 | "ab" | Yes | Yes | [[0,1]] |
| 0 | 2 | "aba" | Yes | Yes | [[0,1],[0,2]] |
| 0 | 3 | "abab" | No | No | Stop |
| 1 | 1 | "b" | No | No | Stop |
| 2 | 2 | "a" | Yes | No | [] |
| 2 | 3 | "ab" | Yes | Yes | [[2,3]] |
| 2 | 4 | "aba" | Yes | Yes | [[2,3],[2,4]] |
Final result:
[[0,1],[0,2],[2,3],[2,4]]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n × L) | For each starting index, traversal continues at most the maximum word length |
| Space | O(total characters in words) | Trie stores every character from all words |
The Trie prevents unnecessary substring exploration. From each starting position, traversal stops immediately once the current substring is no longer a valid prefix.
If L is the maximum word length, then each starting position explores at most L characters before terminating.
The Trie size depends on the total number of characters inserted across all dictionary words.
Test Cases
from typing import List
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False
class Solution:
def indexPairs(self, text: str, words: List[str]) -> List[List[int]]:
root = TrieNode()
for word in words:
node = root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_word = True
result = []
for start in range(len(text)):
node = root
for end in range(start, len(text)):
char = text[end]
if char not in node.children:
break
node = node.children[char]
if node.is_word:
result.append([start, end])
return result
solution = Solution()
# Provided example with overlapping matches
assert solution.indexPairs(
"ababa",
["aba", "ab"]
) == [[0, 1], [0, 2], [2, 3], [2, 4]]
# Provided example with multiple dictionary words
assert solution.indexPairs(
"thestoryofleetcodeandme",
["story", "fleet", "leetcode"]
) == [[3, 7], [9, 13], [10, 17]]
# Single character match
assert solution.indexPairs(
"abc",
["a"]
) == [[0, 0]]
# Entire text is one word
assert solution.indexPairs(
"hello",
["hello"]
) == [[0, 4]]
# No matches
assert solution.indexPairs(
"abc",
["xyz"]
) == []
# Multiple overlapping occurrences
assert solution.indexPairs(
"aaaa",
["a", "aa"]
) == [
[0, 0], [0, 1],
[1, 1], [1, 2],
[2, 2], [2, 3],
[3, 3]
]
# Shared prefixes in Trie
assert solution.indexPairs(
"abcdef",
["ab", "abc", "abcd"]
) == [[0, 1], [0, 2], [0, 3]]
# Word appears multiple times
assert solution.indexPairs(
"catcat",
["cat"]
) == [[0, 2], [3, 5]]
# Match at end of string
assert solution.indexPairs(
"xyzabc",
["abc"]
) == [[3, 5]]
| Test | Why |
|---|---|
"ababa" with overlapping words |
Validates overlap handling |
"thestoryofleetcodeandme" |
Validates multiple matches and sorting |
| Single-character word | Tests smallest word size |
| Entire text as word | Tests full-length match |
| No matches | Ensures empty result handling |
"aaaa" with repeated patterns |
Stresses overlapping repeated matches |
| Shared prefixes | Verifies Trie prefix reuse |
| Repeated word occurrence | Ensures multiple appearances are found |
| Match at end of string | Tests boundary handling |
Edge Cases
Overlapping Matches
One of the most important edge cases is overlapping substrings.
For example:
text = "ababa"
words = ["aba"]
The word "aba" appears at both [0,2] and [2,4].
A buggy implementation might skip the second occurrence after finding the first one. Our algorithm avoids this problem because it independently starts a Trie traversal from every index in the text.
Shared Prefixes
Words may share long prefixes:
words = ["ab", "abc", "abcd"]
A naive implementation might repeatedly reprocess the same prefix characters.
The Trie handles this naturally because all words share the same prefix path. During traversal, every node can simultaneously represent:
- An intermediate prefix
- A complete word
- The beginning of a longer word
This allows all matches to be discovered efficiently.
Early Mismatch Termination
Consider:
text = "xyzabc"
words = ["abc"]
At indices 0, 1, and 2, traversal immediately fails because those characters do not exist at the Trie root.
Without early termination, an implementation might continue checking unnecessary substrings. The Trie-based approach stops immediately once a prefix becomes invalid, which keeps the algorithm efficient.