LeetCode 1078 - Occurrences After Bigram
The problem gives us a string called text and two target words, first and second. We must find every occurrence where the words appear consecutively in the exact order: For every such occurrence, we return the value of third.
Difficulty: 🟢 Easy
Topics: String
Solution
Problem Understanding
The problem gives us a string called text and two target words, first and second. We must find every occurrence where the words appear consecutively in the exact order:
first second third
For every such occurrence, we return the value of third.
The important detail is that the words must appear immediately next to each other in sequence. We are not searching for words that appear anywhere later in the string. The pattern must be formed by three consecutive words.
For example, consider:
"alice is a good girl she is a good student"
with:
first = "a"
second = "good"
The valid triples are:
"a good girl"
"a good student"
The corresponding third words are:
["girl", "student"]
The input text is guaranteed to contain only lowercase English letters and single spaces between words. There are no leading or trailing spaces, which means we can safely split the string by spaces to obtain a clean list of words.
The constraints are small:
text.length <= 1000- Each word length is at most
10
Because the input size is small, even straightforward linear scans are efficient enough. There is no need for advanced string matching algorithms such as KMP or trie-based searching.
Several edge cases are important:
- The text may contain fewer than three words, which means no valid triple can exist.
- Multiple matches may overlap.
- The same word may appear many times.
- The result can be empty if the pattern never appears.
- The third word can itself equal
firstorsecond.
A correct solution must carefully scan consecutive triples of words and collect every valid third word.
Approaches
Brute Force Approach
The brute force idea is to examine every possible sequence of three consecutive words in the text.
First, split the text into an array of words. Then, for every index i, check whether:
words[i] == first
words[i + 1] == second
If both conditions are true, append:
words[i + 2]
to the answer list.
This approach is correct because every valid occurrence of the pattern must appear as three consecutive words somewhere in the array. By checking all possible triples, we guarantee that no valid occurrence is missed.
Although this is already efficient enough for the problem constraints, we can still describe it as the baseline brute force solution because it explicitly checks every possible triple independently.
Optimal Approach
The optimal approach is effectively the same linear scan, but viewed more carefully as a sliding window over the words.
The key observation is that we only ever need to inspect three consecutive words at a time. Since the words are already separated by spaces, we can process the text in a single pass after splitting it.
Instead of repeatedly constructing substrings or performing expensive searches, we simply move through the array once and compare adjacent words directly.
This gives us a clean linear-time solution with minimal overhead.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n) | O(n) | Checks every triple of consecutive words |
| Optimal | O(n) | O(n) | Single linear scan using consecutive word comparison |
Here, n represents the number of words in the text.
Algorithm Walkthrough
- Split the input string
textinto an array calledwordsusing spaces as separators.
This transforms the problem from string processing into array processing. Since the problem guarantees single spaces and no leading or trailing spaces, the split operation produces a clean list of words. 2. Create an empty result list.
This list will store every valid third word we discover.
3. Iterate through the array from index 0 to len(words) - 3.
We stop at len(words) - 3 because each check requires access to:
words[i]words[i + 1]words[i + 2]
- For each index
i, compare:
words[i]withfirstwords[i + 1]withsecond
These two comparisons determine whether the current position starts a valid pattern.
5. If both words match, append words[i + 2] to the result list.
The third word immediately following the pair is exactly what the problem asks us to collect. 6. Continue scanning until all valid triples have been checked. 7. Return the result list.
Why it works
The algorithm works because every valid occurrence of the pattern "first second third" must occupy three consecutive positions in the word array. By scanning all consecutive triples exactly once, we guarantee that every valid third word is collected and no invalid word is included.
Python Solution
from typing import List
class Solution:
def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
words = text.split()
result = []
for i in range(len(words) - 2):
if words[i] == first and words[i + 1] == second:
result.append(words[i + 2])
return result
The implementation begins by splitting the input text into a list of individual words. This allows direct indexed access to consecutive words without complicated string manipulation.
The result list stores all matching third words.
The loop runs until len(words) - 2 because we must safely access the next two positions for every index. Inside the loop, the algorithm checks whether the current word equals first and the next word equals second.
Whenever the pair matches, the algorithm appends the immediately following word to the result list.
Finally, the method returns the completed result list.
Go Solution
package main
import "strings"
func findOcurrences(text string, first string, second string) []string {
words := strings.Split(text, " ")
result := []string{}
for i := 0; i < len(words)-2; i++ {
if words[i] == first && words[i+1] == second {
result = append(result, words[i+2])
}
}
return result
}
The Go solution follows the same logic as the Python implementation. The text is split into words using strings.Split, and the algorithm scans every consecutive triple.
One Go-specific detail is slice handling. The result slice is initialized as an empty slice:
result := []string{}
Appending elements dynamically grows the slice as needed.
There are no integer overflow concerns because the input size is very small.
Worked Examples
Example 1
Input:
text = "alice is a good girl she is a good student"
first = "a"
second = "good"
After splitting:
| Index | Word |
|---|---|
| 0 | alice |
| 1 | is |
| 2 | a |
| 3 | good |
| 4 | girl |
| 5 | she |
| 6 | is |
| 7 | a |
| 8 | good |
| 9 | student |
Now scan consecutive triples.
| i | words[i] | words[i+1] | words[i+2] | Match? | Result |
|---|---|---|---|---|---|
| 0 | alice | is | a | No | [] |
| 1 | is | a | good | No | [] |
| 2 | a | good | girl | Yes | ["girl"] |
| 3 | good | girl | she | No | ["girl"] |
| 4 | girl | she | is | No | ["girl"] |
| 5 | she | is | a | No | ["girl"] |
| 6 | is | a | good | No | ["girl"] |
| 7 | a | good | student | Yes | ["girl", "student"] |
Final output:
["girl", "student"]
Example 2
Input:
text = "we will we will rock you"
first = "we"
second = "will"
After splitting:
| Index | Word |
|---|---|
| 0 | we |
| 1 | will |
| 2 | we |
| 3 | will |
| 4 | rock |
| 5 | you |
Scan consecutive triples.
| i | words[i] | words[i+1] | words[i+2] | Match? | Result |
|---|---|---|---|---|---|
| 0 | we | will | we | Yes | ["we"] |
| 1 | will | we | will | No | ["we"] |
| 2 | we | will | rock | Yes | ["we", "rock"] |
| 3 | will | rock | you | No | ["we", "rock"] |
Final output:
["we", "rock"]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each word is processed once in a single linear scan |
| Space | O(n) | The split word array and output list require additional storage |
The algorithm performs one pass through the list of words, making the runtime linear in the number of words. The space complexity is also linear because splitting the text creates a separate array of words.
Test Cases
from typing import List
class Solution:
def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
words = text.split()
result = []
for i in range(len(words) - 2):
if words[i] == first and words[i + 1] == second:
result.append(words[i + 2])
return result
sol = Solution()
# Provided example 1
assert sol.findOcurrences(
"alice is a good girl she is a good student",
"a",
"good"
) == ["girl", "student"]
# Provided example 2
assert sol.findOcurrences(
"we will we will rock you",
"we",
"will"
) == ["we", "rock"]
# No matching pair
assert sol.findOcurrences(
"hello world example",
"a",
"b"
) == []
# Exactly three words with one match
assert sol.findOcurrences(
"a b c",
"a",
"b"
) == ["c"]
# Overlapping occurrences
assert sol.findOcurrences(
"a a a a",
"a",
"a"
) == ["a", "a"]
# Text shorter than three words
assert sol.findOcurrences(
"hello world",
"hello",
"world"
) == []
# Multiple separated matches
assert sol.findOcurrences(
"x y z x y a x y b",
"x",
"y"
) == ["z", "a", "b"]
# Match at the end of the text
assert sol.findOcurrences(
"one two three four",
"two",
"three"
) == ["four"]
# Repeated words throughout the text
assert sol.findOcurrences(
"go go stop go go wait",
"go",
"go"
) == ["stop", "wait"]
| Test | Why |
|---|---|
"alice is a good girl she is a good student" |
Validates multiple matches |
"we will we will rock you" |
Validates overlapping scanning behavior |
"hello world example" |
Ensures empty result when no match exists |
"a b c" |
Smallest valid input with one match |
"a a a a" |
Tests overlapping occurrences |
"hello world" |
Verifies handling of fewer than three words |
"x y z x y a x y b" |
Tests multiple non-adjacent matches |
"one two three four" |
Verifies match near the end |
"go go stop go go wait" |
Tests repeated words and multiple results |
Edge Cases
One important edge case occurs when the text contains fewer than three words. Since the required pattern contains exactly three consecutive words, no valid occurrence can exist. A naive implementation that blindly accesses words[i + 2] could produce an index error. The implementation avoids this by iterating only up to len(words) - 2.
Another important case involves overlapping matches. Consider:
"a a a a"
with:
first = "a"
second = "a"
There are two valid triples:
"a a a"
"a a a"
starting at different positions. Some implementations accidentally skip overlapping matches by advancing too aggressively after finding a match. This solution correctly checks every consecutive starting position.
A third edge case occurs when no matching pair exists anywhere in the text. The algorithm must return an empty list rather than None or an error. Since the result list is initialized as empty and only appended to on successful matches, the implementation naturally handles this case correctly.
Another subtle case appears when the third word is identical to either first or second. The algorithm does not assume uniqueness among words and simply records the third word whenever the first two positions match the target pair. This guarantees correct behavior even with repeated or identical words.