LeetCode 1078 - Occurrences After Bigram

The problem gives us a string called text and two target words, first and second. We must find every occurrence where the words appear consecutively in the exact order: For every such occurrence, we return the value of third.

LeetCode Problem 1078

Difficulty: 🟢 Easy
Topics: String

Solution

Problem Understanding

The problem gives us a string called text and two target words, first and second. We must find every occurrence where the words appear consecutively in the exact order:

first second third

For every such occurrence, we return the value of third.

The important detail is that the words must appear immediately next to each other in sequence. We are not searching for words that appear anywhere later in the string. The pattern must be formed by three consecutive words.

For example, consider:

"alice is a good girl she is a good student"

with:

first = "a"
second = "good"

The valid triples are:

"a good girl"
"a good student"

The corresponding third words are:

["girl", "student"]

The input text is guaranteed to contain only lowercase English letters and single spaces between words. There are no leading or trailing spaces, which means we can safely split the string by spaces to obtain a clean list of words.

The constraints are small:

  • text.length <= 1000
  • Each word length is at most 10

Because the input size is small, even straightforward linear scans are efficient enough. There is no need for advanced string matching algorithms such as KMP or trie-based searching.

Several edge cases are important:

  • The text may contain fewer than three words, which means no valid triple can exist.
  • Multiple matches may overlap.
  • The same word may appear many times.
  • The result can be empty if the pattern never appears.
  • The third word can itself equal first or second.

A correct solution must carefully scan consecutive triples of words and collect every valid third word.

Approaches

Brute Force Approach

The brute force idea is to examine every possible sequence of three consecutive words in the text.

First, split the text into an array of words. Then, for every index i, check whether:

words[i] == first
words[i + 1] == second

If both conditions are true, append:

words[i + 2]

to the answer list.

This approach is correct because every valid occurrence of the pattern must appear as three consecutive words somewhere in the array. By checking all possible triples, we guarantee that no valid occurrence is missed.

Although this is already efficient enough for the problem constraints, we can still describe it as the baseline brute force solution because it explicitly checks every possible triple independently.

Optimal Approach

The optimal approach is effectively the same linear scan, but viewed more carefully as a sliding window over the words.

The key observation is that we only ever need to inspect three consecutive words at a time. Since the words are already separated by spaces, we can process the text in a single pass after splitting it.

Instead of repeatedly constructing substrings or performing expensive searches, we simply move through the array once and compare adjacent words directly.

This gives us a clean linear-time solution with minimal overhead.

Approach Time Complexity Space Complexity Notes
Brute Force O(n) O(n) Checks every triple of consecutive words
Optimal O(n) O(n) Single linear scan using consecutive word comparison

Here, n represents the number of words in the text.

Algorithm Walkthrough

  1. Split the input string text into an array called words using spaces as separators.

This transforms the problem from string processing into array processing. Since the problem guarantees single spaces and no leading or trailing spaces, the split operation produces a clean list of words. 2. Create an empty result list.

This list will store every valid third word we discover. 3. Iterate through the array from index 0 to len(words) - 3.

We stop at len(words) - 3 because each check requires access to:

  • words[i]
  • words[i + 1]
  • words[i + 2]
  1. For each index i, compare:
  • words[i] with first
  • words[i + 1] with second

These two comparisons determine whether the current position starts a valid pattern. 5. If both words match, append words[i + 2] to the result list.

The third word immediately following the pair is exactly what the problem asks us to collect. 6. Continue scanning until all valid triples have been checked. 7. Return the result list.

Why it works

The algorithm works because every valid occurrence of the pattern "first second third" must occupy three consecutive positions in the word array. By scanning all consecutive triples exactly once, we guarantee that every valid third word is collected and no invalid word is included.

Python Solution

from typing import List

class Solution:
    def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
        words = text.split()
        result = []

        for i in range(len(words) - 2):
            if words[i] == first and words[i + 1] == second:
                result.append(words[i + 2])

        return result

The implementation begins by splitting the input text into a list of individual words. This allows direct indexed access to consecutive words without complicated string manipulation.

The result list stores all matching third words.

The loop runs until len(words) - 2 because we must safely access the next two positions for every index. Inside the loop, the algorithm checks whether the current word equals first and the next word equals second.

Whenever the pair matches, the algorithm appends the immediately following word to the result list.

Finally, the method returns the completed result list.

Go Solution

package main

import "strings"

func findOcurrences(text string, first string, second string) []string {
	words := strings.Split(text, " ")
	result := []string{}

	for i := 0; i < len(words)-2; i++ {
		if words[i] == first && words[i+1] == second {
			result = append(result, words[i+2])
		}
	}

	return result
}

The Go solution follows the same logic as the Python implementation. The text is split into words using strings.Split, and the algorithm scans every consecutive triple.

One Go-specific detail is slice handling. The result slice is initialized as an empty slice:

result := []string{}

Appending elements dynamically grows the slice as needed.

There are no integer overflow concerns because the input size is very small.

Worked Examples

Example 1

Input:

text = "alice is a good girl she is a good student"
first = "a"
second = "good"

After splitting:

Index Word
0 alice
1 is
2 a
3 good
4 girl
5 she
6 is
7 a
8 good
9 student

Now scan consecutive triples.

i words[i] words[i+1] words[i+2] Match? Result
0 alice is a No []
1 is a good No []
2 a good girl Yes ["girl"]
3 good girl she No ["girl"]
4 girl she is No ["girl"]
5 she is a No ["girl"]
6 is a good No ["girl"]
7 a good student Yes ["girl", "student"]

Final output:

["girl", "student"]

Example 2

Input:

text = "we will we will rock you"
first = "we"
second = "will"

After splitting:

Index Word
0 we
1 will
2 we
3 will
4 rock
5 you

Scan consecutive triples.

i words[i] words[i+1] words[i+2] Match? Result
0 we will we Yes ["we"]
1 will we will No ["we"]
2 we will rock Yes ["we", "rock"]
3 will rock you No ["we", "rock"]

Final output:

["we", "rock"]

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each word is processed once in a single linear scan
Space O(n) The split word array and output list require additional storage

The algorithm performs one pass through the list of words, making the runtime linear in the number of words. The space complexity is also linear because splitting the text creates a separate array of words.

Test Cases

from typing import List

class Solution:
    def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
        words = text.split()
        result = []

        for i in range(len(words) - 2):
            if words[i] == first and words[i + 1] == second:
                result.append(words[i + 2])

        return result

sol = Solution()

# Provided example 1
assert sol.findOcurrences(
    "alice is a good girl she is a good student",
    "a",
    "good"
) == ["girl", "student"]

# Provided example 2
assert sol.findOcurrences(
    "we will we will rock you",
    "we",
    "will"
) == ["we", "rock"]

# No matching pair
assert sol.findOcurrences(
    "hello world example",
    "a",
    "b"
) == []

# Exactly three words with one match
assert sol.findOcurrences(
    "a b c",
    "a",
    "b"
) == ["c"]

# Overlapping occurrences
assert sol.findOcurrences(
    "a a a a",
    "a",
    "a"
) == ["a", "a"]

# Text shorter than three words
assert sol.findOcurrences(
    "hello world",
    "hello",
    "world"
) == []

# Multiple separated matches
assert sol.findOcurrences(
    "x y z x y a x y b",
    "x",
    "y"
) == ["z", "a", "b"]

# Match at the end of the text
assert sol.findOcurrences(
    "one two three four",
    "two",
    "three"
) == ["four"]

# Repeated words throughout the text
assert sol.findOcurrences(
    "go go stop go go wait",
    "go",
    "go"
) == ["stop", "wait"]
Test Why
"alice is a good girl she is a good student" Validates multiple matches
"we will we will rock you" Validates overlapping scanning behavior
"hello world example" Ensures empty result when no match exists
"a b c" Smallest valid input with one match
"a a a a" Tests overlapping occurrences
"hello world" Verifies handling of fewer than three words
"x y z x y a x y b" Tests multiple non-adjacent matches
"one two three four" Verifies match near the end
"go go stop go go wait" Tests repeated words and multiple results

Edge Cases

One important edge case occurs when the text contains fewer than three words. Since the required pattern contains exactly three consecutive words, no valid occurrence can exist. A naive implementation that blindly accesses words[i + 2] could produce an index error. The implementation avoids this by iterating only up to len(words) - 2.

Another important case involves overlapping matches. Consider:

"a a a a"

with:

first = "a"
second = "a"

There are two valid triples:

"a a a"
"a a a"

starting at different positions. Some implementations accidentally skip overlapping matches by advancing too aggressively after finding a match. This solution correctly checks every consecutive starting position.

A third edge case occurs when no matching pair exists anywhere in the text. The algorithm must return an empty list rather than None or an error. Since the result list is initialized as empty and only appended to on successful matches, the implementation naturally handles this case correctly.

Another subtle case appears when the third word is identical to either first or second. The algorithm does not assume uniqueness among words and simply records the third word whenever the first two positions match the target pair. This guarantees correct behavior even with repeated or identical words.