LeetCode 3006 - Find Beautiful Indices in the Given Array I
The problem gives us a string s and two smaller strings, a and b. We need to find every index i where substring a appears in s, but only if there exists at least one occurrence of substring b close enough to it. More formally, an index i is considered beautiful when: 1.
Difficulty: 🟡 Medium
Topics: Two Pointers, String, Binary Search, Rolling Hash, String Matching, Hash Function
Solution
Problem Understanding
The problem gives us a string s and two smaller strings, a and b. We need to find every index i where substring a appears in s, but only if there exists at least one occurrence of substring b close enough to it.
More formally, an index i is considered beautiful when:
- The substring starting at index
imatchesa - There exists another index
jwhere the substring starting atjmatchesb - The distance between
iandjis at mostk
The output must contain all beautiful indices in sorted order.
The important observation is that we are not comparing the actual substrings character by character repeatedly for every possible pair. Instead, we are really working with positions where a occurs and positions where b occurs.
For example, if:
s = "abcxxabcxxabc"
a = "abc"
b = "xx"
then we first identify all positions where "abc" appears and all positions where "xx" appears. After that, the task becomes checking whether each a position has a nearby b position within distance k.
The constraints are important:
s.lengthcan be as large as10^5a.lengthandb.lengthare at most10
The small pattern lengths mean substring comparisons are cheap, but the large string size means we cannot afford quadratic behavior.
A naive implementation that compares every occurrence of a against every occurrence of b could become too slow in the worst case.
Some important edge cases include:
aandbmay be the same string- There may be overlapping matches
- One of the patterns may appear many times
- There may be no valid beautiful indices
kmay be very large, allowing almost every matchaorbmay have length1
The problem guarantees lowercase English letters only, so we do not need to handle Unicode or special character issues.
Approaches
Brute Force Approach
The brute-force strategy is straightforward.
First, scan the string and collect every index where a appears. Then collect every index where b appears.
After that, for every occurrence of a, check every occurrence of b. If at least one b occurrence satisfies:
|i - j| <= k
then i is beautiful.
This works because it explicitly tests the exact condition from the problem statement.
However, this approach can become inefficient. Suppose both a and b appear nearly n times. Then we may perform up to O(n^2) comparisons between positions.
With n = 10^5, quadratic behavior is too slow.
Optimal Approach
The key insight is that the occurrence positions are naturally sorted because we discover them while scanning the string from left to right.
Once we have:
- a sorted list of positions for
a - a sorted list of positions for
b
we can efficiently determine whether a nearby b exists for each a.
Instead of checking every pair, we use binary search.
For each index i where a appears:
- We binary search in the sorted
bpositions - We find the first
bposition that is greater than or equal toi - k - Then we only need to check whether that position is also less than or equal to
i + k
This reduces the search time dramatically.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Compare every a occurrence against every b occurrence |
| Optimal | O(n log n) | O(n) | Use binary search on sorted occurrence positions |
Algorithm Walkthrough
- Create a helper function that finds all occurrences of a pattern inside
s.
We scan every valid starting position and compare the substring with the target pattern. Since the pattern lengths are at most 10, this comparison is very cheap.
2. Use the helper function to collect all starting indices where a appears.
Store these indices in a list called a_positions.
3. Use the helper function again to collect all starting indices where b appears.
Store these indices in a list called b_positions.
4. Initialize an empty result list.
This list will store all beautiful indices.
5. For each index i in a_positions, search for a nearby b occurrence.
We want to know whether there exists some j in b_positions such that:
i - k <= j <= i + k
- Use binary search to find the first
bposition that is at leasti - k.
Since b_positions is sorted, Python's bisect_left or Go's sort.SearchInts can do this efficiently.
7. Check whether the found position is valid.
If the binary search result points to a position within bounds and that position is at most i + k, then a nearby b occurrence exists.
8. Add i to the result list.
Since we process a_positions in sorted order, the result automatically remains sorted.
9. Return the result list.
Why it works
The algorithm works because all occurrences of b are stored in sorted order. Binary search guarantees that we efficiently locate the smallest possible b index that could satisfy the distance constraint.
If this candidate is larger than i + k, then every later b occurrence will also be too large. If it is within the range, then a valid nearby occurrence exists.
Therefore, every beautiful index is correctly identified, and no valid index is missed.
Python Solution
from bisect import bisect_left
from typing import List
class Solution:
def beautifulIndices(self, s: str, a: str, b: str, k: int) -> List[int]:
def find_occurrences(pattern: str) -> List[int]:
positions = []
pattern_length = len(pattern)
for start in range(len(s) - pattern_length + 1):
if s[start:start + pattern_length] == pattern:
positions.append(start)
return positions
a_positions = find_occurrences(a)
b_positions = find_occurrences(b)
result = []
for index in a_positions:
left_bound = index - k
candidate_index = bisect_left(b_positions, left_bound)
if (
candidate_index < len(b_positions)
and b_positions[candidate_index] <= index + k
):
result.append(index)
return result
The implementation begins with a helper function named find_occurrences. This function scans every possible starting position and records where the pattern matches exactly.
After collecting all positions for a and b, the algorithm iterates through every occurrence of a.
For each index, we compute the smallest acceptable b position:
left_bound = index - k
Using bisect_left, we efficiently locate the first b occurrence that is not smaller than this boundary.
If the found occurrence also satisfies:
b_positions[candidate_index] <= index + k
then the distance requirement is satisfied, and the current a position is beautiful.
The result list is returned in sorted order because the occurrence lists are generated from left to right.
Go Solution
package main
import "sort"
func beautifulIndices(s string, a string, b string, k int) []int {
findOccurrences := func(pattern string) []int {
var positions []int
patternLength := len(pattern)
for start := 0; start <= len(s)-patternLength; start++ {
if s[start:start+patternLength] == pattern {
positions = append(positions, start)
}
}
return positions
}
aPositions := findOccurrences(a)
bPositions := findOccurrences(b)
var result []int
for _, index := range aPositions {
leftBound := index - k
candidateIndex := sort.SearchInts(bPositions, leftBound)
if candidateIndex < len(bPositions) &&
bPositions[candidateIndex] <= index+k {
result = append(result, index)
}
}
return result
}
The Go solution follows the same logic as the Python implementation.
The main difference is the use of sort.SearchInts instead of bisect_left. This function performs binary search on a sorted slice and returns the first position where the value could be inserted while maintaining order.
Go slices are dynamically resized during append, so we do not need manual memory management.
There are no integer overflow concerns because indices remain within the range of 10^5.
Worked Examples
Example 1
s = "isawsquirrelnearmysquirrelhouseohmy"
a = "my"
b = "squirrel"
k = 15
Step 1: Find occurrences of a
We scan for "my".
| Index | Substring | Match |
|---|---|---|
| 16 | "my" | Yes |
| 33 | "my" | Yes |
So:
a_positions = [16, 33]
Step 2: Find occurrences of b
We scan for "squirrel".
| Index | Substring | Match |
|---|---|---|
| 4 | "squirrel" | Yes |
| 18 | "squirrel" | Yes |
So:
b_positions = [4, 18]
Step 3: Check each a occurrence
Check index 16
Allowed range:
[16 - 15, 16 + 15] = [1, 31]
Binary search finds the first b position greater than or equal to 1.
That position is 4.
Since:
4 <= 31
index 16 is beautiful.
Check index 33
Allowed range:
[18, 48]
Binary search finds 18.
Since:
18 <= 48
index 33 is beautiful.
Final result:
[16, 33]
Example 2
s = "abcd"
a = "a"
b = "a"
k = 4
Step 1: Find occurrences
a_positions = [0]
b_positions = [0]
Step 2: Check index 0
Allowed range:
[-4, 4]
Binary search finds 0.
Since:
0 <= 4
index 0 is beautiful.
Final result:
[0]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Finding occurrences is linear, each binary search costs O(log n) |
| Space | O(n) | Occurrence lists may store up to O(n) indices |
The substring matching phase is effectively linear because a.length and b.length are at most 10, making each substring comparison constant time.
The binary search phase performs one O(log n) search for every occurrence of a. In the worst case, there may be O(n) occurrences, producing overall complexity of O(n log n).
The space usage comes from storing occurrence indices for both patterns.
Test Cases
from typing import List
class Solution:
def beautifulIndices(self, s: str, a: str, b: str, k: int) -> List[int]:
from bisect import bisect_left
def find_occurrences(pattern: str) -> List[int]:
positions = []
for i in range(len(s) - len(pattern) + 1):
if s[i:i + len(pattern)] == pattern:
positions.append(i)
return positions
a_positions = find_occurrences(a)
b_positions = find_occurrences(b)
result = []
for index in a_positions:
pos = bisect_left(b_positions, index - k)
if pos < len(b_positions) and b_positions[pos] <= index + k:
result.append(index)
return result
solution = Solution()
assert solution.beautifulIndices(
"isawsquirrelnearmysquirrelhouseohmy",
"my",
"squirrel",
15
) == [16, 33] # Provided example 1
assert solution.beautifulIndices(
"abcd",
"a",
"a",
4
) == [0] # Provided example 2
assert solution.beautifulIndices(
"aaaaa",
"aa",
"aa",
0
) == [0, 1, 2, 3] # Overlapping matches
assert solution.beautifulIndices(
"abcdef",
"ab",
"ef",
1
) == [] # No nearby occurrence
assert solution.beautifulIndices(
"abcabcabc",
"abc",
"abc",
100
) == [0, 3, 6] # Large k allows all matches
assert solution.beautifulIndices(
"aaaa",
"a",
"a",
1
) == [0, 1, 2, 3] # Single character patterns
assert solution.beautifulIndices(
"abababab",
"ab",
"ba",
1
) == [0, 2, 4, 6] # Alternating overlaps
assert solution.beautifulIndices(
"xyz",
"a",
"b",
2
) == [] # No occurrences at all
assert solution.beautifulIndices(
"aaaaab",
"b",
"aaaaa",
5
) == [5] # Match near beginning
assert solution.beautifulIndices(
"abcde",
"cd",
"ab",
2
) == [2] # Exact boundary distance
| Test | Why |
|---|---|
| Example 1 | Validates standard multi-match scenario |
| Example 2 | Validates same pattern case |
"aaaaa", "aa", "aa" |
Tests overlapping substring matches |
"abcdef" |
Tests no valid beautiful indices |
Large k case |
Ensures wide search range works |
| Single-character patterns | Tests smallest pattern lengths |
| Alternating overlaps | Verifies nearby matching logic |
| No occurrences | Ensures empty result handling |
| Match near start | Tests boundary handling |
| Exact distance boundary | Verifies <= k condition |
Edge Cases
One important edge case occurs when a and b are the same string. In this situation, every occurrence of a automatically has at least one matching b occurrence at the same index. A buggy implementation might accidentally exclude self-matching positions. This solution handles it naturally because binary search can return the same index.
Another tricky case involves overlapping matches. For example, in "aaaaa" with pattern "aa", valid matches occur at indices 0, 1, 2, and 3. Some implementations incorrectly skip overlapping substrings by advancing too far after a match. This implementation checks every possible starting index, so overlaps are preserved correctly.
A third edge case appears when no valid nearby b occurrence exists. Binary search may return an index equal to the length of b_positions, meaning there is no suitable candidate. The implementation explicitly checks bounds before accessing the array, preventing index errors and correctly excluding invalid indices.
A final edge case involves very large values of k. When k exceeds the distance between almost all positions, many indices become beautiful. The algorithm still works efficiently because binary search complexity does not depend on the size of k.