LeetCode 686 - Repeated String Match

The problem asks us to determine the minimum number of times a string a must be repeated such that another string b becomes a substring of the repeated version of a. A substring is a consecutive sequence of characters within a string.

LeetCode Problem 686

Difficulty: 🟡 Medium
Topics: String, String Matching

Solution

Problem Understanding

The problem asks us to determine the minimum number of times a string a must be repeated such that another string b becomes a substring of the repeated version of a. A substring is a consecutive sequence of characters within a string. The input consists of two strings, a and b, both of length between 1 and 10,000 characters. The output is an integer representing the minimum number of repetitions, or -1 if no number of repetitions will make b a substring of repeated a.

In essence, we are repeatedly concatenating a with itself and checking whether b appears anywhere in the resulting string. The problem requires efficiency because a naive approach of generating every possible repetition could be prohibitively slow for long strings.

Important edge cases include scenarios where b is longer than a, where b overlaps across the boundary between two repetitions of a, and cases where b contains characters not present in a at all. The constraints guarantee that a and b are non-empty and composed of lowercase English letters, but they do not guarantee that b will ever appear in repeated a.

Approaches

The brute-force approach is straightforward: start with an empty string, keep concatenating a, and after each concatenation, check if b is a substring. This approach is guaranteed to find the correct answer because we are exhaustively checking all possible repetitions. However, its time complexity is high because string concatenation and substring checking can become expensive for large input sizes, especially when b is long.

The key insight for a more optimal solution is that b can only start overlapping with a at most once across a repetition boundary. Therefore, we do not need to check an unbounded number of repetitions. In practice, repeating a enough times to exceed the length of b plus one extra repetition is sufficient to determine whether b can be a substring. This is because b could begin near the end of one repetition and extend into the next, but it cannot extend beyond two repetitions beyond its length.

Approach Time Complexity Space Complexity Notes
Brute Force O(n * m * k) O(n * k) Concatenate a repeatedly and check substring, slow for large strings
Optimal O(n + m) O(n + m) Repeat a minimally to exceed b length and check substring, leveraging Python/Go substring search

Algorithm Walkthrough

  1. Compute the minimum number of repetitions needed so that the length of the repeated string is at least as long as b. This is calculated as ceil(len(b) / len(a)).
  2. Repeat a that many times to form an initial candidate string.
  3. Check if b is a substring of the candidate. If yes, return the current repetition count.
  4. If b is not found, append one more repetition of a to handle the case where b overlaps the boundary between the last repetition and the new repetition.
  5. Check again if b is now a substring. If yes, return the updated repetition count.
  6. If b is still not found, return -1 because no repetition will contain b.

Why it works: This algorithm works because b can only span across two consecutive repetitions of a. By repeating a just enough to cover b and one additional repetition, we ensure all possible positions of b are checked without unnecessary concatenation, making the algorithm efficient.

Python Solution

class Solution:
    def repeatedStringMatch(self, a: str, b: str) -> int:
        from math import ceil
        
        # Calculate minimum repetitions needed
        min_repeats = ceil(len(b) / len(a))
        repeated_a = a * min_repeats
        
        # Check if b is a substring
        if b in repeated_a:
            return min_repeats
        
        # Check one more repetition for overlap
        repeated_a += a
        if b in repeated_a:
            return min_repeats + 1
        
        return -1

The Python solution starts by calculating the minimal number of repetitions needed to cover b. It first checks if b is already a substring after these repetitions. If not, it appends one more repetition to handle overlapping cases and checks again. If b is still absent, it returns -1.

Go Solution

func repeatedStringMatch(a string, b string) int {
    minRepeats := (len(b) + len(a) - 1) / len(a) // ceil(len(b)/len(a))
    repeatedA := ""
    for i := 0; i < minRepeats; i++ {
        repeatedA += a
    }
    
    if strings.Contains(repeatedA, b) {
        return minRepeats
    }
    
    repeatedA += a
    if strings.Contains(repeatedA, b) {
        return minRepeats + 1
    }
    
    return -1
}

The Go implementation mirrors Python closely. The primary difference is the integer ceiling calculation using (len(b) + len(a) - 1) / len(a), and strings.Contains is used for substring checks. Go handles strings as immutable sequences of bytes, so repeated concatenation is safe here.

Worked Examples

Example 1: a = "abcd", b = "cdabcdab"

  1. min_repeats = ceil(8 / 4) = 2
  2. repeated_a = "abcd" * 2 = "abcdabcd"
  3. Check substring: "cdabcdab" not in "abcdabcd"
  4. Add one more repetition: "abcdabcdabcd"
  5. Check substring: "cdabcdab" in "abcdabcdabcd" → return 3

Example 2: a = "a", b = "aa"

  1. min_repeats = ceil(2 / 1) = 2
  2. repeated_a = "aa"
  3. Check substring: "aa" in "aa" → return 2

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) n is length of a, m is length of b. Repeating a and checking substring is linear in total length of repeated string.
Space O(n + m) Space needed for repeated string, which is at most len(b) + 2*len(a)

The reasoning is that the algorithm does minimal concatenation and only checks substring twice. This is efficient for the input constraints.

Test Cases

# Basic examples
assert Solution().repeatedStringMatch("abcd", "cdabcdab") == 3  # Example 1
assert Solution().repeatedStringMatch("a", "aa") == 2           # Example 2

# Edge cases
assert Solution().repeatedStringMatch("abc", "xyz") == -1       # Characters not in `a`
assert Solution().repeatedStringMatch("abc", "cabcabca") == 4   # Overlap multiple times
assert Solution().repeatedStringMatch("a", "a") == 1            # Single character match
assert Solution().repeatedStringMatch("ab", "bab") == 2         # Overlap at start
assert Solution().repeatedStringMatch("ab", "baba") == 3        # Overlap requires 3 repeats
Test Why
"abcd", "cdabcdab" Validates overlap across repetitions
"a", "aa" Single character repeated
"abc", "xyz" b not in a at all
"abc", "cabcabca" Overlap with multiple repeats
"a", "a" Minimal length strings
"ab", "bab" Overlap starting at second character
"ab", "baba" Requires more than minimal repeats due to overlap

Edge Cases

Case 1: b not present in a at all. If b contains characters not in a, any repetition will fail. Our implementation handles this naturally by returning -1 after checking the repeated strings.

Case 2: b overlaps the boundary of repetitions. For example, a = "ab" and b = "bab". A naive approach might stop after two repetitions, missing the overlap. By appending one extra repetition, our solution correctly detects b.

Case 3: Minimal strings. When a and b are both single characters, the algorithm still works correctly, returning 1 if they match or -1 if not. The solution does not assume strings are longer than 1 character and handles the smallest possible input sizes.