LeetCode 2730 - Find the Longest Semi-Repetitive Substring

The problem gives us a string s consisting only of digits from 0 to 9. We need to find the length of the longest substring that is considered semi-repetitive. A substring is semi-repetitive if it contains at most one pair of equal adjacent digits.

LeetCode Problem 2730

Difficulty: 🟡 Medium
Topics: String, Sliding Window

Solution

Problem Understanding

The problem gives us a string s consisting only of digits from 0 to 9. We need to find the length of the longest substring that is considered semi-repetitive.

A substring is semi-repetitive if it contains at most one pair of equal adjacent digits. An adjacent pair means two neighboring characters that are the same. For example:

  • "5223" contains exactly one adjacent equal pair, "22", so it is valid.
  • "5494" contains no adjacent equal pairs, so it is also valid.
  • "111" contains two adjacent equal pairs because positions (0,1) and (1,2) are both "11", so it is invalid.

The task is not asking us to modify the string. We only need to examine all possible substrings and return the maximum valid length.

The constraints are very small:

  • 1 <= s.length <= 50

This means even quadratic or cubic solutions are acceptable from a performance perspective. However, the problem is designed to test recognition of sliding window patterns, so an optimal linear-time solution is expected.

Several edge cases are important:

  • A string with no repeated adjacent digits at all, such as "12345", should return the entire length.
  • A string where every digit is the same, such as "111111", should return 2, because any substring of length 3 already contains two adjacent equal pairs.
  • Very short strings like length 1 should always return 1.
  • Overlapping adjacent pairs are counted separately. For example, "111" has two adjacent equal pairs, not one.

Understanding how overlapping pairs behave is critical for avoiding off-by-one mistakes.

Approaches

Brute Force Approach

The most direct solution is to generate every possible substring and check whether it is semi-repetitive.

For every starting index i, we try every ending index j, forming the substring s[i:j+1]. Then we scan that substring and count how many adjacent equal pairs it contains. If the count is at most one, we update the answer.

This approach is correct because it explicitly examines every possible substring and validates each one according to the problem definition.

However, this solution is inefficient because:

  • There are O(n^2) substrings.
  • Each substring validation may take O(n) time.

This leads to an overall complexity of O(n^3).

Even though n <= 50 makes this acceptable in practice, it is not the most elegant solution.

Optimal Sliding Window Approach

The key observation is that we only care about the number of adjacent equal pairs inside the current substring.

This immediately suggests a sliding window approach:

  • Expand the right side of the window one character at a time.
  • Track how many adjacent equal pairs exist inside the window.
  • If the count exceeds one, move the left side forward until the window becomes valid again.

The important insight is that when we extend the window by one character, only one new adjacent relationship is introduced:

s[right - 1] == s[right]

Similarly, when we move the left pointer forward, only one adjacent relationship may disappear:

s[left] == s[left + 1]

Because the validity condition depends only on local adjacent relationships, we can maintain the answer incrementally in linear time.

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(1) Checks every substring independently
Optimal O(n) O(1) Sliding window tracking adjacent equal pairs

Algorithm Walkthrough

  1. Initialize two pointers, left and right, to represent the current sliding window.
  2. Maintain a variable called pair_count, which stores how many adjacent equal digit pairs exist inside the current window.
  3. Start expanding the window by moving right from left to right across the string.
  4. Whenever we add a new character at position right, check whether:
s[right] == s[right - 1]

If true, we have introduced a new adjacent equal pair, so increment pair_count. 5. If pair_count becomes greater than 1, the window is no longer semi-repetitive. We must shrink it from the left. 6. While shrinking:

  • Check whether removing s[left] also removes an adjacent equal pair:
s[left] == s[left + 1]
  • If true, decrement pair_count.
  • Move left forward by one position.
  1. After restoring validity, the current window satisfies the problem condition again.
  2. Compute the current window length:
right - left + 1

Update the maximum answer if this window is larger. 9. Continue until the entire string has been processed.

Why it works

The sliding window always maintains the invariant that the current substring contains at most one adjacent equal pair.

When the window becomes invalid, we move the left pointer just enough to restore validity. Since every character enters and leaves the window at most once, the algorithm processes the string efficiently while guaranteeing that every valid candidate substring is considered.

Python Solution

class Solution:
    def longestSemiRepetitiveSubstring(self, s: str) -> int:
        left = 0
        pair_count = 0
        max_length = 1

        for right in range(1, len(s)):
            if s[right] == s[right - 1]:
                pair_count += 1

            while pair_count > 1:
                if s[left] == s[left + 1]:
                    pair_count -= 1
                left += 1

            max_length = max(max_length, right - left + 1)

        return max_length

The implementation uses the standard sliding window structure.

The variable left stores the beginning of the current valid substring. The loop variable right expands the window one character at a time.

Whenever the newly added character creates an adjacent equal pair, pair_count is incremented. This variable always tracks the number of repeated adjacent pairs inside the current window.

If the window becomes invalid because pair_count > 1, the inner while loop shrinks the window from the left side until the condition is restored.

The expression:

right - left + 1

computes the current valid substring length, and max_length stores the best result seen so far.

The algorithm uses only constant extra memory and scans the string once.

Go Solution

func longestSemiRepetitiveSubstring(s string) int {
    left := 0
    pairCount := 0
    maxLength := 1

    for right := 1; right < len(s); right++ {
        if s[right] == s[right-1] {
            pairCount++
        }

        for pairCount > 1 {
            if s[left] == s[left+1] {
                pairCount--
            }
            left++
        }

        currentLength := right - left + 1
        if currentLength > maxLength {
            maxLength = currentLength
        }
    }

    return maxLength
}

The Go implementation follows the same logic as the Python solution.

Since strings in Go are byte-indexable and the input contains only ASCII digits, direct indexing with s[i] is safe and efficient.

No special handling for integer overflow is needed because the maximum string length is only 50.

Worked Examples

Example 1

Input:

s = "52233"
Step right Character pair_count left Current Window Valid max_length
Start 0 5 0 0 "5" Yes 1
1 1 2 0 0 "52" Yes 2
2 2 2 1 0 "522" Yes 3
3 3 3 1 0 "5223" Yes 4
4 4 3 2 0 "52233" No 4

Now the window is invalid because it contains "22" and "33".

Shrink from the left:

  • Remove '5', pair count unchanged.
  • Remove first '2', this removes "22".

Now:

Step pair_count left Window
After shrinking 1 2 "233"

Final answer:

4

Example 2

Input:

s = "5494"
Step right pair_count Window max_length
0 0 0 "5" 1
1 1 0 "54" 2
2 2 0 "549" 3
3 3 0 "5494" 4

No adjacent equal pairs ever appear, so the whole string is valid.

Final answer:

4

Example 3

Input:

s = "1111111"
Step right pair_count Window Before Shrink Action
1 1 1 "11" Valid
2 2 2 "111" Shrink
After shrink 2 1 "11" Valid
3 3 2 "111" Shrink
After shrink 3 1 "11" Valid

This pattern repeats for the entire string.

The largest valid window length is always:

2

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character enters and leaves the sliding window at most once
Space O(1) Only a few integer variables are used

The algorithm is linear because both pointers move only forward. Even though there is a nested while loop, the total number of left-pointer movements across the entire execution is at most n.

The solution uses constant extra memory because it stores only counters and indices.

Test Cases

sol = Solution()

assert sol.longestSemiRepetitiveSubstring("52233") == 4  # Example with two separate pairs
assert sol.longestSemiRepetitiveSubstring("5494") == 4   # Entire string valid
assert sol.longestSemiRepetitiveSubstring("1111111") == 2  # All digits identical

assert sol.longestSemiRepetitiveSubstring("1") == 1  # Single character
assert sol.longestSemiRepetitiveSubstring("12") == 2  # No repeated pair
assert sol.longestSemiRepetitiveSubstring("11") == 2  # Exactly one pair

assert sol.longestSemiRepetitiveSubstring("112") == 3  # One pair at start
assert sol.longestSemiRepetitiveSubstring("122") == 3  # One pair at end
assert sol.longestSemiRepetitiveSubstring("1212") == 4  # No adjacent duplicates

assert sol.longestSemiRepetitiveSubstring("112233") == 4  # Multiple pairs
assert sol.longestSemiRepetitiveSubstring("1233445") == 5  # Pair near middle
assert sol.longestSemiRepetitiveSubstring("10001") == 5  # One repeated pair only

assert sol.longestSemiRepetitiveSubstring("001100") == 4  # Multiple repeated sections
assert sol.longestSemiRepetitiveSubstring("101010") == 6  # Alternating digits
assert sol.longestSemiRepetitiveSubstring("2221222") == 4  # Overlapping repeated pairs
Test Why
"52233" Verifies handling of multiple adjacent pairs
"5494" Confirms entire string can be valid
"1111111" Tests overlapping adjacent equal pairs
"1" Minimum input size
"12" No repeated digits
"11" Single valid adjacent pair
"112" Pair at beginning
"122" Pair at end
"1212" Alternating digits
"112233" Multiple separated repeated pairs
"1233445" Repeated pair in middle
"10001" One repeated pair surrounded by distinct digits
"001100" Multiple repeated regions
"101010" Entire alternating string valid
"2221222" Stress test for overlapping repetitions

Edge Cases

One important edge case is a string where every character is identical, such as "111111". This case is tricky because adjacent equal pairs overlap. A naive implementation might incorrectly count "111" as containing only one repeated pair, but it actually contains two: positions (0,1) and (1,2). The sliding window correctly handles this by incrementing pair_count for every adjacent match and shrinking whenever the count exceeds one.

Another important case is very short strings, especially length 1. Since a single character cannot contain any adjacent pair, the answer must always be 1. The implementation initializes max_length to 1, ensuring correct behavior even when the loop never executes.

A third important edge case is when the valid substring spans nearly the entire string except for one extra repeated pair, such as "52233". Incorrect shrinking logic may remove too many characters and miss the optimal answer. The implementation carefully removes only the minimum number of characters necessary to restore the invariant pair_count <= 1.

A final subtle case involves repeated pairs at the boundaries, such as "112345" or "123455". Boundary conditions often create off-by-one mistakes when checking adjacent indices. The solution safely checks adjacency only where valid indices exist and maintains correct window boundaries throughout the scan.