LeetCode 2904 - Shortest and Lexicographically Smallest Beautiful String

Difficulty: 🟡 Medium
Topics: String, Sliding Window

Solution

Problem Understanding

We are given a binary string s, meaning the string contains only the characters '0' and '1', along with a positive integer k. The goal is to find a substring of s that is considered beautiful, where a beautiful substring contains exactly k occurrences of '1'.

However, simply finding any beautiful substring is not enough. The problem introduces two additional requirements:

We must first minimize the length of the beautiful substring.
If there are multiple beautiful substrings with the same shortest length, we must return the lexicographically smallest one.

Lexicographical comparison works similarly to dictionary order. Since all candidate substrings will have the same length, we compare them character by character from left to right. The substring with a '0' at the first differing position is lexicographically smaller than one with a '1'.

For example:

"011" is lexicographically smaller than "101" because the first character differs and '0' < '1'.
"11001" is lexicographically smaller than "11100" because at the third character, '0' < '1'.

The constraints are relatively small:

1 <= s.length <= 100
1 <= k <= s.length

Because the string length is only up to 100, even quadratic or cubic approaches may be acceptable. However, we still want a clean and efficient solution.

There are several important edge cases to think about before designing the algorithm.

If the string does not contain enough '1' characters to ever reach exactly k, the answer must be an empty string. For example, s = "000" and k = 1.

Another tricky case occurs when multiple shortest beautiful substrings exist. We must compare them lexicographically and choose the smallest. For example, two candidate substrings may both have length 4, but one begins with '0', making it smaller.

We also need to handle cases where the shortest valid substring has length exactly k, which happens when the substring consists entirely of '1' characters, such as "11" for k = 2.

Approaches

Brute Force Approach

The brute force solution is straightforward. We generate every possible substring of s, count how many '1' characters it contains, and check whether it contains exactly k ones.

For every valid beautiful substring:

Compare its length against the shortest one found so far.
If it is shorter, update the answer.
If it has the same length, compare lexicographically and keep the smaller string.

Since there are O(n²) substrings, and counting ones inside each substring takes O(n) time, the total complexity becomes O(n³).

This works because n <= 100, but it performs unnecessary repeated work. Every substring repeatedly recounts the same characters.

Key Insight for an Optimal Solution

The important observation is that the condition depends only on the number of '1' characters, which naturally suggests a sliding window approach.

Instead of examining every substring independently, we maintain a window [left, right] and track how many '1' characters currently exist inside it.

As we expand the right pointer:

Add the new character into the window.
Track the number of ones.

Whenever the window contains exactly k ones, we attempt to shrink it from the left side as much as possible while preserving exactly k ones. This guarantees the current window is the shortest possible window ending at right.

Every valid minimal window becomes a candidate:

If its length is smaller than the best answer so far, update the answer.
If lengths are equal, compare lexicographically.

Because each pointer moves at most n times, the solution becomes linear.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n³)	O(1)	Enumerates all substrings and recounts ones repeatedly
Optimal	O(n²)	O(1)	Sliding window, substring extraction/comparison costs O(n)

Although the pointer movement itself is linear, substring slicing and lexicographical comparison can each take O(n) time. Since n <= 100, this remains efficient and practical.

Algorithm Walkthrough

Initialize two pointers, left and right, to represent the current sliding window. Also maintain a counter ones_count to track how many '1' characters exist inside the window.
Create variables to track the current best answer:

best_length, initialized to infinity.
best_substring, initialized to an empty string.

Move the right pointer from left to right through the string.
Whenever s[right] == '1', increment ones_count because a new '1' entered the window.
If the window contains more than k ones, move the left pointer forward until the window contains at most k ones again. Whenever a '1' leaves the window, decrement ones_count.
When ones_count == k, try to shrink the window as much as possible while preserving exactly k ones. Specifically, if the leftmost character is '0', move left forward because removing leading zeros does not change the count of ones and only shortens the substring.
After shrinking, the current window is the shortest beautiful substring ending at right.
Extract the candidate substring.
Compare it with the current best answer:

If it is shorter, replace the answer.
If it has the same length but is lexicographically smaller, replace the answer.

Continue scanning until the end of the string.
Return the stored best substring. If none was found, return an empty string.

Why it works

The sliding window maintains the invariant that the window never contains more than k ones after adjustment. Whenever the count becomes exactly k, shrinking away leading zeros guarantees the window is the shortest possible valid substring ending at the current position. Since every minimal candidate is considered, and we explicitly compare equal-length candidates lexicographically, the algorithm is guaranteed to return the shortest beautiful substring and, among ties, the lexicographically smallest one.

Python Solution

class Solution:
    def shortestBeautifulSubstring(self, s: str, k: int) -> str:
        left = 0
        ones_count = 0
        
        best_substring = ""
        best_length = float("inf")
        
        for right in range(len(s)):
            if s[right] == '1':
                ones_count += 1
            
            while ones_count > k:
                if s[left] == '1':
                    ones_count -= 1
                left += 1
            
            while ones_count == k and left <= right and s[left] == '0':
                left += 1
            
            if ones_count == k:
                candidate = s[left:right + 1]
                candidate_length = len(candidate)
                
                if candidate_length < best_length:
                    best_length = candidate_length
                    best_substring = candidate
                elif (
                    candidate_length == best_length
                    and candidate < best_substring
                ):
                    best_substring = candidate
        
        return best_substring

The implementation closely follows the sliding window strategy.

We begin by initializing the left boundary of the window and a counter for the number of ones. The right pointer iterates across the string.

Whenever a '1' enters the window, we increase ones_count. If the count exceeds k, we repeatedly move the left pointer forward until the constraint is restored.

Once the window contains exactly k ones, we greedily remove leading zeros. This is an important optimization because leading zeros increase the substring length without helping satisfy the requirement.

At this point, the window represents the shortest beautiful substring ending at the current right position. We extract it and compare it against the best answer found so far.

The comparison logic prioritizes shorter length first. If two candidates share the same length, we use Python's built-in lexicographical string comparison.

Finally, if no valid substring was found, the initial empty string remains unchanged and is returned.

Go Solution

func shortestBeautifulSubstring(s string, k int) string {
	left := 0
	onesCount := 0

	bestSubstring := ""
	bestLength := len(s) + 1

	for right := 0; right < len(s); right++ {
		if s[right] == '1' {
			onesCount++
		}

		for onesCount > k {
			if s[left] == '1' {
				onesCount--
			}
			left++
		}

		for onesCount == k && left <= right && s[left] == '0' {
			left++
		}

		if onesCount == k {
			candidate := s[left : right+1]
			candidateLength := len(candidate)

			if candidateLength < bestLength {
				bestLength = candidateLength
				bestSubstring = candidate
			} else if candidateLength == bestLength &&
				candidate < bestSubstring {
				bestSubstring = candidate
			}
		}
	}

	return bestSubstring
}

The Go implementation follows the same logic as the Python version. Since Go strings support lexicographical comparison using <, we can directly compare candidate substrings.

Instead of float("inf"), we initialize bestLength as len(s) + 1, which is guaranteed to be larger than any valid substring length.

Go strings are immutable and slicing uses index ranges, so extracting a candidate substring is efficient and concise.

Worked Examples

Example 1

Input:

s = "100011001", k = 3

Right	Character	Ones Count	Left	Window	Candidate
0	1	1	0	`"1"`	No
1	0	1	0	`"10"`	No
2	0	1	0	`"100"`	No
3	0	1	0	`"1000"`	No
4	1	2	0	`"10001"`	No
5	1	3	0	`"100011"`	`"100011"`
6	0	3	0	`"1000110"`	`"1000110"`
7	0	3	0	`"10001100"`	`"10001100"`
8	1	4	Shrink	`"11001"`	`"11001"`

At index 8, we exceed k, so we shrink from the left until only three ones remain. The resulting shortest valid substring becomes "11001", which has length 5, better than previous candidates.

Final answer:

"11001"

Example 2

Input:

s = "1011", k = 2

Right	Character	Ones Count	Left	Window	Candidate
0	1	1	0	`"1"`	No
1	0	1	0	`"10"`	No
2	1	2	0	`"101"`	`"101"`
3	1	3	Shrink	`"11"`	`"11"`

At the final step, shrinking removes extra characters and produces "11", which is shorter than "101".

Final answer:

"11"

Example 3

Input:

s = "000", k = 1

Right	Character	Ones Count	Candidate
0	0	0	No
1	0	0	No
2	0	0	No

The window never reaches one '1', so no beautiful substring exists.

Final answer:

""

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n²)	Window movement is O(n), substring extraction and comparison can cost O(n)
Space	O(1)	Only a few variables are maintained

Although the sliding window pointers each move at most n times, substring slicing and lexicographical comparisons may require scanning characters, making the practical worst-case complexity O(n²). Given the small constraint of n <= 100, this is highly efficient.

Test Cases

sol = Solution()

assert sol.shortestBeautifulSubstring("100011001", 3) == "11001"  # Example 1
assert sol.shortestBeautifulSubstring("1011", 2) == "11"  # Example 2
assert sol.shortestBeautifulSubstring("000", 1) == ""  # Example 3

assert sol.shortestBeautifulSubstring("1", 1) == "1"  # Single character valid
assert sol.shortestBeautifulSubstring("0", 1) == ""  # Single character invalid

assert sol.shortestBeautifulSubstring("1111", 2) == "11"  # Multiple shortest answers
assert sol.shortestBeautifulSubstring("10101", 2) == "101"  # Multiple candidates same size

assert sol.shortestBeautifulSubstring("0011100", 3) == "111"  # Exact compact group
assert sol.shortestBeautifulSubstring("1001", 1) == "1"  # Lexicographically smallest

assert sol.shortestBeautifulSubstring("010101", 2) == "101"  # Repeated patterns
assert sol.shortestBeautifulSubstring("1000001", 2) == "1000001"  # Entire string needed

assert sol.shortestBeautifulSubstring("110011", 2) == "11"  # Multiple equal shortest

Test	Why
`"100011001", 3`	Validates the main example
`"1011", 2`	Tests shrinking to shortest window
`"000", 1`	No beautiful substring exists
`"1", 1`	Smallest valid input
`"0", 1`	Smallest invalid input
`"1111", 2`	Many overlapping valid windows
`"10101", 2`	Multiple shortest candidates
`"0011100", 3`	Compact consecutive ones
`"1001", 1`	Lexicographical tie-breaking
`"010101", 2`	Alternating binary pattern
`"1000001", 2`	Whole string required
`"110011", 2`	Multiple equal shortest solutions

Edge Cases

No Valid Beautiful Substring

If the string does not contain at least k ones, no valid substring can exist. This can easily cause bugs if the algorithm assumes a solution always exists.

For example:

s = "000", k = 1

The implementation handles this naturally because ones_count never reaches k, so best_substring remains empty and is returned.

Multiple Shortest Candidates

A subtle issue occurs when several shortest beautiful substrings share the same length. Returning the first one found would be incorrect.

For example:

s = "1001", k = 1

Possible shortest substrings are "1" and "1", while larger candidates such as "001" also exist. The implementation explicitly compares equal-length candidates lexicographically to ensure the smallest answer is chosen.

Leading Zeros in the Window

Leading zeros are dangerous because they unnecessarily increase substring length without contributing toward the required number of ones.

For example:

s = "00111", k = 3

A naive implementation might return "00111" even though "111" is shorter.

The algorithm avoids this by aggressively shrinking away leading zeros whenever the window already contains exactly k ones, ensuring the minimal valid window is always considered.