LeetCode 3 - Longest Substring Without Repeating Characters

The problem gives a string s and asks for the length of the longest substring that contains no repeated characters. A substring is a continuous section of the string. This detail matters because characters must remain adjacent.

LeetCode Problem 3

Difficulty: 🟡 Medium
Topics: Hash Table, String, Sliding Window

Solution

Longest Substring Without Repeating Characters

Problem Understanding

The problem gives a string s and asks for the length of the longest substring that contains no repeated characters.

A substring is a continuous section of the string. This detail matters because characters must remain adjacent. For example, in "pwwkew", the sequence "pwke" is not valid because those characters are not contiguous in the original string.

The input is a single string that may contain letters, digits, symbols, and spaces. The output is a single integer representing the maximum length of any substring where every character appears at most once.

The constraints tell us that the string length can be as large as 5 * 10^4. A quadratic solution may become too slow at this scale. For example, an O(n^2) solution would potentially perform billions of operations in the worst case. This strongly suggests that we need a linear or near linear solution.

Several edge cases are important to think about early:

  • The string may be empty. In that case, the answer should be 0.
  • The string may contain only one repeated character, such as "bbbb". The answer is 1.
  • The longest valid substring may appear in the middle of the string rather than at the beginning.
  • Duplicate characters may appear very close together or very far apart.
  • The string may contain spaces or symbols, which should be treated exactly like normal characters.

The key challenge is efficiently maintaining a substring that contains only unique characters while scanning through the string.

Approaches

Brute Force Approach

A straightforward solution is to generate every possible substring and check whether it contains duplicate characters.

For each starting index, we can extend the substring character by character while storing seen characters in a set. If we encounter a duplicate, we stop extending that substring and move to the next starting position.

This works because it explicitly checks all possible substrings and records the maximum valid length.

However, this approach is too slow for large inputs. There are O(n^2) possible substrings, and checking uniqueness may also require additional work. Even with optimizations, the total runtime becomes quadratic.

For a string length of 50,000, a quadratic solution is not practical.

Optimal Approach, Sliding Window

The key observation is that we do not need to restart from scratch every time we encounter a duplicate character.

Suppose we already know that a substring is valid. When we extend it by one character, there are only two possibilities:

  • The new character is not already inside the current substring, so the window remains valid.
  • The new character already exists in the current substring, so we must shrink the window from the left until the duplicate is removed.

This leads naturally to the sliding window technique.

We maintain two pointers:

  • left, the start of the current window
  • right, the end of the current window

We also maintain a hash set or hash map to track which characters are currently inside the window.

As we move right forward through the string, we expand the window. Whenever a duplicate appears, we move left forward until the window becomes valid again.

Each character is added and removed at most once, which gives a linear time solution.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(min(n, charset)) Checks every possible substring
Optimal O(n) O(min(n, charset)) Sliding window with hash set

Algorithm Walkthrough

  1. Initialize two pointers, left and right, both starting at 0. These pointers define the current sliding window.
  2. Create a hash set called seen_characters. This set stores all characters currently inside the window. A hash set is used because membership checks, insertions, and removals are all O(1) on average.
  3. Initialize a variable max_length to 0. This variable tracks the longest valid substring found so far.
  4. Iterate through the string using the right pointer.
  5. For each character at position right, check whether it already exists in seen_characters.
  6. If the character already exists, the current window is invalid because it contains duplicates. Move the left pointer forward while removing characters from the set until the duplicate character is removed.
  7. Once the duplicate is removed, add the current character to the set.
  8. Compute the current window length using:

$\text{window length} = right - left + 1$ 9. Update max_length if the current window is larger than the previous best answer. 10. Continue until the right pointer reaches the end of the string.

Why it works

At every step, the sliding window maintains an important invariant:

The substring between left and right always contains unique characters.

Whenever a duplicate appears, the algorithm shrinks the window just enough to restore validity. Because both pointers only move forward, every character is processed at most twice, once when entering the window and once when leaving it. This guarantees both correctness and linear runtime.

Python Solution

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        seen_characters = set()
        left = 0
        max_length = 0

        for right in range(len(s)):
            while s[right] in seen_characters:
                seen_characters.remove(s[left])
                left += 1

            seen_characters.add(s[right])

            current_length = right - left + 1
            max_length = max(max_length, current_length)

        return max_length

The implementation follows the sliding window algorithm directly.

The seen_characters set tracks which characters currently exist inside the active window. The left pointer marks the beginning of the substring, while the loop variable right expands the window one character at a time.

Whenever the character at right already exists inside the set, the while loop removes characters from the left side until the duplicate disappears. This guarantees that the window always contains unique characters.

After the window becomes valid, the current character is inserted into the set. The window length is then computed and compared against max_length.

The solution handles all edge cases naturally. If the string is empty, the loop never executes and the function correctly returns 0.

Go Solution

func lengthOfLongestSubstring(s string) int {
    seenCharacters := make(map[byte]bool)

    left := 0
    maxLength := 0

    for right := 0; right < len(s); right++ {
        for seenCharacters[s[right]] {
            delete(seenCharacters, s[left])
            left++
        }

        seenCharacters[s[right]] = true

        currentLength := right - left + 1

        if currentLength > maxLength {
            maxLength = currentLength
        }
    }

    return maxLength
}

The Go implementation uses a map[byte]bool instead of a Python set. The logic remains the same.

Go strings are byte slices, so indexing returns a byte. Since the problem uses standard ASCII style characters, byte based indexing works correctly here.

The delete function removes characters from the map when the left side of the window moves forward.

Unlike Python, Go does not have a built in max function for integers, so the comparison is written explicitly using an if statement.

Worked Examples

Example 1

Input:

s = "abcabcbb"
Step right Character left Window Seen Characters max_length
1 0 a 0 "a" {a} 1
2 1 b 0 "ab" {a, b} 2
3 2 c 0 "abc" {a, b, c} 3
4 3 a 1 "bca" {b, c, a} 3
5 4 b 2 "cab" {c, a, b} 3
6 5 c 3 "abc" {a, b, c} 3
7 6 b 5 "cb" {c, b} 3
8 7 b 7 "b" {b} 3

Final answer: 3

Example 2

Input:

s = "bbbbb"
Step right Character left Window Seen Characters max_length
1 0 b 0 "b" {b} 1
2 1 b 1 "b" {b} 1
3 2 b 2 "b" {b} 1
4 3 b 3 "b" {b} 1
5 4 b 4 "b" {b} 1

Final answer: 1

Example 3

Input:

s = "pwwkew"
Step right Character left Window Seen Characters max_length
1 0 p 0 "p" {p} 1
2 1 w 0 "pw" {p, w} 2
3 2 w 2 "w" {w} 2
4 3 k 2 "wk" {w, k} 2
5 4 e 2 "wke" {w, k, e} 3
6 5 w 3 "kew" {k, e, w} 3

Final answer: 3

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each character enters and leaves the window at most once
Space O(min(n, charset)) The hash set stores unique characters currently in the window

The runtime is linear because both pointers move only forward through the string. Even though there is a nested while loop, the total number of left pointer movements across the entire algorithm is at most n.

The space complexity depends on how many unique characters can exist simultaneously inside the sliding window. In the worst case, all characters are distinct.

Test Cases

solution = Solution()

assert solution.lengthOfLongestSubstring("") == 0
# Empty string

assert solution.lengthOfLongestSubstring("a") == 1
# Single character

assert solution.lengthOfLongestSubstring("bbbbb") == 1
# All characters identical

assert solution.lengthOfLongestSubstring("abcabcbb") == 3
# Repeating pattern

assert solution.lengthOfLongestSubstring("pwwkew") == 3
# Duplicate inside sliding window

assert solution.lengthOfLongestSubstring("abcdef") == 6
# All unique characters

assert solution.lengthOfLongestSubstring("abba") == 2
# Duplicate forces multiple left shifts

assert solution.lengthOfLongestSubstring("dvdf") == 3
# Window shrinks and expands correctly

assert solution.lengthOfLongestSubstring(" ") == 1
# Single space character

assert solution.lengthOfLongestSubstring("aab") == 2
# Duplicate at beginning

assert solution.lengthOfLongestSubstring("tmmzuxt") == 5
# Longest substring appears near the end

assert solution.lengthOfLongestSubstring("anviaj") == 5
# Complex overlapping duplicates

assert solution.lengthOfLongestSubstring("abcdefghijklmnopqrstuvwxyz") == 26
# Large unique substring
Test Why
"" Validates empty input handling
"a" Smallest non empty string
"bbbbb" All characters repeated
"abcabcbb" Standard repeating pattern
"pwwkew" Validates true substring handling
"abcdef" Entire string is valid
"abba" Tests repeated character inside active window
"dvdf" Ensures shrinking logic works correctly
" " Confirms spaces are treated as characters
"aab" Duplicate at the beginning
"tmmzuxt" Longest substring appears later
"anviaj" Overlapping duplicate regions
"abcdefghijklmnopqrstuvwxyz" Large fully unique substring

Edge Cases

Empty String

An empty string is an important boundary case because there are no substrings at all. A buggy implementation might incorrectly return 1 or attempt invalid indexing operations.

This implementation handles the case naturally. The loop never executes, max_length remains 0, and the correct answer is returned.

All Characters Repeated

A string like "bbbbbb" can expose bugs in window shrinking logic. Some incorrect implementations fail to remove duplicates properly and may incorrectly report larger window sizes.

Here, the while loop guarantees that duplicates are removed before the current character is reinserted. The window size never exceeds 1.

Duplicate Appears Inside Current Window

Cases such as "abba" are tricky because removing only one character is not always enough. When processing the second 'b', the window must continue shrinking until the earlier 'b' is fully removed.

The implementation correctly uses a while loop instead of a single if statement. This ensures the window is fully cleaned before continuing.

Long Unique Segment Near the End

In inputs such as "tmmzuxt", the longest substring appears late in the string rather than near the beginning.

The algorithm continuously updates max_length throughout the scan, so it correctly captures longer windows regardless of where they occur.