LeetCode 3 - Longest Substring Without Repeating Characters
The problem gives a string s and asks for the length of the longest substring that contains no repeated characters. A substring is a continuous section of the string. This detail matters because characters must remain adjacent.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Sliding Window
Solution
Longest Substring Without Repeating Characters
Problem Understanding
The problem gives a string s and asks for the length of the longest substring that contains no repeated characters.
A substring is a continuous section of the string. This detail matters because characters must remain adjacent. For example, in "pwwkew", the sequence "pwke" is not valid because those characters are not contiguous in the original string.
The input is a single string that may contain letters, digits, symbols, and spaces. The output is a single integer representing the maximum length of any substring where every character appears at most once.
The constraints tell us that the string length can be as large as 5 * 10^4. A quadratic solution may become too slow at this scale. For example, an O(n^2) solution would potentially perform billions of operations in the worst case. This strongly suggests that we need a linear or near linear solution.
Several edge cases are important to think about early:
- The string may be empty. In that case, the answer should be
0. - The string may contain only one repeated character, such as
"bbbb". The answer is1. - The longest valid substring may appear in the middle of the string rather than at the beginning.
- Duplicate characters may appear very close together or very far apart.
- The string may contain spaces or symbols, which should be treated exactly like normal characters.
The key challenge is efficiently maintaining a substring that contains only unique characters while scanning through the string.
Approaches
Brute Force Approach
A straightforward solution is to generate every possible substring and check whether it contains duplicate characters.
For each starting index, we can extend the substring character by character while storing seen characters in a set. If we encounter a duplicate, we stop extending that substring and move to the next starting position.
This works because it explicitly checks all possible substrings and records the maximum valid length.
However, this approach is too slow for large inputs. There are O(n^2) possible substrings, and checking uniqueness may also require additional work. Even with optimizations, the total runtime becomes quadratic.
For a string length of 50,000, a quadratic solution is not practical.
Optimal Approach, Sliding Window
The key observation is that we do not need to restart from scratch every time we encounter a duplicate character.
Suppose we already know that a substring is valid. When we extend it by one character, there are only two possibilities:
- The new character is not already inside the current substring, so the window remains valid.
- The new character already exists in the current substring, so we must shrink the window from the left until the duplicate is removed.
This leads naturally to the sliding window technique.
We maintain two pointers:
left, the start of the current windowright, the end of the current window
We also maintain a hash set or hash map to track which characters are currently inside the window.
As we move right forward through the string, we expand the window. Whenever a duplicate appears, we move left forward until the window becomes valid again.
Each character is added and removed at most once, which gives a linear time solution.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(min(n, charset)) | Checks every possible substring |
| Optimal | O(n) | O(min(n, charset)) | Sliding window with hash set |
Algorithm Walkthrough
- Initialize two pointers,
leftandright, both starting at0. These pointers define the current sliding window. - Create a hash set called
seen_characters. This set stores all characters currently inside the window. A hash set is used because membership checks, insertions, and removals are allO(1)on average. - Initialize a variable
max_lengthto0. This variable tracks the longest valid substring found so far. - Iterate through the string using the
rightpointer. - For each character at position
right, check whether it already exists inseen_characters. - If the character already exists, the current window is invalid because it contains duplicates. Move the
leftpointer forward while removing characters from the set until the duplicate character is removed. - Once the duplicate is removed, add the current character to the set.
- Compute the current window length using:
$\text{window length} = right - left + 1$
9. Update max_length if the current window is larger than the previous best answer.
10. Continue until the right pointer reaches the end of the string.
Why it works
At every step, the sliding window maintains an important invariant:
The substring between
leftandrightalways contains unique characters.
Whenever a duplicate appears, the algorithm shrinks the window just enough to restore validity. Because both pointers only move forward, every character is processed at most twice, once when entering the window and once when leaving it. This guarantees both correctness and linear runtime.
Python Solution
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
seen_characters = set()
left = 0
max_length = 0
for right in range(len(s)):
while s[right] in seen_characters:
seen_characters.remove(s[left])
left += 1
seen_characters.add(s[right])
current_length = right - left + 1
max_length = max(max_length, current_length)
return max_length
The implementation follows the sliding window algorithm directly.
The seen_characters set tracks which characters currently exist inside the active window. The left pointer marks the beginning of the substring, while the loop variable right expands the window one character at a time.
Whenever the character at right already exists inside the set, the while loop removes characters from the left side until the duplicate disappears. This guarantees that the window always contains unique characters.
After the window becomes valid, the current character is inserted into the set. The window length is then computed and compared against max_length.
The solution handles all edge cases naturally. If the string is empty, the loop never executes and the function correctly returns 0.
Go Solution
func lengthOfLongestSubstring(s string) int {
seenCharacters := make(map[byte]bool)
left := 0
maxLength := 0
for right := 0; right < len(s); right++ {
for seenCharacters[s[right]] {
delete(seenCharacters, s[left])
left++
}
seenCharacters[s[right]] = true
currentLength := right - left + 1
if currentLength > maxLength {
maxLength = currentLength
}
}
return maxLength
}
The Go implementation uses a map[byte]bool instead of a Python set. The logic remains the same.
Go strings are byte slices, so indexing returns a byte. Since the problem uses standard ASCII style characters, byte based indexing works correctly here.
The delete function removes characters from the map when the left side of the window moves forward.
Unlike Python, Go does not have a built in max function for integers, so the comparison is written explicitly using an if statement.
Worked Examples
Example 1
Input:
s = "abcabcbb"
| Step | right | Character | left | Window | Seen Characters | max_length |
|---|---|---|---|---|---|---|
| 1 | 0 | a | 0 | "a" | {a} | 1 |
| 2 | 1 | b | 0 | "ab" | {a, b} | 2 |
| 3 | 2 | c | 0 | "abc" | {a, b, c} | 3 |
| 4 | 3 | a | 1 | "bca" | {b, c, a} | 3 |
| 5 | 4 | b | 2 | "cab" | {c, a, b} | 3 |
| 6 | 5 | c | 3 | "abc" | {a, b, c} | 3 |
| 7 | 6 | b | 5 | "cb" | {c, b} | 3 |
| 8 | 7 | b | 7 | "b" | {b} | 3 |
Final answer: 3
Example 2
Input:
s = "bbbbb"
| Step | right | Character | left | Window | Seen Characters | max_length |
|---|---|---|---|---|---|---|
| 1 | 0 | b | 0 | "b" | {b} | 1 |
| 2 | 1 | b | 1 | "b" | {b} | 1 |
| 3 | 2 | b | 2 | "b" | {b} | 1 |
| 4 | 3 | b | 3 | "b" | {b} | 1 |
| 5 | 4 | b | 4 | "b" | {b} | 1 |
Final answer: 1
Example 3
Input:
s = "pwwkew"
| Step | right | Character | left | Window | Seen Characters | max_length |
|---|---|---|---|---|---|---|
| 1 | 0 | p | 0 | "p" | {p} | 1 |
| 2 | 1 | w | 0 | "pw" | {p, w} | 2 |
| 3 | 2 | w | 2 | "w" | {w} | 2 |
| 4 | 3 | k | 2 | "wk" | {w, k} | 2 |
| 5 | 4 | e | 2 | "wke" | {w, k, e} | 3 |
| 6 | 5 | w | 3 | "kew" | {k, e, w} | 3 |
Final answer: 3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character enters and leaves the window at most once |
| Space | O(min(n, charset)) | The hash set stores unique characters currently in the window |
The runtime is linear because both pointers move only forward through the string. Even though there is a nested while loop, the total number of left pointer movements across the entire algorithm is at most n.
The space complexity depends on how many unique characters can exist simultaneously inside the sliding window. In the worst case, all characters are distinct.
Test Cases
solution = Solution()
assert solution.lengthOfLongestSubstring("") == 0
# Empty string
assert solution.lengthOfLongestSubstring("a") == 1
# Single character
assert solution.lengthOfLongestSubstring("bbbbb") == 1
# All characters identical
assert solution.lengthOfLongestSubstring("abcabcbb") == 3
# Repeating pattern
assert solution.lengthOfLongestSubstring("pwwkew") == 3
# Duplicate inside sliding window
assert solution.lengthOfLongestSubstring("abcdef") == 6
# All unique characters
assert solution.lengthOfLongestSubstring("abba") == 2
# Duplicate forces multiple left shifts
assert solution.lengthOfLongestSubstring("dvdf") == 3
# Window shrinks and expands correctly
assert solution.lengthOfLongestSubstring(" ") == 1
# Single space character
assert solution.lengthOfLongestSubstring("aab") == 2
# Duplicate at beginning
assert solution.lengthOfLongestSubstring("tmmzuxt") == 5
# Longest substring appears near the end
assert solution.lengthOfLongestSubstring("anviaj") == 5
# Complex overlapping duplicates
assert solution.lengthOfLongestSubstring("abcdefghijklmnopqrstuvwxyz") == 26
# Large unique substring
| Test | Why |
|---|---|
"" |
Validates empty input handling |
"a" |
Smallest non empty string |
"bbbbb" |
All characters repeated |
"abcabcbb" |
Standard repeating pattern |
"pwwkew" |
Validates true substring handling |
"abcdef" |
Entire string is valid |
"abba" |
Tests repeated character inside active window |
"dvdf" |
Ensures shrinking logic works correctly |
" " |
Confirms spaces are treated as characters |
"aab" |
Duplicate at the beginning |
"tmmzuxt" |
Longest substring appears later |
"anviaj" |
Overlapping duplicate regions |
"abcdefghijklmnopqrstuvwxyz" |
Large fully unique substring |
Edge Cases
Empty String
An empty string is an important boundary case because there are no substrings at all. A buggy implementation might incorrectly return 1 or attempt invalid indexing operations.
This implementation handles the case naturally. The loop never executes, max_length remains 0, and the correct answer is returned.
All Characters Repeated
A string like "bbbbbb" can expose bugs in window shrinking logic. Some incorrect implementations fail to remove duplicates properly and may incorrectly report larger window sizes.
Here, the while loop guarantees that duplicates are removed before the current character is reinserted. The window size never exceeds 1.
Duplicate Appears Inside Current Window
Cases such as "abba" are tricky because removing only one character is not always enough. When processing the second 'b', the window must continue shrinking until the earlier 'b' is fully removed.
The implementation correctly uses a while loop instead of a single if statement. This ensures the window is fully cleaned before continuing.
Long Unique Segment Near the End
In inputs such as "tmmzuxt", the longest substring appears late in the string rather than near the beginning.
The algorithm continuously updates max_length throughout the scan, so it correctly captures longer windows regardless of where they occur.