LeetCode 128 - Longest Consecutive Sequence
The problem asks us to find the length of the longest sequence of consecutive integers in an unsorted array. A consecutive sequence means numbers that appear one after another numerically, regardless of their position in the array.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, Union-Find
Solution
Problem Understanding
The problem asks us to find the length of the longest sequence of consecutive integers in an unsorted array. A consecutive sequence means numbers that appear one after another numerically, regardless of their position in the array. The numbers do not need to be adjacent in the input, and the array itself is not sorted.
In simpler terms, we want to determine the largest chain of integers where every number differs from the next by exactly 1. For example, if the array contains [1, 2, 3, 4], the sequence length is 4. If the numbers are scattered, such as [100, 4, 200, 1, 3, 2], we still recognize that [1, 2, 3, 4] forms a consecutive sequence of length 4.
The input is an integer array nums, which may contain duplicate values and may include negative numbers. The output should be a single integer representing the maximum length of any consecutive sequence found in the array.
The constraints are particularly important here:
0 <= nums.length <= 10^5-10^9 <= nums[i] <= 10^9
The array can contain up to 100,000 elements, meaning inefficient approaches such as nested loops or repeated sorting-based searches may become too slow. The problem explicitly requires an O(n) time solution, which immediately rules out many traditional sorting approaches with O(n log n) complexity.
Several edge cases are important to recognize early:
An empty array should return 0 because there are no sequences.
A single-element array should return 1, because any number by itself forms a consecutive sequence of length 1.
Duplicate values can appear and should not incorrectly increase sequence length. For example, [1,0,1,2] contains duplicate 1s, but the longest sequence is still [0,1,2] with length 3.
Negative integers must be handled correctly. For example, [-2,-1,0,1] forms a valid consecutive sequence of length 4.
Very large gaps between numbers should not matter. Since the array is unsorted, consecutive relationships depend only on numeric adjacency, not index positions.
Approaches
Brute Force Approach
A straightforward idea is to treat every number as a potential starting point and repeatedly check whether the next consecutive number exists.
For each number x, we would search for x + 1, then x + 2, and continue until the sequence breaks. Since the array is unsorted, each lookup requires scanning the entire array to determine whether a value exists.
For example, with [100,4,200,1,3,2], starting from 1 would require checking if 2 exists, then 3, then 4, and so on. Each lookup is expensive because it involves iterating through the array.
This approach is correct because it eventually explores every possible sequence, but it is inefficient. In the worst case, every lookup costs O(n) and happens repeatedly for many elements, resulting in O(n²) time complexity.
Key Insight for an Optimal Solution
The main observation is that we only need to start counting a sequence when we are at its beginning.
Suppose we have a sequence [1,2,3,4]. If we start expanding from every element, we repeat unnecessary work:
- Starting at
1discovers the whole sequence. - Starting at
2repeats most of the work. - Starting at
3repeats again.
Instead, we should only begin sequence expansion from numbers that do not have a predecessor.
For example:
1is a valid starting point because0does not exist.2is not a starting point because1exists.3is not a starting point because2exists.
To make existence checks efficient, we store all numbers in a hash set. A hash set provides average O(1) lookup time, allowing us to quickly determine whether a number exists.
This gives us an O(n) algorithm because each number participates in sequence expansion at most once.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Repeatedly scans the array to find consecutive values |
| Optimal | O(n) | O(n) | Uses a hash set and only expands from sequence starts |
Algorithm Walkthrough
Optimal Hash Set Approach
- Create a hash set containing every number in the input array.
We use a hash set because membership checks such as x in set take average O(1) time. This allows us to efficiently determine whether consecutive numbers exist.
2. Initialize a variable longest_streak = 0.
This variable keeps track of the maximum sequence length found so far. 3. Iterate through every number in the hash set.
We iterate over the set rather than the original array to avoid redundant processing caused by duplicates. 4. Check whether the current number is the start of a sequence.
A number x is a starting point only if x - 1 is not present in the set.
This step is critical because it avoids reprocessing sequences multiple times. 5. If the number is a sequence start, begin counting consecutive numbers.
Initialize:
current_number = xcurrent_streak = 1
Then repeatedly check whether current_number + 1 exists in the set.
6. Expand the sequence while possible.
As long as the next consecutive number exists:
- Increment
current_number - Increment
current_streak
This continues until the sequence breaks. 7. Update the maximum sequence length.
Compare current_streak with longest_streak and keep the larger value.
8. Return longest_streak after processing all numbers.
Why it works
The key invariant is that every consecutive sequence has exactly one valid starting point, namely the smallest number in that sequence. By only expanding sequences from numbers without predecessors, we guarantee that each sequence is explored exactly once.
Since every number is visited at most once during expansion, no unnecessary repeated work occurs. This ensures both correctness and linear time complexity.
Python Solution
from typing import List
class Solution:
def longestConsecutive(self, nums: List[int]) -> int:
number_set = set(nums)
longest_streak = 0
for number in number_set:
# Only start a sequence if this is the beginning
if number - 1 not in number_set:
current_number = number
current_streak = 1
while current_number + 1 in number_set:
current_number += 1
current_streak += 1
longest_streak = max(longest_streak, current_streak)
return longest_streak
The implementation begins by converting the input array into a set called number_set. This removes duplicates and enables constant-time membership checks.
The variable longest_streak tracks the best answer seen so far.
The loop iterates through every number in the set. Before attempting to build a sequence, the algorithm checks whether number - 1 exists. If it does, the current number is not the beginning of a sequence, so it is skipped.
When a valid starting point is found, the algorithm initializes a sequence counter and repeatedly checks for the next consecutive number. Each successful lookup increases the sequence length.
After fully expanding a sequence, the result is compared against longest_streak. At the end of processing, the maximum sequence length is returned.
Go Solution
func longestConsecutive(nums []int) int {
numberSet := make(map[int]bool)
for _, num := range nums {
numberSet[num] = true
}
longestStreak := 0
for num := range numberSet {
// Only start if this is the beginning
if !numberSet[num-1] {
currentNum := num
currentStreak := 1
for numberSet[currentNum+1] {
currentNum++
currentStreak++
}
if currentStreak > longestStreak {
longestStreak = currentStreak
}
}
}
return longestStreak
}
The Go implementation closely mirrors the Python solution. Since Go does not have a built-in hash set, a map[int]bool is used to simulate one.
When checking existence, Go maps return the zero value false for missing keys, which makes expressions such as numberSet[num-1] convenient for membership testing.
There is no special handling required for nil or empty slices. If nums is empty, the map remains empty, the loop never executes, and the function correctly returns 0.
Integer overflow is not a concern here because the problem constraints remain safely within Go's integer range.
Worked Examples
Example 1
Input:
nums = [100,4,200,1,3,2]
Set representation:
{100, 4, 200, 1, 3, 2}
| Current Number | Is Start? | Sequence Found | Length | Longest |
|---|---|---|---|---|
| 100 | Yes, 99 missing | [100] | 1 | 1 |
| 4 | No, 3 exists | Skip | - | 1 |
| 200 | Yes, 199 missing | [200] | 1 | 1 |
| 1 | Yes, 0 missing | [1,2,3,4] | 4 | 4 |
| 3 | No, 2 exists | Skip | - | 4 |
| 2 | No, 1 exists | Skip | - | 4 |
Output:
4
Example 2
Input:
nums = [0,3,7,2,5,8,4,6,0,1]
Set representation:
{0,1,2,3,4,5,6,7,8}
| Current Number | Is Start? | Sequence Found | Length | Longest |
|---|---|---|---|---|
| 0 | Yes | [0,1,2,3,4,5,6,7,8] | 9 | 9 |
| 1 | No | Skip | - | 9 |
| 2 | No | Skip | - | 9 |
| 3 | No | Skip | - | 9 |
The remaining values are skipped because they already belong to a previously explored sequence.
Output:
9
Example 3
Input:
nums = [1,0,1,2]
Set representation:
{0,1,2}
| Current Number | Is Start? | Sequence Found | Length | Longest |
|---|---|---|---|---|
| 1 | No | Skip | - | 0 |
| 0 | Yes | [0,1,2] | 3 | 3 |
| 2 | No | Skip | - | 3 |
Output:
3
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each number is inserted into the set once and expanded at most once |
| Space | O(n) | The hash set stores all unique numbers |
Although there is a nested while loop, the total work remains linear. Each number can only be visited once across all sequence expansions. This means the overall runtime is proportional to the number of unique elements in the input.
Test Cases
solution = Solution()
assert solution.longestConsecutive([]) == 0 # Empty input
assert solution.longestConsecutive([1]) == 1 # Single element
assert solution.longestConsecutive([100, 4, 200, 1, 3, 2]) == 4 # Example 1
assert solution.longestConsecutive([0, 3, 7, 2, 5, 8, 4, 6, 0, 1]) == 9 # Example 2
assert solution.longestConsecutive([1, 0, 1, 2]) == 3 # Example 3
assert solution.longestConsecutive([1, 2, 3, 4, 5]) == 5 # Already consecutive
assert solution.longestConsecutive([5, 4, 3, 2, 1]) == 5 # Reverse order
assert solution.longestConsecutive([10, 30, 20]) == 1 # No consecutive values
assert solution.longestConsecutive([-1, -2, -3, 0, 1]) == 5 # Negative values
assert solution.longestConsecutive([1, 2, 0, 1]) == 3 # Duplicates
assert solution.longestConsecutive([9, 1, 4, 7, 3, -1, 0, 5, 8, -1, 6]) == 7 # Mixed values
assert solution.longestConsecutive([1000000, 999999, 999998]) == 3 # Large integers
| Test | Why |
|---|---|
[] |
Validates empty input handling |
[1] |
Verifies single-element behavior |
[100,4,200,1,3,2] |
Confirms example case |
[0,3,7,2,5,8,4,6,0,1] |
Tests long sequence with duplicates |
[1,0,1,2] |
Ensures duplicates do not inflate length |
[1,2,3,4,5] |
Checks already consecutive input |
[5,4,3,2,1] |
Ensures ordering does not matter |
[10,30,20] |
Validates isolated numbers |
[-1,-2,-3,0,1] |
Confirms negative integers work |
[1,2,0,1] |
Additional duplicate handling |
[9,1,4,7,3,-1,0,5,8,-1,6] |
Tests mixed ordering and negatives |
[1000000,999999,999998] |
Validates large integer values |
Edge Cases
Empty Array
An empty input array is an important edge case because there are no numbers to process. A naive implementation might accidentally initialize the answer incorrectly or assume at least one element exists. In this implementation, the set is empty, the loop never executes, and longest_streak remains 0, which is returned correctly.
Duplicate Values
Duplicates can introduce subtle bugs if handled improperly. For example, [1,0,1,2] contains two 1s, but the longest sequence is still only [0,1,2]. By converting the input into a set, duplicates are automatically removed, preventing repeated work or inflated sequence lengths.
Negative Numbers
Since integers can be negative, the algorithm must correctly identify consecutive relationships across negative values. For example, [-3,-2,-1,0,1] forms a valid sequence of length 5. The implementation works naturally because arithmetic comparisons such as number - 1 and number + 1 behave identically for negative values.
No Consecutive Elements
Some arrays may contain completely unrelated numbers, such as [10,30,20]. In these cases, every number becomes its own sequence of length 1. The algorithm handles this correctly because each value qualifies as a starting point, but no expansions occur.
Very Long Consecutive Sequences
Inputs containing large consecutive chains, such as [1,2,3,...,100000], stress performance. A naive solution would repeatedly revisit the same elements, causing poor runtime. This implementation avoids redundancy by only expanding from sequence starts, preserving the required O(n) complexity even for maximum-sized inputs.