LeetCode 352 - Data Stream as Disjoint Intervals
This problem asks us to process a stream of non-negative integers and continuously maintain a compact representation of all numbers seen so far. Instead of storing every individual number separately, we want to group consecutive numbers into disjoint intervals.
Difficulty: 🔴 Hard
Topics: Hash Table, Binary Search, Union-Find, Design, Data Stream, Ordered Set
Solution
LeetCode 352 - Data Stream as Disjoint Intervals
Problem Understanding
This problem asks us to process a stream of non-negative integers and continuously maintain a compact representation of all numbers seen so far.
Instead of storing every individual number separately, we want to group consecutive numbers into disjoint intervals. An interval [start, end] represents every integer from start through end, inclusive.
For example, if the stream contains:
1, 2, 3, 7
then the summarized intervals are:
[[1, 3], [7, 7]]
because 1, 2, 3 form one consecutive block, while 7 stands alone.
The class must support two operations:
addNum(value)
Adds a number into the data stream.
2. getIntervals()
Returns all disjoint intervals sorted by starting value.
The important detail is that numbers may arrive in arbitrary order. Adding a number can create several different situations:
- It may already exist
- It may create a new isolated interval
- It may extend an existing interval
- It may merge two intervals together
For example:
Current intervals: [[1, 1], [3, 3]]
Add 2
Result: [[1, 3]]
Here, 2 bridges the gap between two intervals and merges them.
The constraints are relatively small:
0 <= value <= 10^4- At most
3 * 10^4operations total - At most
10^2calls togetIntervals
These constraints suggest that a carefully designed ordered structure is sufficient. We do not need extremely advanced balancing structures, but we do need efficient interval merging.
Several edge cases are important:
- Adding duplicate values should not change anything
- Adding values adjacent to existing intervals must merge correctly
- Adding a value that bridges two intervals must combine them into one
- Adding values at the beginning or end of existing ranges must update boundaries properly
- The returned intervals must always remain sorted and non-overlapping
Approaches
Brute Force Approach
A straightforward solution is to store every number individually in a set.
Whenever getIntervals() is called:
- Extract all numbers
- Sort them
- Scan through the sorted list
- Merge consecutive numbers into intervals
This works because consecutive numbers naturally form intervals during a linear scan.
For example:
Numbers: {1, 2, 3, 6, 7}
Sorted: [1, 2, 3, 6, 7]
Intervals: [[1, 3], [6, 7]]
The approach is easy to implement and always correct. However, repeatedly sorting and rebuilding intervals becomes inefficient when the stream grows large.
If there are many getIntervals() calls, we repeatedly redo work that was already computed earlier.
Optimal Approach
The key observation is that intervals change only locally when inserting a new number.
When adding value, only neighboring intervals matter:
- An interval ending at
value - 1 - An interval starting at
value + 1
Everything else remains unchanged.
Therefore, instead of rebuilding intervals from scratch every time, we can maintain the intervals incrementally as numbers arrive.
We store intervals in sorted order and merge them immediately during insertion.
This gives us efficient updates and makes getIntervals() trivial because the intervals are always maintained in valid form.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | addNum: O(1), getIntervals: O(n log n) |
O(n) |
Rebuilds intervals from sorted numbers each time |
| Optimal | addNum: O(k), getIntervals: O(k) |
O(k) |
Maintains merged intervals incrementally |
Here, k is the number of disjoint intervals.
Algorithm Walkthrough
We maintain a sorted list of intervals.
Each interval is represented as:
[start, end]
The list is always:
- Sorted by
start - Non-overlapping
- Fully merged
Step 1: Find the insertion position
When inserting a new value, we scan intervals until we find where the value belongs.
We skip all intervals whose end is strictly smaller than value - 1.
These intervals are completely unrelated to the new number.
Step 2: Handle overlap or duplication
If the value already lies inside an interval, we do nothing.
Example:
Intervals: [[1, 5]]
Add 3
Since 3 already belongs to [1, 5], no update is needed.
Step 3: Merge adjacent intervals
If the value touches neighboring intervals, we merge them.
There are several possibilities:
Extend left interval
[[1, 3]]
Add 4
becomes:
[[1, 4]]
Extend right interval
[[5, 7]]
Add 4
becomes:
[[4, 7]]
Bridge two intervals
[[1, 2], [4, 5]]
Add 3
becomes:
[[1, 5]]
This is the most important case.
Step 4: Insert isolated interval
If no merge is possible, insert a new interval:
[[1, 2], [6, 7]]
Add 4
becomes:
[[1, 2], [4, 4], [6, 7]]
Step 5: Return intervals
Because intervals are maintained incrementally, getIntervals() simply returns the stored intervals.
Why it works
The algorithm maintains the invariant that intervals are always sorted, disjoint, and fully merged.
Every insertion examines only neighboring intervals because only adjacent ranges can possibly change after inserting one value. By immediately merging intervals whenever adjacency occurs, we guarantee that no two intervals remain mergeable afterward.
Therefore, the interval list always represents the exact set of inserted numbers.
Python Solution
from typing import List
class SummaryRanges:
def __init__(self):
self.intervals = []
def addNum(self, value: int) -> None:
new_interval = [value, value]
result = []
inserted = False
for start, end in self.intervals:
# Current interval is completely before new interval
if end + 1 < new_interval[0]:
result.append([start, end])
# Current interval is completely after new interval
elif new_interval[1] + 1 < start:
if not inserted:
result.append(new_interval)
inserted = True
result.append([start, end])
# Intervals overlap or touch, merge them
else:
new_interval[0] = min(new_interval[0], start)
new_interval[1] = max(new_interval[1], end)
if not inserted:
result.append(new_interval)
self.intervals = result
def getIntervals(self) -> List[List[int]]:
return self.intervals
The implementation maintains a sorted interval list at all times.
Inside addNum, we create a temporary interval [value, value]. As we scan existing intervals, we classify each one into one of three categories.
If the current interval lies completely before the new interval, we simply copy it into the result.
If the current interval lies completely after the new interval, we insert the new interval first if it has not yet been inserted.
Otherwise, the intervals overlap or touch, so we merge them by updating the boundaries of new_interval.
At the end of the scan, if the new interval was never inserted, we append it.
Because merging happens during insertion, the interval list always stays valid.
The getIntervals() method is simple because all maintenance work has already been done incrementally.
Go Solution
type SummaryRanges struct {
intervals [][]int
}
func Constructor() SummaryRanges {
return SummaryRanges{
intervals: [][]int{},
}
}
func (this *SummaryRanges) AddNum(value int) {
newInterval := []int{value, value}
result := [][]int{}
inserted := false
for _, interval := range this.intervals {
start := interval[0]
end := interval[1]
// Current interval is completely before new interval
if end+1 < newInterval[0] {
result = append(result, []int{start, end})
// Current interval is completely after new interval
} else if newInterval[1]+1 < start {
if !inserted {
result = append(result, newInterval)
inserted = true
}
result = append(result, []int{start, end})
// Overlapping or adjacent intervals
} else {
if start < newInterval[0] {
newInterval[0] = start
}
if end > newInterval[1] {
newInterval[1] = end
}
}
}
if !inserted {
result = append(result, newInterval)
}
this.intervals = result
}
func (this *SummaryRanges) GetIntervals() [][]int {
return this.intervals
}
/**
* Your SummaryRanges object will be instantiated and called as such:
* obj := Constructor();
* obj.AddNum(value);
* param_2 := obj.GetIntervals();
*/
The Go implementation closely mirrors the Python logic.
One important difference is slice handling. In Go, intervals are stored as [][]int, and we explicitly create new slices when appending intervals into the result.
Go also distinguishes between nil slices and empty slices, but both work correctly here because LeetCode accepts either representation for empty results.
Integer overflow is not a concern because values are bounded by 10^4.
Worked Examples
Example 1
Operations:
addNum(1)
| Step | Intervals |
|---|---|
| Start | [] |
Insert [1,1] |
[[1,1]] |
Result:
[[1,1]]
Add 3
addNum(3)
| Step | Intervals |
|---|---|
| Current | [[1,1]] |
3 is separate |
[[1,1],[3,3]] |
Result:
[[1,1],[3,3]]
Add 7
addNum(7)
| Step | Intervals |
|---|---|
| Current | [[1,1],[3,3]] |
| Insert separate interval | [[1,1],[3,3],[7,7]] |
Result:
[[1,1],[3,3],[7,7]]
Add 2
addNum(2)
Initial intervals:
[[1,1],[3,3],[7,7]]
Processing:
| Current Interval | Action | New Interval |
|---|---|---|
[1,1] |
Adjacent to 2 | [1,2] |
[3,3] |
Adjacent to merged interval | [1,3] |
[7,7] |
Separate | unchanged |
Final intervals:
[[1,3],[7,7]]
Add 6
addNum(6)
Initial:
[[1,3],[7,7]]
Processing:
| Current Interval | Action |
|---|---|
[1,3] |
Separate |
[7,7] |
Adjacent to 6, merge |
Final:
[[1,3],[6,7]]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(k) per addNum |
We scan all current intervals once |
| Space | O(k) |
We store disjoint intervals |
Here, k is the number of disjoint intervals.
The complexity depends on the number of intervals rather than the total number of inserted values. This is important because the follow-up specifically highlights cases where many values merge into a small number of intervals.
If most numbers become part of large merged ranges, k remains small, making the solution efficient in practice.
Test Cases
from typing import List
class SummaryRanges:
def __init__(self):
self.intervals = []
def addNum(self, value: int) -> None:
new_interval = [value, value]
result = []
inserted = False
for start, end in self.intervals:
if end + 1 < new_interval[0]:
result.append([start, end])
elif new_interval[1] + 1 < start:
if not inserted:
result.append(new_interval)
inserted = True
result.append([start, end])
else:
new_interval[0] = min(new_interval[0], start)
new_interval[1] = max(new_interval[1], end)
if not inserted:
result.append(new_interval)
self.intervals = result
def getIntervals(self) -> List[List[int]]:
return self.intervals
# Example from problem statement
sr = SummaryRanges()
sr.addNum(1)
assert sr.getIntervals() == [[1, 1]] # single value
sr.addNum(3)
assert sr.getIntervals() == [[1, 1], [3, 3]] # disjoint intervals
sr.addNum(7)
assert sr.getIntervals() == [[1, 1], [3, 3], [7, 7]] # another isolated value
sr.addNum(2)
assert sr.getIntervals() == [[1, 3], [7, 7]] # bridge merge
sr.addNum(6)
assert sr.getIntervals() == [[1, 3], [6, 7]] # extend interval
# Duplicate insertion
sr = SummaryRanges()
sr.addNum(5)
sr.addNum(5)
assert sr.getIntervals() == [[5, 5]] # duplicate ignored
# Sequential inserts
sr = SummaryRanges()
for x in [1, 2, 3, 4, 5]:
sr.addNum(x)
assert sr.getIntervals() == [[1, 5]] # all merged
# Reverse order insertion
sr = SummaryRanges()
for x in [5, 4, 3, 2, 1]:
sr.addNum(x)
assert sr.getIntervals() == [[1, 5]] # order independent
# Separate intervals
sr = SummaryRanges()
for x in [1, 10, 20]:
sr.addNum(x)
assert sr.getIntervals() == [[1, 1], [10, 10], [20, 20]]
# Merge two large intervals
sr = SummaryRanges()
for x in [1, 2, 3, 7, 8, 9]:
sr.addNum(x)
sr.addNum(5)
assert sr.getIntervals() == [[1, 3], [5, 5], [7, 9]]
sr.addNum(4)
sr.addNum(6)
assert sr.getIntervals() == [[1, 9]] # full bridge merge
# Boundary values
sr = SummaryRanges()
sr.addNum(0)
sr.addNum(10000)
assert sr.getIntervals() == [[0, 0], [10000, 10000]]
# Empty structure
sr = SummaryRanges()
assert sr.getIntervals() == [] # no intervals
| Test | Why |
|---|---|
| Problem example | Validates all core operations |
| Duplicate insertion | Ensures duplicates do not create extra intervals |
| Sequential inserts | Tests repeated interval extension |
| Reverse insertion | Verifies insertion order independence |
| Separate intervals | Confirms isolated ranges remain distinct |
| Bridge merge | Tests merging two intervals through a connecting value |
| Boundary values | Validates handling of minimum and maximum allowed inputs |
| Empty structure | Ensures correct behavior before insertions |
Edge Cases
Duplicate Values
One subtle issue is inserting the same value multiple times.
For example:
Add 5
Add 5
A buggy implementation might incorrectly create duplicate intervals or split ranges.
This solution handles duplicates naturally because overlapping intervals are merged. If the value already exists inside an interval, the merge operation leaves the interval unchanged.
Bridging Two Intervals
The most important edge case is when a single value connects two previously separate intervals.
Example:
[[1,2], [4,5]]
Add 3
The correct result is:
[[1,5]]
Naive implementations often merge with only one side and forget the second merge.
This implementation avoids that issue because it continues scanning after merging. The growing new_interval can absorb multiple intervals during a single pass.
Insertion at the Beginning or End
Adding values smaller than all intervals or larger than all intervals can expose insertion-order bugs.
Examples:
[[5,7]]
Add 1
or
[[1,3]]
Add 10
The algorithm correctly handles both situations because intervals are inserted only when the appropriate location is found. If insertion never occurs during scanning, the interval is appended at the end.
Empty Data Stream
Calling getIntervals() before any insertions should return an empty list.
Some implementations accidentally return None or an uninitialized structure.
This solution initializes self.intervals as an empty list from the start, so the method correctly returns:
[]