LeetCode 352 - Data Stream as Disjoint Intervals

This problem asks us to process a stream of non-negative integers and continuously maintain a compact representation of all numbers seen so far. Instead of storing every individual number separately, we want to group consecutive numbers into disjoint intervals.

LeetCode Problem 352

Difficulty: 🔴 Hard
Topics: Hash Table, Binary Search, Union-Find, Design, Data Stream, Ordered Set

Solution

LeetCode 352 - Data Stream as Disjoint Intervals

Problem Understanding

This problem asks us to process a stream of non-negative integers and continuously maintain a compact representation of all numbers seen so far.

Instead of storing every individual number separately, we want to group consecutive numbers into disjoint intervals. An interval [start, end] represents every integer from start through end, inclusive.

For example, if the stream contains:

1, 2, 3, 7

then the summarized intervals are:

[[1, 3], [7, 7]]

because 1, 2, 3 form one consecutive block, while 7 stands alone.

The class must support two operations:

  1. addNum(value)

Adds a number into the data stream. 2. getIntervals()

Returns all disjoint intervals sorted by starting value.

The important detail is that numbers may arrive in arbitrary order. Adding a number can create several different situations:

  • It may already exist
  • It may create a new isolated interval
  • It may extend an existing interval
  • It may merge two intervals together

For example:

Current intervals: [[1, 1], [3, 3]]
Add 2
Result: [[1, 3]]

Here, 2 bridges the gap between two intervals and merges them.

The constraints are relatively small:

  • 0 <= value <= 10^4
  • At most 3 * 10^4 operations total
  • At most 10^2 calls to getIntervals

These constraints suggest that a carefully designed ordered structure is sufficient. We do not need extremely advanced balancing structures, but we do need efficient interval merging.

Several edge cases are important:

  • Adding duplicate values should not change anything
  • Adding values adjacent to existing intervals must merge correctly
  • Adding a value that bridges two intervals must combine them into one
  • Adding values at the beginning or end of existing ranges must update boundaries properly
  • The returned intervals must always remain sorted and non-overlapping

Approaches

Brute Force Approach

A straightforward solution is to store every number individually in a set.

Whenever getIntervals() is called:

  1. Extract all numbers
  2. Sort them
  3. Scan through the sorted list
  4. Merge consecutive numbers into intervals

This works because consecutive numbers naturally form intervals during a linear scan.

For example:

Numbers: {1, 2, 3, 6, 7}
Sorted: [1, 2, 3, 6, 7]
Intervals: [[1, 3], [6, 7]]

The approach is easy to implement and always correct. However, repeatedly sorting and rebuilding intervals becomes inefficient when the stream grows large.

If there are many getIntervals() calls, we repeatedly redo work that was already computed earlier.

Optimal Approach

The key observation is that intervals change only locally when inserting a new number.

When adding value, only neighboring intervals matter:

  • An interval ending at value - 1
  • An interval starting at value + 1

Everything else remains unchanged.

Therefore, instead of rebuilding intervals from scratch every time, we can maintain the intervals incrementally as numbers arrive.

We store intervals in sorted order and merge them immediately during insertion.

This gives us efficient updates and makes getIntervals() trivial because the intervals are always maintained in valid form.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force addNum: O(1), getIntervals: O(n log n) O(n) Rebuilds intervals from sorted numbers each time
Optimal addNum: O(k), getIntervals: O(k) O(k) Maintains merged intervals incrementally

Here, k is the number of disjoint intervals.

Algorithm Walkthrough

We maintain a sorted list of intervals.

Each interval is represented as:

[start, end]

The list is always:

  • Sorted by start
  • Non-overlapping
  • Fully merged

Step 1: Find the insertion position

When inserting a new value, we scan intervals until we find where the value belongs.

We skip all intervals whose end is strictly smaller than value - 1.

These intervals are completely unrelated to the new number.

Step 2: Handle overlap or duplication

If the value already lies inside an interval, we do nothing.

Example:

Intervals: [[1, 5]]
Add 3

Since 3 already belongs to [1, 5], no update is needed.

Step 3: Merge adjacent intervals

If the value touches neighboring intervals, we merge them.

There are several possibilities:

Extend left interval

[[1, 3]]
Add 4

becomes:

[[1, 4]]

Extend right interval

[[5, 7]]
Add 4

becomes:

[[4, 7]]

Bridge two intervals

[[1, 2], [4, 5]]
Add 3

becomes:

[[1, 5]]

This is the most important case.

Step 4: Insert isolated interval

If no merge is possible, insert a new interval:

[[1, 2], [6, 7]]
Add 4

becomes:

[[1, 2], [4, 4], [6, 7]]

Step 5: Return intervals

Because intervals are maintained incrementally, getIntervals() simply returns the stored intervals.

Why it works

The algorithm maintains the invariant that intervals are always sorted, disjoint, and fully merged.

Every insertion examines only neighboring intervals because only adjacent ranges can possibly change after inserting one value. By immediately merging intervals whenever adjacency occurs, we guarantee that no two intervals remain mergeable afterward.

Therefore, the interval list always represents the exact set of inserted numbers.

Python Solution

from typing import List

class SummaryRanges:

    def __init__(self):
        self.intervals = []

    def addNum(self, value: int) -> None:
        new_interval = [value, value]
        result = []
        inserted = False

        for start, end in self.intervals:

            # Current interval is completely before new interval
            if end + 1 < new_interval[0]:
                result.append([start, end])

            # Current interval is completely after new interval
            elif new_interval[1] + 1 < start:
                if not inserted:
                    result.append(new_interval)
                    inserted = True

                result.append([start, end])

            # Intervals overlap or touch, merge them
            else:
                new_interval[0] = min(new_interval[0], start)
                new_interval[1] = max(new_interval[1], end)

        if not inserted:
            result.append(new_interval)

        self.intervals = result

    def getIntervals(self) -> List[List[int]]:
        return self.intervals

The implementation maintains a sorted interval list at all times.

Inside addNum, we create a temporary interval [value, value]. As we scan existing intervals, we classify each one into one of three categories.

If the current interval lies completely before the new interval, we simply copy it into the result.

If the current interval lies completely after the new interval, we insert the new interval first if it has not yet been inserted.

Otherwise, the intervals overlap or touch, so we merge them by updating the boundaries of new_interval.

At the end of the scan, if the new interval was never inserted, we append it.

Because merging happens during insertion, the interval list always stays valid.

The getIntervals() method is simple because all maintenance work has already been done incrementally.

Go Solution

type SummaryRanges struct {
	intervals [][]int
}

func Constructor() SummaryRanges {
	return SummaryRanges{
		intervals: [][]int{},
	}
}

func (this *SummaryRanges) AddNum(value int) {
	newInterval := []int{value, value}
	result := [][]int{}
	inserted := false

	for _, interval := range this.intervals {
		start := interval[0]
		end := interval[1]

		// Current interval is completely before new interval
		if end+1 < newInterval[0] {
			result = append(result, []int{start, end})

		// Current interval is completely after new interval
		} else if newInterval[1]+1 < start {
			if !inserted {
				result = append(result, newInterval)
				inserted = true
			}

			result = append(result, []int{start, end})

		// Overlapping or adjacent intervals
		} else {
			if start < newInterval[0] {
				newInterval[0] = start
			}

			if end > newInterval[1] {
				newInterval[1] = end
			}
		}
	}

	if !inserted {
		result = append(result, newInterval)
	}

	this.intervals = result
}

func (this *SummaryRanges) GetIntervals() [][]int {
	return this.intervals
}

/**
 * Your SummaryRanges object will be instantiated and called as such:
 * obj := Constructor();
 * obj.AddNum(value);
 * param_2 := obj.GetIntervals();
 */

The Go implementation closely mirrors the Python logic.

One important difference is slice handling. In Go, intervals are stored as [][]int, and we explicitly create new slices when appending intervals into the result.

Go also distinguishes between nil slices and empty slices, but both work correctly here because LeetCode accepts either representation for empty results.

Integer overflow is not a concern because values are bounded by 10^4.

Worked Examples

Example 1

Operations:

addNum(1)
Step Intervals
Start []
Insert [1,1] [[1,1]]

Result:

[[1,1]]

Add 3

addNum(3)
Step Intervals
Current [[1,1]]
3 is separate [[1,1],[3,3]]

Result:

[[1,1],[3,3]]

Add 7

addNum(7)
Step Intervals
Current [[1,1],[3,3]]
Insert separate interval [[1,1],[3,3],[7,7]]

Result:

[[1,1],[3,3],[7,7]]

Add 2

addNum(2)

Initial intervals:

[[1,1],[3,3],[7,7]]

Processing:

Current Interval Action New Interval
[1,1] Adjacent to 2 [1,2]
[3,3] Adjacent to merged interval [1,3]
[7,7] Separate unchanged

Final intervals:

[[1,3],[7,7]]

Add 6

addNum(6)

Initial:

[[1,3],[7,7]]

Processing:

Current Interval Action
[1,3] Separate
[7,7] Adjacent to 6, merge

Final:

[[1,3],[6,7]]

Complexity Analysis

Measure Complexity Explanation
Time O(k) per addNum We scan all current intervals once
Space O(k) We store disjoint intervals

Here, k is the number of disjoint intervals.

The complexity depends on the number of intervals rather than the total number of inserted values. This is important because the follow-up specifically highlights cases where many values merge into a small number of intervals.

If most numbers become part of large merged ranges, k remains small, making the solution efficient in practice.

Test Cases

from typing import List

class SummaryRanges:

    def __init__(self):
        self.intervals = []

    def addNum(self, value: int) -> None:
        new_interval = [value, value]
        result = []
        inserted = False

        for start, end in self.intervals:

            if end + 1 < new_interval[0]:
                result.append([start, end])

            elif new_interval[1] + 1 < start:
                if not inserted:
                    result.append(new_interval)
                    inserted = True

                result.append([start, end])

            else:
                new_interval[0] = min(new_interval[0], start)
                new_interval[1] = max(new_interval[1], end)

        if not inserted:
            result.append(new_interval)

        self.intervals = result

    def getIntervals(self) -> List[List[int]]:
        return self.intervals

# Example from problem statement
sr = SummaryRanges()
sr.addNum(1)
assert sr.getIntervals() == [[1, 1]]  # single value

sr.addNum(3)
assert sr.getIntervals() == [[1, 1], [3, 3]]  # disjoint intervals

sr.addNum(7)
assert sr.getIntervals() == [[1, 1], [3, 3], [7, 7]]  # another isolated value

sr.addNum(2)
assert sr.getIntervals() == [[1, 3], [7, 7]]  # bridge merge

sr.addNum(6)
assert sr.getIntervals() == [[1, 3], [6, 7]]  # extend interval

# Duplicate insertion
sr = SummaryRanges()
sr.addNum(5)
sr.addNum(5)
assert sr.getIntervals() == [[5, 5]]  # duplicate ignored

# Sequential inserts
sr = SummaryRanges()
for x in [1, 2, 3, 4, 5]:
    sr.addNum(x)

assert sr.getIntervals() == [[1, 5]]  # all merged

# Reverse order insertion
sr = SummaryRanges()
for x in [5, 4, 3, 2, 1]:
    sr.addNum(x)

assert sr.getIntervals() == [[1, 5]]  # order independent

# Separate intervals
sr = SummaryRanges()
for x in [1, 10, 20]:
    sr.addNum(x)

assert sr.getIntervals() == [[1, 1], [10, 10], [20, 20]]

# Merge two large intervals
sr = SummaryRanges()
for x in [1, 2, 3, 7, 8, 9]:
    sr.addNum(x)

sr.addNum(5)
assert sr.getIntervals() == [[1, 3], [5, 5], [7, 9]]

sr.addNum(4)
sr.addNum(6)

assert sr.getIntervals() == [[1, 9]]  # full bridge merge

# Boundary values
sr = SummaryRanges()
sr.addNum(0)
sr.addNum(10000)

assert sr.getIntervals() == [[0, 0], [10000, 10000]]

# Empty structure
sr = SummaryRanges()
assert sr.getIntervals() == []  # no intervals
Test Why
Problem example Validates all core operations
Duplicate insertion Ensures duplicates do not create extra intervals
Sequential inserts Tests repeated interval extension
Reverse insertion Verifies insertion order independence
Separate intervals Confirms isolated ranges remain distinct
Bridge merge Tests merging two intervals through a connecting value
Boundary values Validates handling of minimum and maximum allowed inputs
Empty structure Ensures correct behavior before insertions

Edge Cases

Duplicate Values

One subtle issue is inserting the same value multiple times.

For example:

Add 5
Add 5

A buggy implementation might incorrectly create duplicate intervals or split ranges.

This solution handles duplicates naturally because overlapping intervals are merged. If the value already exists inside an interval, the merge operation leaves the interval unchanged.

Bridging Two Intervals

The most important edge case is when a single value connects two previously separate intervals.

Example:

[[1,2], [4,5]]
Add 3

The correct result is:

[[1,5]]

Naive implementations often merge with only one side and forget the second merge.

This implementation avoids that issue because it continues scanning after merging. The growing new_interval can absorb multiple intervals during a single pass.

Insertion at the Beginning or End

Adding values smaller than all intervals or larger than all intervals can expose insertion-order bugs.

Examples:

[[5,7]]
Add 1

or

[[1,3]]
Add 10

The algorithm correctly handles both situations because intervals are inserted only when the appropriate location is found. If insertion never occurs during scanning, the interval is appended at the end.

Empty Data Stream

Calling getIntervals() before any insertions should return an empty list.

Some implementations accidentally return None or an uninitialized structure.

This solution initializes self.intervals as an empty list from the start, so the method correctly returns:

[]