LeetCode 444 - Sequence Reconstruction

This problem asks us to determine whether the given array nums is both: 1. A valid shortest supersequence of all arrays in sequences 2. The only possible shortest supersequence A supersequence is a sequence that contains every sequence in sequences as a subsequence.

LeetCode Problem 444

Difficulty: 🟡 Medium
Topics: Array, Graph Theory, Topological Sort

Solution

Problem Understanding

This problem asks us to determine whether the given array nums is both:

  1. A valid shortest supersequence of all arrays in sequences
  2. The only possible shortest supersequence

A supersequence is a sequence that contains every sequence in sequences as a subsequence. Since subsequences only require relative ordering, each sequence imposes ordering constraints between elements.

For example, if one sequence is [1, 3], then any valid supersequence must place 1 before 3.

The key challenge is uniqueness. Multiple valid shortest supersequences may exist if the ordering constraints are incomplete.

Consider:

sequences = [[1,2],[1,3]]

Both [1,2,3] and [1,3,2] satisfy the constraints because nothing specifies whether 2 must come before 3 or vice versa.

The problem gives:

  • nums, a permutation of integers from 1 to n
  • sequences, a collection of subsequences derived from nums

We must return true only if:

  • nums can be reconstructed from the ordering constraints
  • no other shortest ordering is possible

The constraints are important:

  • n can be up to 10^4
  • total sequence length can be up to 10^5

This immediately rules out brute force generation of permutations or all topological orderings.

The problem naturally becomes a graph problem:

  • each number is a node
  • each adjacent pair in a sequence defines a directed edge

For example:

[1,2,3]

creates edges:

1 -> 2
2 -> 3

The question then becomes:

"Does this directed graph have exactly one valid topological ordering, and is it equal to nums?"

Several edge cases are important:

  • Missing numbers in sequences
  • Multiple valid orderings
  • Sequences too short to uniquely determine ordering
  • Duplicate edges introduced by multiple sequences
  • A valid ordering shorter than nums
  • Disconnected graph components

The problem guarantees:

  • nums is a permutation of 1..n
  • every sequence is a subsequence of nums

This means we do not need to validate whether sequences violate nums, but we still must verify uniqueness and completeness.

Approaches

Brute Force Approach

A brute force solution would attempt to generate all possible shortest supersequences that satisfy the ordering constraints imposed by sequences.

We could:

  1. Build all ordering constraints
  2. Generate every valid topological ordering
  3. Count how many exist
  4. Check whether exactly one exists and equals nums

This works because every shortest supersequence corresponds to a topological ordering of the dependency graph.

However, generating all topological orderings is extremely expensive. A graph with only a few unconstrained nodes can already produce factorially many orderings.

For example:

[1,2]
[1,3]
[1,4]

allows:

1,2,3,4
1,2,4,3
1,3,2,4
1,3,4,2
...

The number of valid orderings grows explosively.

With n = 10^4, brute force is completely infeasible.

Key Insight

The crucial observation is:

A directed acyclic graph has a unique topological ordering if and only if at every step of topological sorting there is exactly one node with indegree 0.

This transforms the problem into a standard topological sort with an additional uniqueness check.

We build the dependency graph from sequences and perform Kahn's algorithm:

  • maintain indegrees
  • repeatedly remove nodes with indegree 0

If at any moment more than one node has indegree 0, multiple valid orderings exist, so reconstruction is not unique.

We also verify that the produced ordering exactly matches nums.

This gives an efficient linear-time solution.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n!) O(n!) Generates all valid topological orderings
Optimal O(V + E) O(V + E) Uses topological sort with uniqueness checking

Here:

  • V = n
  • E is the number of ordering relations

Since total sequence length is at most 10^5, the optimal solution easily fits constraints.

Algorithm Walkthrough

1. Build the graph

We create:

  • an adjacency list for directed edges
  • an indegree array to count incoming edges

For every adjacent pair (a, b) in each sequence:

a must come before b

so we add:

a -> b

We use a set inside the adjacency list to avoid duplicate edges, since repeated edges would incorrectly inflate indegrees.

2. Initialize indegrees

Every time we add an edge:

a -> b

we increment:

indegree[b]

The indegree tells us how many prerequisites a node still has.

3. Find all nodes with indegree 0

These are nodes that currently have no unmet dependencies and can legally appear next in the ordering.

We place them into a queue.

4. Enforce uniqueness during topological sort

At every iteration:

  • if the queue contains more than one node, multiple valid choices exist
  • therefore multiple topological orderings exist
  • therefore reconstruction is not unique

So we immediately return false.

This is the core uniqueness condition.

5. Compare against nums

When we remove a node from the queue, it must match the corresponding position in nums.

If it does not, then even if the graph has a unique ordering, it is not the ordering given by nums.

6. Remove outgoing edges

For the current node:

  • decrement indegrees of all neighbors
  • if a neighbor's indegree becomes 0, add it to the queue

This simulates satisfying dependencies.

7. Verify full reconstruction

At the end:

  • if we processed exactly n nodes, reconstruction succeeded
  • otherwise some nodes were missing or unreachable

Return whether the processed count equals n.

Why it works

Kahn's topological sort always produces a valid topological ordering for a DAG.

The uniqueness property comes from the fact that if two or more nodes simultaneously have indegree 0, we may choose either one first, producing different valid orderings.

Therefore:

  • exactly one node with indegree 0 at every step implies a unique ordering
  • matching every extracted node against nums guarantees the unique ordering equals nums

Thus the algorithm correctly determines whether nums is the unique shortest supersequence.

Python Solution

from collections import defaultdict, deque
from typing import List

class Solution:
    def sequenceReconstruction(self, nums: List[int], sequences: List[List[int]]) -> bool:
        n = len(nums)

        graph = defaultdict(set)
        indegree = [0] * (n + 1)

        seen = set()

        for sequence in sequences:
            for num in sequence:
                seen.add(num)

            for i in range(len(sequence) - 1):
                u = sequence[i]
                v = sequence[i + 1]

                if v not in graph[u]:
                    graph[u].add(v)
                    indegree[v] += 1

        if len(seen) != n:
            return False

        queue = deque()

        for num in range(1, n + 1):
            if indegree[num] == 0:
                queue.append(num)

        index = 0

        while queue:
            if len(queue) > 1:
                return False

            current = queue.popleft()

            if nums[index] != current:
                return False

            index += 1

            for neighbor in graph[current]:
                indegree[neighbor] -= 1

                if indegree[neighbor] == 0:
                    queue.append(neighbor)

        return index == n

The implementation begins by constructing the directed graph and indegree array.

The adjacency list uses defaultdict(set) rather than a list because duplicate edges must be ignored. If we allowed duplicates, indegrees could become artificially large and prevent valid nodes from entering the queue.

The seen set tracks whether every number from 1 to n appears somewhere in sequences. If a number never appears, there is insufficient information to reconstruct nums, so we immediately return False.

Next, we initialize the queue with all nodes having indegree 0.

The main loop performs Kahn's topological sort. Before processing a node, we check whether multiple choices exist in the queue. If so, the ordering is not unique.

We also compare each extracted node against the corresponding value in nums. This ensures the unique ordering exactly matches the target permutation.

Finally, after processing all reachable nodes, we verify that exactly n nodes were reconstructed.

Go Solution

package main

import "container/list"

func sequenceReconstruction(nums []int, sequences [][]int) bool {
	n := len(nums)

	graph := make([]map[int]bool, n+1)
	indegree := make([]int, n+1)

	seen := make(map[int]bool)

	for _, seq := range sequences {
		for _, num := range seq {
			seen[num] = true
		}

		for i := 0; i < len(seq)-1; i++ {
			u := seq[i]
			v := seq[i+1]

			if graph[u] == nil {
				graph[u] = make(map[int]bool)
			}

			if !graph[u][v] {
				graph[u][v] = true
				indegree[v]++
			}
		}
	}

	if len(seen) != n {
		return false
	}

	queue := list.New()

	for i := 1; i <= n; i++ {
		if indegree[i] == 0 {
			queue.PushBack(i)
		}
	}

	index := 0

	for queue.Len() > 0 {
		if queue.Len() > 1 {
			return false
		}

		front := queue.Front()
		current := front.Value.(int)
		queue.Remove(front)

		if nums[index] != current {
			return false
		}

		index++

		for neighbor := range graph[current] {
			indegree[neighbor]--

			if indegree[neighbor] == 0 {
				queue.PushBack(neighbor)
			}
		}
	}

	return index == n
}

The Go implementation mirrors the Python solution closely.

Since Go does not have a built-in deque, we use container/list as the queue structure.

The adjacency list is implemented as:

[]map[int]bool

This allows efficient duplicate-edge detection.

Go slices are initialized with zero values automatically, so the indegree array starts correctly at all zeros.

No integer overflow concerns exist because all values are within problem constraints.

Worked Examples

Example 1

nums = [1,2,3]
sequences = [[1,2],[1,3]]

Graph Construction

Edges:

1 -> 2
1 -> 3

Indegrees:

Node Indegree
1 0
2 1
3 1

Initial queue:

[1]

Topological Sort

Step Queue Current Result
1 [1] 1 valid

After removing 1:

  • indegree[2] becomes 0
  • indegree[3] becomes 0

Queue becomes:

[2,3]

Now two choices exist.

Therefore multiple valid orderings exist:

[1,2,3]
[1,3,2]

Return:

false

Example 2

nums = [1,2,3]
sequences = [[1,2]]

Graph:

1 -> 2

Node 3 never appears in any constraint.

Initial indegrees:

Node Indegree
1 0
2 1
3 0

Initial queue:

[1,3]

Multiple nodes have indegree 0.

Therefore ordering is not unique.

Also, [1,2] itself is a shorter valid supersequence.

Return:

false

Example 3

nums = [1,2,3]
sequences = [[1,2],[1,3],[2,3]]

Edges:

1 -> 2
1 -> 3
2 -> 3

Indegrees:

Node Indegree
1 0
2 1
3 2

Initial queue:

[1]

Process 1:

Node Updated Indegree
2 0
3 1

Queue:

[2]

Process 2:

Node Updated Indegree
3 0

Queue:

[3]

Only one choice existed at every step.

Ordering reconstructed:

[1,2,3]

Return:

true

Complexity Analysis

Measure Complexity Explanation
Time O(V + E) Every node and edge is processed once
Space O(V + E) Graph, indegree array, and queue storage

The graph construction processes every adjacent pair in all sequences exactly once.

Kahn's algorithm also processes every node and edge once:

  • each node enters the queue once
  • each edge is relaxed once

Since total sequence length is bounded by 10^5, the solution runs efficiently within constraints.

Test Cases

sol = Solution()

# Provided examples
assert sol.sequenceReconstruction(
    [1,2,3],
    [[1,2],[1,3]]
) == False  # multiple valid orderings

assert sol.sequenceReconstruction(
    [1,2,3],
    [[1,2]]
) == False  # nums is not shortest

assert sol.sequenceReconstruction(
    [1,2,3],
    [[1,2],[1,3],[2,3]]
) == True  # unique reconstruction

# Single element
assert sol.sequenceReconstruction(
    [1],
    [[1]]
) == True  # trivial reconstruction

# Missing node
assert sol.sequenceReconstruction(
    [1,2,3],
    [[1,2]]
) == False  # node 3 missing

# Duplicate relations across sequences
assert sol.sequenceReconstruction(
    [1,2,3],
    [[1,2],[1,2],[2,3]]
) == True  # duplicate edges handled correctly

# Multiple valid orderings
assert sol.sequenceReconstruction(
    [1,2,3,4],
    [[1,2],[1,3],[1,4]]
) == False  # 2,3,4 can appear in any order

# Fully constrained ordering
assert sol.sequenceReconstruction(
    [1,2,3,4],
    [[1,2],[2,3],[3,4]]
) == True  # exact chain

# Disconnected graph
assert sol.sequenceReconstruction(
    [1,2,3,4],
    [[1,2],[3,4]]
) == False  # disconnected components

# Long chain
assert sol.sequenceReconstruction(
    [1,2,3,4,5],
    [[1,2],[2,3],[3,4],[4,5]]
) == True  # unique long ordering

Test Summary

Test Why
[[1,2],[1,3]] Verifies detection of multiple orderings
[[1,2]] Verifies shortest-supersequence condition
[[1,2],[1,3],[2,3]] Verifies successful unique reconstruction
Single element Smallest valid input
Missing node Ensures incomplete information fails
Duplicate edges Confirms duplicate constraints do not break indegrees
Multiple branches Tests ambiguity detection
Full chain Tests uniquely constrained graph
Disconnected graph Ensures disconnected components fail
Long chain Validates scalability and correctness

Edge Cases

One important edge case occurs when some numbers from nums never appear anywhere in sequences. In that situation, there is insufficient information to force those numbers into a unique position. For example:

nums = [1,2,3]
sequences = [[1,2]]

Number 3 never appears, so multiple shortest supersequences are possible. The implementation handles this by tracking all seen numbers and verifying that exactly n distinct numbers appear.

Another subtle edge case involves duplicate edges. Multiple sequences may contain the same adjacent relationship:

[[1,2],[1,2],[2,3]]

If duplicate edges were added repeatedly, indegrees would become incorrect. Node 2 could incorrectly receive indegree 2 instead of 1. The solution avoids this by storing neighbors in a set and only incrementing indegree when a new edge is first added.

A third important edge case is disconnected components:

[[1,2],[3,4]]

Both [1,2,3,4] and [3,4,1,2] satisfy the constraints. Since multiple components have indegree 0 simultaneously, multiple valid orderings exist. The queue-size check correctly detects this ambiguity and returns False.

Another tricky scenario occurs when the graph produces a unique ordering that does not equal nums. Even if reconstruction is unique, it must exactly match the provided permutation. The implementation validates this by comparing each node removed from the queue against the corresponding index in nums.