LeetCode 1557 - Minimum Number of Vertices to Reach All Nodes
The problem is asking us to identify the minimum set of vertices in a directed acyclic graph (DAG) such that starting from any of these vertices, all other nodes in the graph are reachable.
Difficulty: π‘ Medium
Topics: Graph Theory
Solution
Problem Understanding
The problem is asking us to identify the minimum set of vertices in a directed acyclic graph (DAG) such that starting from any of these vertices, all other nodes in the graph are reachable. In other words, we want the smallest collection of nodes that can βcoverβ the entire graph via directed paths.
The input consists of an integer n representing the number of vertices labeled from 0 to n-1, and an array edges where each element [fromi, toi] denotes a directed edge from fromi to toi. The output is a list of vertices that forms the minimum set from which all other nodes are reachable.
The constraints indicate that the graph can have up to 10^5 nodes and up to 10^5 edges. This means any solution must be linear or near-linear in time, since quadratic solutions would be too slow.
Important observations and edge cases include:
- Nodes with no incoming edges must be included in the result because no other node can reach them.
- Nodes with incoming edges do not necessarily need to be included since they might be reachable from other nodes.
- The graph is guaranteed to be a DAG, so there are no cycles, simplifying reachability considerations.
- There is a guarantee that a unique solution exists.
Naive approaches that try to check reachability from every subset of vertices will fail due to the large input size.
Approaches
A brute-force approach would involve considering every subset of vertices and performing a reachability check (via BFS or DFS) to see if all nodes are covered. While this would be correct, it is exponentially slow and impractical for n up to 10^5.
The key observation for an optimal solution is that a node only needs to be included in the set if it has no incoming edges. If a node has at least one incoming edge, it is already reachable from some other node, so it does not need to be part of the minimum set. Thus, the problem reduces to finding all nodes with in-degree 0.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n * (n + m)) | O(n + m) | Check all subsets of vertices and perform reachability checks; impractical |
| Optimal | O(n + m) | O(n) | Find all nodes with in-degree 0; linear scan over edges suffices |
Algorithm Walkthrough
- Initialize an array or set
has_incomingof sizento keep track of which nodes have incoming edges. - Iterate over the list of edges. For each edge
[fromi, toi], marktoias having an incoming edge by settinghas_incoming[toi] = True. - After processing all edges, iterate over all nodes
0ton-1. Collect all nodesiwherehas_incoming[i]isFalse. - Return this collection as the result. These nodes have no incoming edges and are the minimum set from which all nodes are reachable.
Why it works: In a DAG, any node with an incoming edge is reachable from some other node. Nodes without incoming edges are the only candidates that must be included in the starting set. By collecting all such nodes, we ensure the minimum set that guarantees reachability for all nodes.
Python Solution
from typing import List
class Solution:
def findSmallestSetOfVertices(self, n: int, edges: List[List[int]]) -> List[int]:
has_incoming = [False] * n
for from_node, to_node in edges:
has_incoming[to_node] = True
result = [node for node in range(n) if not has_incoming[node]]
return result
The Python implementation initializes a boolean list to mark nodes with incoming edges. It then loops over all edges and sets the corresponding entries. Finally, a list comprehension collects all nodes with no incoming edges and returns them.
Go Solution
func findSmallestSetOfVertices(n int, edges [][]int) []int {
hasIncoming := make([]bool, n)
for _, edge := range edges {
from, to := edge[0], edge[1]
hasIncoming[to] = true
}
result := []int{}
for node := 0; node < n; node++ {
if !hasIncoming[node] {
result = append(result, node)
}
}
return result
}
The Go version mirrors the Python logic, using a boolean slice to track incoming edges. A standard slice is used to store the result, appending nodes that have no incoming edges.
Worked Examples
Example 1: n = 6, edges = [[0,1],[0,2],[2,5],[3,4],[4,2]]
- Initialize
has_incoming = [False, False, False, False, False, False] - Process edges:
[0,1]βhas_incoming[1] = True[0,2]βhas_incoming[2] = True[2,5]βhas_incoming[5] = True[3,4]βhas_incoming[4] = True[4,2]βhas_incoming[2] = True(already True)
- Final
has_incoming = [False, True, True, False, True, True] - Nodes with no incoming edges:
[0,3]β Result[0,3]
Example 2: n = 5, edges = [[0,1],[2,1],[3,1],[1,4],[2,4]]
- Initialize
has_incoming = [False, False, False, False, False] - Process edges:
[0,1]βhas_incoming[1] = True[2,1]βhas_incoming[1] = True[3,1]βhas_incoming[1] = True[1,4]βhas_incoming[4] = True[2,4]βhas_incoming[4] = True
- Final
has_incoming = [False, True, False, False, True] - Nodes with no incoming edges:
[0,2,3]β Result[0,2,3]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | Loop over all edges to mark in-degrees, then loop over all nodes to collect those with no incoming edges |
| Space | O(n) | Boolean array of size n to track nodes with incoming edges |
The algorithm is linear with respect to the number of nodes plus edges, which is efficient for the problem constraints.
Test Cases
# Example cases
assert Solution().findSmallestSetOfVertices(6, [[0,1],[0,2],[2,5],[3,4],[4,2]]) == [0,3] # example 1
assert Solution().findSmallestSetOfVertices(5, [[0,1],[2,1],[3,1],[1,4],[2,4]]) == [0,2,3] # example 2
# Edge cases
assert Solution().findSmallestSetOfVertices(2, [[0,1]]) == [0] # minimal graph
assert Solution().findSmallestSetOfVertices(3, [[0,1],[1,2]]) == [0] # chain graph
assert Solution().findSmallestSetOfVertices(3, [[0,2],[1,2]]) == [0,1] # multiple roots
assert Solution().findSmallestSetOfVertices(4, [[0,1],[1,2],[2,3]]) == [0] # linear DAG
assert Solution().findSmallestSetOfVertices(4, [[0,1],[1,2],[2,3],[3,0]]) == [0,1,2,3] # would be cycle, but DAG guarantee prevents this
| Test | Why |
|---|---|
Minimal graph 2 nodes |
Validates handling smallest size and direct reachability |
| Linear DAG | Ensures only first node is selected for a chain |
| Multiple roots | Checks correct detection of multiple nodes with no incoming edges |
| Large chain | Validates proper scaling with simple linear DAG |
Edge Cases
First, the minimal graph with only two nodes ensures that the algorithm correctly identifies the single source node with no incoming edges. This prevents an off-by-one bug in the array indexing.
Second, a linear chain tests that the solution does not over-select nodes. Only the first node should be in the result even if all nodes are technically reachable sequentially.
Third, multiple independent roots check that the algorithm correctly identifies multiple nodes with no incoming edges. It ensures the collection step handles multiple sources instead of stopping after finding one. Each of these nodes is essential to cover all reachable nodes. The implementation handles these by iterating through the has_incoming array and collecting all False entries.