LeetCode 2876 - Count Visited Nodes in a Directed Graph

Difficulty: 🔴 Hard
Topics: Dynamic Programming, Depth-First Search, Graph Theory, Topological Sort, Memoization

Solution

Problem Understanding

The problem presents a directed graph with n nodes, where each node has exactly one outgoing edge defined by the array edges. Specifically, edges[i] indicates that there is a directed edge from node i to node edges[i]. The task is to simulate a traversal starting from each node: you keep following edges until you reach a node that has already been visited in the same traversal. For each starting node, you must determine the number of distinct nodes visited during this process and return an array answer containing these counts.

In other words, for each node, we are asked how many nodes we can reach before hitting a cycle or revisiting a node in that path. Because each node has exactly one outgoing edge, the graph consists of disjoint cycles and trees directed towards cycles. The constraints are sizable, up to 10^5 nodes, meaning any naive approach that revisits nodes multiple times will be too slow. Key points to note are that edges[i] != i so there are no self-loops and every node points somewhere, guaranteeing that every path eventually ends in a cycle.

Important edge cases include nodes that immediately point into a cycle, multiple cycles, or chains leading into a cycle. The graph may also have a "tail" sequence of nodes feeding into a cycle, and we must avoid recounting nodes multiple times to stay efficient.

Approaches

The brute-force approach would simulate the described traversal for each node independently. For each starting node, you would follow the edges and track visited nodes until reaching a node seen in the current traversal. You would then count the number of nodes visited. This is guaranteed to give the correct result, but for a graph of size n = 10^5, the worst-case time complexity is O(n^2). For example, a long chain feeding into a large cycle could require traversing most of the graph for many starting nodes, which is too slow.

The key observation for an optimal approach is that nodes along the same path leading to a cycle will share their "visited count", except for the offset depending on distance from the cycle. This naturally lends itself to memoization combined with depth-first search (DFS). By caching the number of nodes visited from each node after the first computation, we avoid recalculating the same path multiple times. The DFS approach also allows us to handle cycles correctly by marking nodes as "in process" and resolving cycle sizes to propagate counts along incoming paths.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n^2)	O(n)	Traverse each node independently, tracking visited nodes each time
Optimal	O(n)	O(n)	DFS with memoization, compute visited counts once per node, handle cycles efficiently

Algorithm Walkthrough

Initialize an array answer of size n to store the visited node counts for each node. Initialize a visited array to track the DFS state: 0 = unvisited, 1 = visiting, 2 = visited.
Define a recursive DFS function that starts from a node u. If u is fully visited (visited[u] == 2), return its stored count from answer[u].
If u is currently being visited (visited[u] == 1), we detected a cycle. Compute the cycle length by iterating from u until we return to u, store this length for all nodes in the cycle.
Mark u as visiting (visited[u] = 1) and recursively call DFS on edges[u] to compute the visited count of the next node.
Set answer[u] to 1 + answer[edges[u]], which propagates the count back to the current node.
Mark u as fully visited (visited[u] = 2) to avoid recomputation.
Repeat DFS for all nodes in the graph.
Return the answer array.

This algorithm works because each node is visited at most once, cycles are handled separately, and counts propagate backward along paths, ensuring the correct total number of distinct nodes for each start node.

Python Solution

from typing import List

class Solution:
    def countVisitedNodes(self, edges: List[int]) -> List[int]:
        n = len(edges)
        answer = [0] * n
        visited = [0] * n  # 0 = unvisited, 1 = visiting, 2 = visited
        
        def dfs(u: int) -> int:
            if visited[u] == 2:
                return answer[u]
            if visited[u] == 1:
                # Detected a cycle
                cycle_nodes = [u]
                v = edges[u]
                while v != u:
                    cycle_nodes.append(v)
                    v = edges[v]
                cycle_len = len(cycle_nodes)
                for node in cycle_nodes:
                    answer[node] = cycle_len
                    visited[node] = 2
                return answer[u]
            visited[u] = 1
            answer[u] = 1 + dfs(edges[u])
            visited[u] = 2
            return answer[u]
        
        for i in range(n):
            if visited[i] == 0:
                dfs(i)
        return answer

The implementation defines a DFS function that handles cycles by computing their length once and memoizing results for each node. Nodes outside cycles automatically accumulate counts based on distance to the cycle or the end node.

Go Solution

func countVisitedNodes(edges []int) []int {
    n := len(edges)
    answer := make([]int, n)
    visited := make([]int, n) // 0 = unvisited, 1 = visiting, 2 = visited
    
    var dfs func(int) int
    dfs = func(u int) int {
        if visited[u] == 2 {
            return answer[u]
        }
        if visited[u] == 1 {
            // cycle detected
            cycleNodes := []int{u}
            v := edges[u]
            for v != u {
                cycleNodes = append(cycleNodes, v)
                v = edges[v]
            }
            cycleLen := len(cycleNodes)
            for _, node := range cycleNodes {
                answer[node] = cycleLen
                visited[node] = 2
            }
            return answer[u]
        }
        visited[u] = 1
        answer[u] = 1 + dfs(edges[u])
        visited[u] = 2
        return answer[u]
    }
    
    for i := 0; i < n; i++ {
        if visited[i] == 0 {
            dfs(i)
        }
    }
    return answer
}

In Go, slices are used instead of lists, and the recursive DFS function is declared as a closure to capture variables. The same cycle-detection and memoization logic applies as in Python.

Worked Examples

Example 1: edges = [1,2,0,0]

Node	DFS Path	Visited Count	Comments
0	0->1->2->0	3	Cycle length = 3
1	1->2->0->1	3	Already computed via DFS from 0
2	2->0->1->2	3	Already computed
3	3->0->1->2->0	4	Includes node 3 + cycle nodes

Example 2: edges = [1,2,3,4,0]

Every node eventually reaches the single cycle 0->1->2->3->4->0. The cycle length is 5, and any node outside the cycle (none in this example) would add 1 per tail node.

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n)	Each node is visited once; DFS propagates counts efficiently with memoization
Space	O(n)	Arrays `answer` and `visited` store information per node; recursion stack depth ≤ n

The DFS ensures that each node is processed once, and cycles are resolved in linear time. No node is revisited unnecessarily.

Test Cases

# Provided examples
assert Solution().countVisitedNodes([1,2,0,0]) == [3,3,3,4]  # Example 1
assert Solution().countVisitedNodes([1,2,3,4,0]) == [5,5,5,5,5]  # Example 2

# Edge cases
assert Solution().countVisitedNodes([1,0]) == [2,2]  # Minimal cycle
assert Solution().countVisitedNodes([1,2,3,4,5,6,0]) == [7,7,7,7,7,7,7]  # Large single cycle
assert Solution().countVisitedNodes([1,2,3,4,5,0,6]) == [6,6,6,6,6,6,2]  # Tail leading to cycle + isolated node

Test	Why
`[1,2,0,0]`	Validates cycle handling and tail nodes
`[1,2,3,4,0]`	Single cycle spanning all nodes
`[1,0]`	Smallest cycle
`[1,2,3,4,5,6,0]`	Large single cycle for performance check
`[