LeetCode 323 - Number of Connected Components in an Undirected Graph

The problem asks us to determine how many connected components exist in an undirected graph. A connected component is a group of nodes where every node can reach every other node through some path.

LeetCode Problem 323

Difficulty: 🟡 Medium
Topics: Depth-First Search, Breadth-First Search, Union-Find, Graph Theory

Solution

Problem Understanding

The problem asks us to determine how many connected components exist in an undirected graph.

A connected component is a group of nodes where every node can reach every other node through some path. If two groups of nodes have no path between them, they belong to different connected components.

The input consists of two parts:

  • n, the total number of nodes in the graph
  • edges, a list of undirected edges where each edge [a, b] connects node a and node b

The graph uses node labels from 0 to n - 1.

For example, if n = 5 and the edges are:

0 - 1 - 2

3 - 4

then there are two disconnected groups:

  • {0,1,2}
  • {3,4}

so the answer is 2.

Because the graph is undirected, an edge [a, b] means travel is possible both from a to b and from b to a.

The constraints are relatively moderate:

  • Up to 2000 nodes
  • Up to 5000 edges

This is small enough for standard graph traversal algorithms such as Depth-First Search, Breadth-First Search, or Union-Find.

Several edge cases are important to recognize early:

  • A graph with no edges at all, every node is its own connected component.
  • A graph where all nodes are connected together, the answer is 1.
  • Isolated nodes must still count as components.
  • The graph may contain multiple disconnected clusters of varying sizes.
  • The problem guarantees there are no duplicate edges and no self-loops, which simplifies traversal logic.

Approaches

Brute-Force Approach

A naive brute-force strategy would attempt to determine connectivity between every pair of nodes independently.

For each node, we could repeatedly scan all edges to determine which nodes are reachable, potentially recomputing the same connectivity information many times. For example, for every node we might run a fresh traversal without remembering previously visited components.

This approach is correct because eventually every reachable node relationship is explored. However, it performs a large amount of redundant work. The same connected region may be traversed over and over again.

In the worst case, repeatedly searching from each node leads to roughly O(n * (n + e)) complexity, where:

  • n is the number of nodes
  • e is the number of edges

This becomes inefficient as the graph grows.

Optimal Approach

The key insight is that each connected component only needs to be explored once.

If we start from an unvisited node and perform a graph traversal, either DFS or BFS, we will visit every node in that connected component. Once visited, those nodes never need to be explored again.

The algorithm becomes:

  1. Build an adjacency list representation of the graph.
  2. Maintain a visited set.
  3. Iterate through every node.
  4. Whenever we encounter an unvisited node, start a DFS/BFS from it.
  5. Mark all reachable nodes as visited.
  6. Increment the component count.

Each node and edge is processed only once, giving an efficient linear-time solution.

Approach Time Complexity Space Complexity Notes
Brute Force O(n × (n + e)) O(n) Repeatedly traverses the same regions
Optimal DFS/BFS O(n + e) O(n + e) Visits each node and edge once

Algorithm Walkthrough

Optimal DFS Algorithm

  1. Create an adjacency list for the graph.

Since the graph is undirected, each edge [a, b] must be added in both directions:

  • a -> b
  • b -> a

An adjacency list is chosen because it efficiently stores sparse graphs and allows fast traversal of neighbors. 2. Create a visited set.

This tracks which nodes have already been explored. Without it, the traversal could revisit nodes indefinitely in cyclic graphs. 3. Initialize a counter called components to 0.

This variable stores the number of connected components discovered so far. 4. Iterate through every node from 0 to n - 1.

Each node represents a possible starting point for a new connected component. 5. If the current node has not been visited, start a DFS traversal from it.

This means we have discovered a completely new connected component. 6. During DFS:

  • Mark the current node as visited.
  • Recursively visit all unvisited neighbors.

The recursion explores the entire connected region. 7. After DFS completes, increment the component count.

At this point, the entire component has been fully explored. 8. Continue scanning remaining nodes.

Any node already visited belongs to a previously discovered component, so it can be skipped. 9. Return the final component count.

Why it works

The algorithm relies on a fundamental graph property:

  • DFS starting from any node visits exactly all nodes reachable from that node.

Therefore:

  • Every DFS traversal identifies one complete connected component.
  • No component is counted twice because visited nodes are skipped.
  • Every node belongs to exactly one traversal.

This guarantees the final count is correct.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        graph = defaultdict(list)

        # Build adjacency list
        for a, b in edges:
            graph[a].append(b)
            graph[b].append(a)

        visited = set()

        def dfs(node: int) -> None:
            visited.add(node)

            for neighbor in graph[node]:
                if neighbor not in visited:
                    dfs(neighbor)

        components = 0

        for node in range(n):
            if node not in visited:
                dfs(node)
                components += 1

        return components

The implementation begins by constructing an adjacency list using defaultdict(list). This allows efficient neighbor lookup for every node.

Each edge is inserted in both directions because the graph is undirected.

The visited set prevents revisiting nodes. This is critical for both correctness and efficiency, especially when cycles exist in the graph.

The nested dfs function recursively explores all nodes reachable from the current node. Every visited neighbor continues the traversal deeper into the same connected component.

The outer loop scans every node. Whenever an unvisited node is found, it represents a new connected component, so DFS is launched and the component count increases.

Finally, the total number of connected components is returned.

Go Solution

package main

func countComponents(n int, edges [][]int) int {
	graph := make([][]int, n)

	// Build adjacency list
	for _, edge := range edges {
		a := edge[0]
		b := edge[1]

		graph[a] = append(graph[a], b)
		graph[b] = append(graph[b], a)
	}

	visited := make([]bool, n)

	var dfs func(int)

	dfs = func(node int) {
		visited[node] = true

		for _, neighbor := range graph[node] {
			if !visited[neighbor] {
				dfs(neighbor)
			}
		}
	}

	components := 0

	for node := 0; node < n; node++ {
		if !visited[node] {
			dfs(node)
			components++
		}
	}

	return components
}

The Go implementation follows the same overall structure as the Python version.

Instead of a hash-based adjacency structure, Go uses a slice of slices:

graph := make([][]int, n)

This works efficiently because node labels are guaranteed to be within [0, n-1].

The visited structure is implemented using a boolean slice rather than a set:

visited := make([]bool, n)

This provides constant-time access with lower overhead.

Go closures require explicit declaration of the recursive function variable before assignment:

var dfs func(int)

No integer overflow concerns exist because the constraints are small.

Worked Examples

Example 1

n = 5
edges = [[0,1],[1,2],[3,4]]

Step 1: Build adjacency list

Node Neighbors
0 [1]
1 [0,2]
2 [1]
3 [4]
4 [3]

Step 2: Traverse nodes

Initial state:

Variable Value
visited {}
components 0

Visit node 0

Node 0 is unvisited, start DFS.

DFS traversal order:

0 -> 1 -> 2

Updated state:

Variable Value
visited {0,1,2}
components 1

Visit node 1

Already visited, skip.

Visit node 2

Already visited, skip.

Visit node 3

Unvisited, start DFS.

DFS traversal order:

3 -> 4

Updated state:

Variable Value
visited {0,1,2,3,4}
components 2

Visit node 4

Already visited, skip.

Final answer:

2

Example 2

n = 5
edges = [[0,1],[1,2],[2,3],[3,4]]

Step 1: Build adjacency list

Node Neighbors
0 [1]
1 [0,2]
2 [1,3]
3 [2,4]
4 [3]

Step 2: Traverse

Start DFS from node 0.

Traversal order:

0 -> 1 -> 2 -> 3 -> 4

After DFS:

Variable Value
visited {0,1,2,3,4}
components 1

All remaining nodes are already visited.

Final answer:

1

Complexity Analysis

Measure Complexity Explanation
Time O(n + e) Each node and edge is visited at most once
Space O(n + e) Adjacency list plus visited structure

The adjacency list stores every edge twice because the graph is undirected, resulting in O(e) storage.

The DFS traversal visits each node once and processes each edge once, producing linear runtime relative to the graph size.

The recursion stack may grow to O(n) in the worst case for a long chain-shaped graph.

Test Cases

from typing import List

class Solution:
    def countComponents(self, n: int, edges: List[List[int]]) -> int:
        graph = [[] for _ in range(n)]

        for a, b in edges:
            graph[a].append(b)
            graph[b].append(a)

        visited = set()

        def dfs(node):
            visited.add(node)

            for neighbor in graph[node]:
                if neighbor not in visited:
                    dfs(neighbor)

        components = 0

        for node in range(n):
            if node not in visited:
                dfs(node)
                components += 1

        return components

solution = Solution()

assert solution.countComponents(
    5,
    [[0,1],[1,2],[3,4]]
) == 2  # Two disconnected groups

assert solution.countComponents(
    5,
    [[0,1],[1,2],[2,3],[3,4]]
) == 1  # Fully connected graph

assert solution.countComponents(
    4,
    []
) == 4  # No edges, every node isolated

assert solution.countComponents(
    1,
    []
) == 1  # Single node graph

assert solution.countComponents(
    6,
    [[0,1],[2,3],[4,5]]
) == 3  # Multiple disconnected pairs

assert solution.countComponents(
    7,
    [[0,1],[1,2],[3,4]]
) == 4  # Mixed connected and isolated nodes

assert solution.countComponents(
    5,
    [[0,1],[1,2],[2,0],[3,4]]
) == 2  # Graph containing a cycle

assert solution.countComponents(
    10,
    [[0,1],[1,2],[2,3],[4,5],[5,6],[7,8]]
) == 4  # Several component sizes
Test Why
[[0,1],[1,2],[3,4]] Basic disconnected graph
Chain connecting all nodes Verifies single component case
Empty edge list Ensures isolated nodes are counted
Single node graph Smallest valid input
Multiple disconnected pairs Verifies separate small components
Mixed connected and isolated nodes Tests combination scenarios
Graph with cycle Ensures visited logic prevents infinite recursion
Larger mixed graph Stresses traversal correctness

Edge Cases

Graph with No Edges

If edges is empty, every node is isolated and therefore forms its own connected component.

A buggy implementation might incorrectly return 0 because no traversal occurs through edges. The correct approach still iterates through all nodes individually. Since each node is unvisited, DFS starts once per node, resulting in n components.

Fully Connected Graph

When all nodes belong to one connected structure, the answer should be 1.

An incorrect implementation might accidentally double-count components if visited nodes are not tracked properly. The visited set guarantees that after the first DFS traversal completes, all remaining nodes are skipped.

Graph Containing Cycles

Cycles are common in undirected graphs.

For example:

0 - 1
|   |
2 - 3

Without proper visited tracking, DFS would recurse forever by repeatedly revisiting already explored nodes.

The implementation prevents this by checking:

if neighbor not in visited:

before recursing.

Presence of Isolated Nodes

Some nodes may not appear in the edges list at all.

For example:

n = 5
edges = [[0,1]]

Nodes 2, 3, and 4 are isolated.

A flawed solution that only iterates over adjacency list keys would miss these nodes entirely. The correct implementation loops through all nodes from 0 to n - 1, ensuring isolated nodes are counted properly.