LeetCode 924 - Minimize Malware Spread

In this problem, we are given an undirected graph represented as an adjacency matrix. Each node represents a computer in a network, and an edge between two nodes means those computers are directly connected. Some subset of nodes is initially infected with malware.

LeetCode Problem 924

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Depth-First Search, Breadth-First Search, Union-Find, Graph Theory

Solution

Problem Understanding

In this problem, we are given an undirected graph represented as an adjacency matrix. Each node represents a computer in a network, and an edge between two nodes means those computers are directly connected.

Some subset of nodes is initially infected with malware. Malware spreads through connections, meaning that if an infected node is connected to another node, the infection eventually spreads to that node as well. Because the graph is undirected and infection keeps propagating until no new nodes can be infected, every node inside a connected component becomes infected if that component contains at least one initially infected node.

We are allowed to remove exactly one node from the initial infected list before the malware starts spreading. The removed node is no longer initially infected, although it may still become infected later through another infected node in the same connected component.

Our goal is to choose the node whose removal minimizes the final number of infected nodes after the malware spread completes. If multiple choices produce the same minimum infection count, we must return the node with the smallest index.

The input graph has at most 300 nodes, which is small enough for graph traversal algorithms like DFS, BFS, or Union-Find. Since the graph is given as an adjacency matrix, scanning neighbors of a node requires O(n) time.

The key observation is that malware spreads independently inside each connected component. If a connected component contains:

  • No infected nodes, it remains clean.
  • One infected node, removing that infected node completely saves the component.
  • More than one infected node, removing only one of them does not help because the remaining infected nodes still infect the entire component.

Several edge cases are important:

  • Multiple infected nodes may exist in the same connected component.
  • A removed node may still become infected later.
  • Completely disconnected graphs must still follow the smallest index tie-breaking rule.
  • If no removal improves the situation, we return the smallest node index from initial.

Approaches

Brute Force Approach

A straightforward solution is to try removing every node in initial one by one.

For each candidate removal:

  1. Temporarily exclude that node from the infected set.
  2. Simulate malware spread using DFS or BFS.
  3. Count how many nodes become infected.
  4. Keep track of the removal producing the smallest final infection count.

This works because it directly follows the problem definition. However, it repeatedly performs graph traversals for every candidate node.

If there are k infected nodes and n total nodes, each simulation may take O(n²) time because the graph is stored as an adjacency matrix. Repeating this for all infected nodes gives roughly O(k * n²) complexity.

Although n <= 300 is not enormous, we can do better by exploiting connected component structure.

Optimal Approach

The optimal insight is that malware spread is determined entirely by connected components.

Suppose a connected component has size S.

  • If the component contains exactly one initially infected node, removing that node prevents infection of all S nodes.
  • If the component contains multiple initially infected nodes, removing one does not matter because the others still infect the whole component.

Therefore, we only care about components with exactly one infected node.

The algorithm becomes:

  1. Find all connected components.
  2. Compute:
  • The size of each component.
  • How many infected nodes each component contains.
  1. For every infected node:
  • If its component contains exactly one infected node, removing it saves the entire component size.
  1. Choose the node that saves the most nodes.
  2. Break ties by smaller index.

This avoids repeated infection simulations.

Approach Time Complexity Space Complexity Notes
Brute Force O(k * n²) O(n) Simulate malware spread for every possible removal
Optimal O(n²) O(n) Uses connected components to compute impact directly

Algorithm Walkthrough

Step 1: Find Connected Components

We traverse the graph using DFS.

Each node belongs to exactly one connected component. During traversal, we assign every node a component ID.

We also compute the size of each component because component size determines how many nodes can potentially be saved.

An adjacency matrix is used, so for each node we scan all n possible neighbors.

Step 2: Record Component Sizes

While performing DFS, we count how many nodes belong to the current component.

We store:

  • component_id[node], which component each node belongs to
  • component_size[id], number of nodes in that component

This information lets us later determine how valuable removing an infected node would be.

Step 3: Count Infected Nodes Per Component

We iterate through the initial array.

For every infected node:

  1. Find its component ID.
  2. Increment the infected count for that component.

Now we know whether a component has:

  • Exactly one infected node
  • Multiple infected nodes

This distinction is the core of the optimization.

Step 4: Evaluate Each Candidate Removal

We sort initial so tie-breaking becomes easier.

For each infected node:

  1. Find its component.
  2. Check how many infected nodes exist in that component.

If the count is exactly 1, removing this node prevents infection of the entire component. The number of saved nodes equals the component size.

If the count is greater than 1, removing the node changes nothing because other infected nodes still spread malware.

We keep track of:

  • The best node to remove
  • The maximum number of saved nodes

Step 5: Apply Tie Breaking

If two nodes save the same number of nodes, we return the smaller index.

Sorting initial beforehand naturally guarantees this behavior because we process smaller indices first.

Why it works

The algorithm works because malware spreads independently inside connected components. A component becomes fully infected if at least one infected node remains inside it.

Therefore:

  • Components with multiple infected nodes cannot be saved by removing only one node.
  • Components with exactly one infected node can be completely protected by removing that node.

Choosing the node that uniquely infects the largest component minimizes the total final infections.

Python Solution

from typing import List

class Solution:
    def minMalwareSpread(self, graph: List[List[int]], initial: List[int]) -> int:
        n = len(graph)

        component_id = [-1] * n
        component_size = []

        def dfs(node: int, current_id: int) -> int:
            component_id[node] = current_id
            size = 1

            for neighbor in range(n):
                if (
                    graph[node][neighbor] == 1
                    and component_id[neighbor] == -1
                ):
                    size += dfs(neighbor, current_id)

            return size

        current_component = 0

        for node in range(n):
            if component_id[node] == -1:
                size = dfs(node, current_component)
                component_size.append(size)
                current_component += 1

        infected_count = [0] * current_component

        for node in initial:
            infected_count[component_id[node]] += 1

        initial.sort()

        best_node = initial[0]
        max_saved = -1

        for node in initial:
            comp = component_id[node]

            if infected_count[comp] == 1:
                saved = component_size[comp]

                if saved > max_saved:
                    max_saved = saved
                    best_node = node

        return best_node

The implementation begins by creating arrays to track component membership and component sizes.

The DFS function assigns all reachable nodes to the same component ID while counting the component size. Since the graph is represented as an adjacency matrix, the DFS checks every possible neighbor from 0 to n - 1.

After all connected components are identified, we count how many initially infected nodes belong to each component.

Sorting initial ensures that if multiple nodes save the same number of vertices, the smallest index is chosen automatically.

Finally, we evaluate each infected node. Only nodes that are the sole infected member of their component can save any nodes. The node that saves the largest component is returned.

Go Solution

package main

import "sort"

func minMalwareSpread(graph [][]int, initial []int) int {
	n := len(graph)

	componentID := make([]int, n)
	for i := 0; i < n; i++ {
		componentID[i] = -1
	}

	componentSize := []int{}

	var dfs func(node int, currentID int) int

	dfs = func(node int, currentID int) int {
		componentID[node] = currentID
		size := 1

		for neighbor := 0; neighbor < n; neighbor++ {
			if graph[node][neighbor] == 1 && componentID[neighbor] == -1 {
				size += dfs(neighbor, currentID)
			}
		}

		return size
	}

	currentComponent := 0

	for node := 0; node < n; node++ {
		if componentID[node] == -1 {
			size := dfs(node, currentComponent)
			componentSize = append(componentSize, size)
			currentComponent++
		}
	}

	infectedCount := make([]int, currentComponent)

	for _, node := range initial {
		infectedCount[componentID[node]]++
	}

	sort.Ints(initial)

	bestNode := initial[0]
	maxSaved := -1

	for _, node := range initial {
		comp := componentID[node]

		if infectedCount[comp] == 1 {
			saved := componentSize[comp]

			if saved > maxSaved {
				maxSaved = saved
				bestNode = node
			}
		}
	}

	return bestNode
}

The Go implementation closely mirrors the Python version.

Slices are used instead of dynamic Python lists. The DFS is implemented as a recursive closure so it can access outer variables naturally.

Go requires explicit initialization of componentID to -1, since integer slices default to 0.

Sorting is handled using sort.Ints(initial).

Because the problem constraints are small, recursion depth is safe in Go for this problem size.

Worked Examples

Example 1

graph = [[1,1,0],
         [1,1,0],
         [0,0,1]]

initial = [0,1]

Step 1: Find Components

Nodes 0 and 1 are connected.

Node 2 is isolated.

Component ID Nodes Size
0 [0,1] 2
1 [2] 1

Step 2: Count Infected Nodes

Both 0 and 1 belong to component 0.

Component Infected Count
0 2
1 0

Step 3: Evaluate Removals

Node Removed Component Unique Infection? Saved Nodes
0 0 No 0
1 0 No 0

Neither removal changes the outcome because the other infected node still spreads malware.

Tie breaking chooses the smaller index.

Result:

0

Example 2

graph = [[1,0,0],
         [0,1,0],
         [0,0,1]]

initial = [0,2]

Step 1: Find Components

Each node is isolated.

Component ID Nodes Size
0 [0] 1
1 [1] 1
2 [2] 1

Step 2: Count Infected Nodes

Component Infected Count
0 1
1 0
2 1

Step 3: Evaluate Removals

Node Removed Saved Nodes
0 1
2 1

Both save one node.

Tie breaking chooses 0.

Result:

0

Example 3

graph = [[1,1,1],
         [1,1,1],
         [1,1,1]]

initial = [1,2]

Step 1: Find Components

All nodes belong to one component.

Component ID Nodes Size
0 [0,1,2] 3

Step 2: Count Infected Nodes

Component Infected Count
0 2

Step 3: Evaluate Removals

Node Removed Unique Infection? Saved Nodes
1 No 0
2 No 0

Neither removal helps.

Tie breaking chooses the smaller index.

Result:

1

Complexity Analysis

Measure Complexity Explanation
Time O(n²) DFS scans the adjacency matrix
Space O(n) Component arrays and recursion stack

The graph uses an adjacency matrix representation, so exploring neighbors of a node requires scanning all n entries in its row. Across all DFS traversals, this results in O(n²) total work.

The auxiliary storage consists mainly of component tracking arrays and recursion stack space, both proportional to the number of nodes.

Test Cases

from typing import List

class Solution:
    def minMalwareSpread(self, graph: List[List[int]], initial: List[int]) -> int:
        n = len(graph)

        component_id = [-1] * n
        component_size = []

        def dfs(node: int, current_id: int) -> int:
            component_id[node] = current_id
            size = 1

            for neighbor in range(n):
                if (
                    graph[node][neighbor] == 1
                    and component_id[neighbor] == -1
                ):
                    size += dfs(neighbor, current_id)

            return size

        current_component = 0

        for node in range(n):
            if component_id[node] == -1:
                size = dfs(node, current_component)
                component_size.append(size)
                current_component += 1

        infected_count = [0] * current_component

        for node in initial:
            infected_count[component_id[node]] += 1

        initial.sort()

        best_node = initial[0]
        max_saved = -1

        for node in initial:
            comp = component_id[node]

            if infected_count[comp] == 1:
                saved = component_size[comp]

                if saved > max_saved:
                    max_saved = saved
                    best_node = node

        return best_node

sol = Solution()

assert sol.minMalwareSpread(
    [[1,1,0],[1,1,0],[0,0,1]],
    [0,1]
) == 0  # example 1

assert sol.minMalwareSpread(
    [[1,0,0],[0,1,0],[0,0,1]],
    [0,2]
) == 0  # example 2

assert sol.minMalwareSpread(
    [[1,1,1],[1,1,1],[1,1,1]],
    [1,2]
) == 1  # example 3

assert sol.minMalwareSpread(
    [[1,1,0],[1,1,0],[0,0,1]],
    [0]
) == 0  # single infected node

assert sol.minMalwareSpread(
    [[1,0,0,0],
     [0,1,0,0],
     [0,0,1,0],
     [0,0,0,1]],
    [3,1]
) == 1  # disconnected graph with tie breaking

assert sol.minMalwareSpread(
    [[1,1,0,0],
     [1,1,1,0],
     [0,1,1,0],
     [0,0,0,1]],
    [0,3]
) == 0  # removing node 0 saves larger component

assert sol.minMalwareSpread(
    [[1,1,0,0],
     [1,1,1,0],
     [0,1,1,0],
     [0,0,0,1]],
    [0,1]
) == 0  # same component, removal does not help

assert sol.minMalwareSpread(
    [[1,0],[0,1]],
    [1]
) == 1  # minimum graph size

assert sol.minMalwareSpread(
    [[1,1,1,0],
     [1,1,1,0],
     [1,1,1,0],
     [0,0,0,1]],
    [0,1,3]
) == 3  # unique infected singleton component

assert sol.minMalwareSpread(
    [[1,1,0,0,0],
     [1,1,0,0,0],
     [0,0,1,1,0],
     [0,0,1,1,0],
     [0,0,0,0,1]],
    [0,2]
) == 0  # equal component sizes, choose smaller index
Test Why
Example 1 Multiple infected nodes in same component
Example 2 Completely disconnected graph
Example 3 Fully connected graph
Single infected node Entire component can be saved
Disconnected tie case Verifies smallest-index tie breaking
Larger component save Ensures optimal node selection
Same-component infections Removal provides no benefit
Minimum graph size Smallest valid input
Singleton component infected Unique infection saves isolated node
Equal component sizes Confirms deterministic tie resolution

Edge Cases

Multiple Infected Nodes in One Component

A common mistake is assuming that removing any infected node helps reduce spread. If a component contains multiple infected nodes, removing only one of them does not stop infection because the remaining infected nodes still spread malware throughout the component.

The implementation correctly handles this by counting infected nodes per component and only considering components with exactly one infected node.

Tie Breaking Between Multiple Valid Answers

Several removals may save the same number of nodes. The problem explicitly requires returning the smallest node index in that case.

The implementation sorts initial before processing. Since nodes are evaluated in ascending order, the first node achieving the maximum saved count is automatically the smallest valid answer.

Isolated Nodes

A node may form a component of size one. If that isolated node is infected and uniquely infected, removing it prevents any infection entirely within that component.

The DFS naturally treats isolated nodes as components of size one, so the algorithm handles this case without any special logic.