LeetCode 1857 - Largest Color Value in a Directed Graph

The problem asks us to find the largest color value along any path in a directed graph of n nodes, where each node has a color represented by a lowercase English letter. The graph may contain cycles, in which case the function should return -1.

LeetCode Problem 1857

Difficulty: 🔴 Hard
Topics: Hash Table, String, Dynamic Programming, Graph Theory, Topological Sort, Memoization, Counting

Solution

Problem Understanding

The problem asks us to find the largest color value along any path in a directed graph of n nodes, where each node has a color represented by a lowercase English letter. The graph may contain cycles, in which case the function should return -1.

The input consists of a string colors where colors[i] is the color of node i, and a 2D array edges where each element [a, b] indicates a directed edge from node a to node b. The output is a single integer representing the maximum number of nodes in a path that share the most frequently occurring color on that path.

The constraints indicate that n and m can each be up to 10^5, so any brute-force approach that explores all paths explicitly would be too slow. The graph is not guaranteed to be acyclic, so cycle detection is crucial. Edge cases include single-node graphs, self-loops, and disconnected components.

Approaches

The brute-force approach would involve enumerating all possible paths in the graph, counting the frequency of each color along each path, and keeping track of the maximum frequency. While correct, this is infeasible because the number of paths can grow exponentially with the number of nodes due to cycles and branching.

The key observation for an optimal solution is that this problem can be treated as a topological sorting problem with dynamic programming. If we can traverse the graph in topological order, we can propagate color frequencies along the paths. Each node maintains a frequency array of size 26 representing the maximum occurrences of each color along all paths ending at that node. By updating the frequency arrays from parent to child, we can compute the largest color value efficiently. To detect cycles, we can track in-degrees and use a Kahn’s algorithm style topological sort.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n) O(n) Enumerate all paths, infeasible for n ~ 10^5
Optimal O(n + m) O(n * 26) Topological sort + dynamic programming with color frequency propagation

Algorithm Walkthrough

  1. Graph Representation: Build an adjacency list from the edges. For each node, maintain a list of its outgoing neighbors. Also, calculate the in-degree of each node.
  2. Topological Sort Initialization: Initialize a queue with all nodes having zero in-degree. These nodes have no dependencies and can be processed first.
  3. Dynamic Programming Table: For each node, maintain a count array of size 26, representing the maximum frequency of each color along any path ending at that node. Initialize count[i][color_index] = 1 for each node.
  4. Processing Nodes: While the queue is not empty, pop a node u. For each neighbor v of u, update v’s count array such that for each color c, count[v][c] = max(count[v][c], count[u][c] + (colors[v] == c)). Decrement the in-degree of v. If in-degree becomes zero, enqueue v.
  5. Cycle Detection: Keep a processed node counter. If at the end of the BFS, the number of processed nodes is less than n, a cycle exists, return -1.
  6. Result Calculation: During propagation, maintain a global maximum of count[u][c] across all nodes and colors. Return this maximum as the largest color value.

Why it works: By using a topological order, we guarantee that when processing a node, all paths leading to it have already been considered. The DP array ensures that each node’s color contribution is correctly propagated along paths. Cycle detection ensures correctness in graphs that are not DAGs.

Python Solution

from collections import deque, defaultdict
from typing import List

class Solution:
    def largestPathValue(self, colors: str, edges: List[List[int]]) -> int:
        n = len(colors)
        graph = defaultdict(list)
        indegree = [0] * n
        
        # Build graph
        for u, v in edges:
            graph[u].append(v)
            indegree[v] += 1
        
        # DP table for each node: count of each color along path ending at node
        dp = [[0] * 26 for _ in range(n)]
        for i in range(n):
            dp[i][ord(colors[i]) - ord('a')] = 1
        
        # Topological sort
        queue = deque([i for i in range(n) if indegree[i] == 0])
        processed = 0
        max_color_value = 0
        
        while queue:
            u = queue.popleft()
            processed += 1
            for v in graph[u]:
                for c in range(26):
                    increment = 1 if c == ord(colors[v]) - ord('a') else 0
                    dp[v][c] = max(dp[v][c], dp[u][c] + increment)
                indegree[v] -= 1
                if indegree[v] == 0:
                    queue.append(v)
            max_color_value = max(max_color_value, max(dp[u]))
        
        return max_color_value if processed == n else -1

The Python implementation builds the graph, initializes the DP table for color counts, and processes nodes in topological order. For each node, the DP array is updated by considering all parent nodes and incrementing the count if the node’s color matches. Cycle detection is implemented by comparing the number of processed nodes to n.

Go Solution

package main

func largestPathValue(colors string, edges [][]int) int {
    n := len(colors)
    graph := make([][]int, n)
    indegree := make([]int, n)
    
    for _, edge := range edges {
        u, v := edge[0], edge[1]
        graph[u] = append(graph[u], v)
        indegree[v]++
    }
    
    dp := make([][26]int, n)
    for i := 0; i < n; i++ {
        dp[i][colors[i]-'a'] = 1
    }
    
    queue := []int{}
    for i := 0; i < n; i++ {
        if indegree[i] == 0 {
            queue = append(queue, i)
        }
    }
    
    processed := 0
    maxColor := 0
    
    for len(queue) > 0 {
        u := queue[0]
        queue = queue[1:]
        processed++
        for _, v := range graph[u] {
            for c := 0; c < 26; c++ {
                increment := 0
                if byte(c+'a') == colors[v] {
                    increment = 1
                }
                if dp[u][c]+increment > dp[v][c] {
                    dp[v][c] = dp[u][c] + increment
                }
            }
            indegree[v]--
            if indegree[v] == 0 {
                queue = append(queue, v)
            }
        }
        for c := 0; c < 26; c++ {
            if dp[u][c] > maxColor {
                maxColor = dp[u][c]
            }
        }
    }
    
    if processed < n {
        return -1
    }
    return maxColor
}

The Go version mirrors the Python approach, using slices and arrays instead of Python lists and dictionaries. The main differences are explicit slice management for the queue and careful handling of byte-to-int conversion for colors.

Worked Examples

Example 1: colors = "abaca", edges = [[0,1],[0,2],[2,3],[3,4]]

Node DP Array (max frequency per color) Queue State Max Color Value
0 [1,0,...] [1,2] 1
1 [1,1,...] [] 1
2 [2,0,...] [3] 2
3 [2,0,1,...] [4] 2
4 [3,0,1,...] [] 3

Output: 3

Example 2: colors = "a", edges = [[0,0]]

Cycle detected, output: -1.

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Each node is processed once, each edge is traversed once, 26-color DP array updates are constant
Space O(n * 26 + m) DP array of size 26 per node, adjacency list for edges

The algorithm is linear with respect to the number of nodes and edges, which is efficient for the problem constraints.

Test Cases

# test cases
assert Solution().largestPathValue("abaca", [[0,1],[0,2],[2,3],[3,4]]) == 3  # Example 1
assert Solution().largestPathValue("a", [[0,0]]) == -1  # Self-loop cycle
assert Solution().largestPathValue("a", []) == 1  # Single node,