LeetCode 1548 - The Most Similar Path in a Graph

The problem gives us an undirected connected graph where each node represents a city, and each city has a three letter name. We are also given a target sequence of names called targetPath. Our goal is to construct a valid path through the graph such that: 1.

LeetCode Problem 1548

Difficulty: 🔴 Hard
Topics: Dynamic Programming, Graph Theory

Solution

LeetCode 1548, The Most Similar Path in a Graph

Problem Understanding

The problem gives us an undirected connected graph where each node represents a city, and each city has a three letter name. We are also given a target sequence of names called targetPath.

Our goal is to construct a valid path through the graph such that:

  1. The path length is exactly equal to the length of targetPath
  2. Consecutive cities in the path must be connected by a road
  3. The sequence of city names along the chosen path should be as similar as possible to targetPath

Similarity is measured using edit distance in a simplified form. At each position:

  • If the chosen city's name equals targetPath[i], the cost is 0
  • Otherwise, the cost is 1

The total edit distance is simply the number of mismatched positions.

We must return the actual sequence of node indices that minimizes this total mismatch count.

For example, suppose:

  • targetPath = ["ATL", "DXB", "HND", "LAX"]

and we choose nodes corresponding to:

  • ["ATL", "LAX", "HND", "LAX"]

Then the mismatch count is:

Position Target Chosen Match?
0 ATL ATL Yes
1 DXB LAX No
2 HND HND Yes
3 LAX LAX Yes

Total edit distance = 1.

The graph is guaranteed to be connected, which means there is always at least one valid path of any required length because we can revisit nodes multiple times.

The constraints are important:

  • n <= 100
  • targetPath.length <= 100

These limits are small enough for dynamic programming over positions and nodes.

A naive brute force enumeration of all paths would explode exponentially because each step may branch into many neighboring cities.

Important edge cases include:

  • Multiple cities can share the same name
  • The optimal path may revisit nodes many times
  • The graph may be dense or sparse
  • None of the city names may match the target at all
  • Multiple optimal answers may exist

The problem allows returning any minimum cost path.

Approaches

Brute Force Approach

A brute force solution would generate every possible valid path of length len(targetPath).

Starting from every node:

  1. Recursively try every neighboring node
  2. Build all possible paths of the required length
  3. Compute the mismatch count for each path
  4. Return the minimum one

This works because it exhaustively checks all valid possibilities.

However, it is far too slow.

If the average degree is d and the target length is m, then the number of paths is approximately:

$$O(n \cdot d^{m-1})$$

With m = 100, this becomes completely infeasible.

Optimal Dynamic Programming Approach

The key observation is that the problem has overlapping subproblems.

Suppose we define:

  • dp[i][city] = minimum edit distance for constructing a valid path of length i + 1 that ends at city

To compute this state:

  • We consider every previous neighboring city
  • Transition from the best previous state
  • Add the mismatch cost for the current position

This transforms the problem into shortest path style dynamic programming over:

  • path position
  • current node

We also store parent pointers so we can reconstruct the final path afterward.

This reduces the exponential search into polynomial time.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n * d^(m-1)) O(m) Enumerates all possible paths
Optimal DP O(m * (n + m_edges)) or O(m * m_edges) O(m * n) Dynamic programming over positions and nodes

Where:

  • m = len(targetPath)
  • m_edges = roads.length

Algorithm Walkthrough

Step 1: Build the Graph

Convert the road list into an adjacency list.

This allows efficient neighbor traversal during DP transitions.

For every road [u, v]:

  • add v to u's adjacency list
  • add u to v's adjacency list

Since the graph is undirected, both directions matter.

Step 2: Define the DP State

Let:

$$dp[i][city]$$

represent the minimum mismatch cost for a valid path:

  • using the first i + 1 positions
  • ending at city

We also maintain:

$$parent[i][city]$$

which stores the previous city used to reach this optimal state.

This is necessary for reconstructing the final path later.

Step 3: Initialize the First Layer

At position 0, we may start from any city.

The cost is:

  • 0 if names[city] == targetPath[0]
  • otherwise 1

So:

$$dp[0][city] = \begin{cases} 0 & \text{if names[city] matches targetPath[0]} \ 1 & \text{otherwise} \end{cases}$$

Step 4: Perform State Transitions

For each position i from 1 to m - 1:

For each current city curr:

  1. Compute mismatch cost:
  • 0 if names match
  • 1 otherwise
  1. Try every neighboring city prev
  2. Candidate cost becomes:

$$dp[i-1][prev] + mismatch$$

  1. Take the minimum among all valid predecessors
  2. Store both:
  • the minimum cost
  • the predecessor city

Step 5: Find the Best Final City

After filling the DP table:

  • inspect all cities at the last position
  • choose the city with minimum cost

This gives the endpoint of the optimal path.

Step 6: Reconstruct the Path

Using the parent table:

  1. Start from the best ending city
  2. Move backward position by position
  3. Recover all previous cities
  4. Reverse the result

The reconstructed sequence is the optimal path.

Why it works

The DP state guarantees that every state stores the minimum possible mismatch cost for paths ending at a specific city and position.

Every valid path of length i + 1 ending at curr must come from some neighboring city at position i - 1. By considering all such neighbors and choosing the minimum transition cost, we ensure optimal substructure.

Since all states are computed optimally and every transition is considered exactly once, the final reconstructed path must have minimum edit distance.

Python Solution

from typing import List

class Solution:
    def mostSimilar(
        self,
        n: int,
        roads: List[List[int]],
        names: List[str],
        targetPath: List[str]
    ) -> List[int]:

        target_length = len(targetPath)

        # Build adjacency list
        graph = [[] for _ in range(n)]

        for u, v in roads:
            graph[u].append(v)
            graph[v].append(u)

        INF = float('inf')

        # dp[i][city] = minimum cost to reach city at position i
        dp = [[INF] * n for _ in range(target_length)]

        # parent[i][city] = previous city used
        parent = [[-1] * n for _ in range(target_length)]

        # Initialize first position
        for city in range(n):
            dp[0][city] = 0 if names[city] == targetPath[0] else 1

        # Fill DP table
        for i in range(1, target_length):

            for curr_city in range(n):

                mismatch_cost = (
                    0
                    if names[curr_city] == targetPath[i]
                    else 1
                )

                for prev_city in graph[curr_city]:

                    candidate = (
                        dp[i - 1][prev_city]
                        + mismatch_cost
                    )

                    if candidate < dp[i][curr_city]:
                        dp[i][curr_city] = candidate
                        parent[i][curr_city] = prev_city

        # Find best ending city
        end_city = 0

        for city in range(1, n):
            if dp[target_length - 1][city] < dp[target_length - 1][end_city]:
                end_city = city

        # Reconstruct path
        path = [0] * target_length

        current = end_city

        for i in range(target_length - 1, -1, -1):
            path[i] = current
            current = parent[i][current]

        return path

The implementation begins by constructing an adjacency list representation of the graph. This is important because DP transitions depend on quickly enumerating neighboring cities.

The dp table stores the minimum mismatch cost for each position and ending city. Initially, every value is set to infinity so that any valid candidate becomes an improvement.

The first DP layer is initialized independently because the path may start at any city.

For every later position, the algorithm evaluates all possible predecessor cities reachable through a road. Each transition adds either 0 or 1 depending on whether the current city's name matches the target word.

Whenever a better transition is found, the DP value and corresponding parent pointer are updated.

After all states are processed, the algorithm selects the minimum cost city in the final layer and reconstructs the path using the parent table.

Go Solution

package main

import "math"

func mostSimilar(n int, roads [][]int, names []string, targetPath []string) []int {

	targetLength := len(targetPath)

	// Build adjacency list
	graph := make([][]int, n)

	for _, road := range roads {
		u := road[0]
		v := road[1]

		graph[u] = append(graph[u], v)
		graph[v] = append(graph[v], u)
	}

	inf := math.MaxInt32

	// dp[i][city] = minimum cost
	dp := make([][]int, targetLength)

	// parent[i][city] = previous city
	parent := make([][]int, targetLength)

	for i := 0; i < targetLength; i++ {
		dp[i] = make([]int, n)
		parent[i] = make([]int, n)

		for city := 0; city < n; city++ {
			dp[i][city] = inf
			parent[i][city] = -1
		}
	}

	// Initialize first row
	for city := 0; city < n; city++ {
		if names[city] == targetPath[0] {
			dp[0][city] = 0
		} else {
			dp[0][city] = 1
		}
	}

	// Fill DP
	for i := 1; i < targetLength; i++ {

		for currCity := 0; currCity < n; currCity++ {

			mismatchCost := 1

			if names[currCity] == targetPath[i] {
				mismatchCost = 0
			}

			for _, prevCity := range graph[currCity] {

				candidate := dp[i-1][prevCity] + mismatchCost

				if candidate < dp[i][currCity] {
					dp[i][currCity] = candidate
					parent[i][currCity] = prevCity
				}
			}
		}
	}

	// Find best ending city
	endCity := 0

	for city := 1; city < n; city++ {
		if dp[targetLength-1][city] < dp[targetLength-1][endCity] {
			endCity = city
		}
	}

	// Reconstruct path
	path := make([]int, targetLength)

	current := endCity

	for i := targetLength - 1; i >= 0; i-- {
		path[i] = current
		current = parent[i][current]
	}

	return path
}

The Go implementation follows the same DP logic as the Python solution.

Slices are used for adjacency lists and DP tables. Since Go does not have built in infinity values for integers, math.MaxInt32 is used as a large sentinel value.

Parent pointers are initialized to -1 to indicate no predecessor.

Go arrays and slices are zero initialized automatically, so explicit initialization is only necessary for the infinity values and parent states.

Worked Examples

Example 1

Input

n = 5
roads = [[0,2],[0,3],[1,2],[1,3],[1,4],[2,4]]
names = ["ATL","PEK","LAX","DXB","HND"]
targetPath = ["ATL","DXB","HND","LAX"]

Graph

City Neighbors
0 2, 3
1 2, 3, 4
2 0, 1, 4
3 0, 1
4 1, 2

DP Initialization

Target word = "ATL"

City Name Cost
0 ATL 0
1 PEK 1
2 LAX 1
3 DXB 1
4 HND 1

Position 1, target = "DXB"

For city 3:

  • name matches "DXB"
  • mismatch cost = 0

Possible predecessors:

  • city 0 → cost = 0 + 0 = 0
  • city 1 → cost = 1 + 0 = 1

Best = 0 from city 0.

Position 2, target = "HND"

For city 4:

  • name matches "HND"
  • mismatch cost = 0

Possible predecessors:

  • city 1
  • city 2

Best predecessor gives total cost 0.

Position 3, target = "LAX"

For city 2:

  • name matches "LAX"
  • mismatch cost = 0

Best predecessor becomes city 4.

Final path reconstruction:

2 <- 4 <- 2 <- 0

Reverse:

[0, 2, 4, 2]

Total edit distance = 1.

Example 2

Input

targetPath = ["ABC","DEF","GHI","JKL","MNO","PQR","STU","VWX"]

No city name matches any target word.

Therefore every position contributes cost 1.

Total cost:

8

Any valid path of length 8 is optimal.

One valid answer:

[0,1,0,1,0,1,0,1]

Example 3

Input

names = ["ATL","PEK","LAX","ATL","DXB","HND"]
targetPath = ["ATL","DXB","HND","DXB","ATL","LAX","PEK"]

The graph forms a chain:

0 - 1 - 2 - 3 - 4 - 5

The only perfect matching sequence is:

3 -> 4 -> 5 -> 4 -> 3 -> 2 -> 1

Corresponding names:

ATL -> DXB -> HND -> DXB -> ATL -> LAX -> PEK

Edit distance:

0

Complexity Analysis

Measure Complexity Explanation
Time O(m * E) Each DP layer processes all graph edges
Space O(m * n) DP and parent tables

Where:

  • m = len(targetPath)
  • E = roads.length

The algorithm processes each target position independently. For every position, transitions examine all neighboring edges.

Since the graph contains at most n * (n - 1) / 2 edges and n <= 100, this runs comfortably within limits.

The space complexity comes primarily from storing:

  • the DP table
  • the parent reconstruction table

Both contain m * n entries.

Test Cases

from typing import List

def is_valid_path(path: List[int], roads: List[List[int]]) -> bool:
    edges = set()

    for u, v in roads:
        edges.add((u, v))
        edges.add((v, u))

    for i in range(len(path) - 1):
        if (path[i], path[i + 1]) not in edges:
            return False

    return True

sol = Solution()

# Example 1
n = 5
roads = [[0,2],[0,3],[1,2],[1,3],[1,4],[2,4]]
names = ["ATL","PEK","LAX","DXB","HND"]
target = ["ATL","DXB","HND","LAX"]

result = sol.mostSimilar(n, roads, names, target)

assert len(result) == 4  # correct length
assert is_valid_path(result, roads)  # valid graph path

# Example 2
n = 4
roads = [[1,0],[2,0],[3,0],[2,1],[3,1],[3,2]]
names = ["ATL","PEK","LAX","DXB"]
target = ["ABC","DEF","GHI","JKL","MNO","PQR","STU","VWX"]

result = sol.mostSimilar(n, roads, names, target)

assert len(result) == 8  # long target path
assert is_valid_path(result, roads)  # still valid

# Example 3
n = 6
roads = [[0,1],[1,2],[2,3],[3,4],[4,5]]
names = ["ATL","PEK","LAX","ATL","DXB","HND"]
target = ["ATL","DXB","HND","DXB","ATL","LAX","PEK"]

result = sol.mostSimilar(n, roads, names, target)

assert result == [3,4,5,4,3,2,1]  # unique optimal path

# Minimum graph size
n = 2
roads = [[0,1]]
names = ["AAA", "BBB"]
target = ["AAA"]

result = sol.mostSimilar(n, roads, names, target)

assert result == [0]  # single node path

# Repeated node usage
n = 2
roads = [[0,1]]
names = ["AAA", "BBB"]
target = ["AAA", "BBB", "AAA", "BBB"]

result = sol.mostSimilar(n, roads, names, target)

assert result == [0,1,0,1]  # alternating revisits

# Multiple identical names
n = 3
roads = [[0,1],[1,2]]
names = ["AAA", "AAA", "AAA"]
target = ["AAA", "AAA", "AAA"]

result = sol.mostSimilar(n, roads, names, target)

assert len(result) == 3
assert is_valid_path(result, roads)

# Dense graph
n = 4
roads = [
    [0,1],[0,2],[0,3],
    [1,2],[1,3],[2,3]
]
names = ["AAB","AAC","AAD","AAE"]
target = ["AAB","AAC","AAD","AAE"]

result = sol.mostSimilar(n, roads, names, target)

assert len(result) == 4
assert is_valid_path(result, roads)

Test Summary

Test Why
Example 1 Verifies normal DP behavior
Example 2 Verifies all mismatch scenario
Example 3 Verifies exact match reconstruction
Minimum graph size Tests smallest valid input
Repeated node usage Ensures revisits work correctly
Multiple identical names Verifies duplicate names handling
Dense graph Tests many transition choices

Edge Cases

One important edge case occurs when no city name matches any target string. In this situation, every position contributes a mismatch cost of 1. A naive implementation might incorrectly assume at least one matching node exists and fail to initialize states properly. This implementation handles the case naturally because mismatch cost is computed independently for every node.

Another tricky scenario is when multiple cities share the same name. The algorithm cannot simply map names to unique nodes. Instead, the DP state is indexed by actual city index, not by city name. This ensures different graph positions remain distinguishable even if their names are identical.

A third important edge case involves repeated node visits. Since revisiting is allowed, the optimal path may bounce between a small number of cities many times. Some graph algorithms mistakenly mark nodes as visited globally, which would incorrectly eliminate valid solutions. This implementation never uses a visited set, so repeated traversal is fully supported.

A fourth subtle edge case is path reconstruction. If parent pointers are not stored correctly during DP transitions, the algorithm may compute the correct minimum cost but fail to recover the actual path. The dedicated parent table ensures every optimal transition is remembered and can later be reconstructed exactly.