LeetCode 1548 - The Most Similar Path in a Graph
The problem gives us an undirected connected graph where each node represents a city, and each city has a three letter name. We are also given a target sequence of names called targetPath. Our goal is to construct a valid path through the graph such that: 1.
Difficulty: 🔴 Hard
Topics: Dynamic Programming, Graph Theory
Solution
LeetCode 1548, The Most Similar Path in a Graph
Problem Understanding
The problem gives us an undirected connected graph where each node represents a city, and each city has a three letter name. We are also given a target sequence of names called targetPath.
Our goal is to construct a valid path through the graph such that:
- The path length is exactly equal to the length of
targetPath - Consecutive cities in the path must be connected by a road
- The sequence of city names along the chosen path should be as similar as possible to
targetPath
Similarity is measured using edit distance in a simplified form. At each position:
- If the chosen city's name equals
targetPath[i], the cost is0 - Otherwise, the cost is
1
The total edit distance is simply the number of mismatched positions.
We must return the actual sequence of node indices that minimizes this total mismatch count.
For example, suppose:
targetPath = ["ATL", "DXB", "HND", "LAX"]
and we choose nodes corresponding to:
["ATL", "LAX", "HND", "LAX"]
Then the mismatch count is:
| Position | Target | Chosen | Match? |
|---|---|---|---|
| 0 | ATL | ATL | Yes |
| 1 | DXB | LAX | No |
| 2 | HND | HND | Yes |
| 3 | LAX | LAX | Yes |
Total edit distance = 1.
The graph is guaranteed to be connected, which means there is always at least one valid path of any required length because we can revisit nodes multiple times.
The constraints are important:
n <= 100targetPath.length <= 100
These limits are small enough for dynamic programming over positions and nodes.
A naive brute force enumeration of all paths would explode exponentially because each step may branch into many neighboring cities.
Important edge cases include:
- Multiple cities can share the same name
- The optimal path may revisit nodes many times
- The graph may be dense or sparse
- None of the city names may match the target at all
- Multiple optimal answers may exist
The problem allows returning any minimum cost path.
Approaches
Brute Force Approach
A brute force solution would generate every possible valid path of length len(targetPath).
Starting from every node:
- Recursively try every neighboring node
- Build all possible paths of the required length
- Compute the mismatch count for each path
- Return the minimum one
This works because it exhaustively checks all valid possibilities.
However, it is far too slow.
If the average degree is d and the target length is m, then the number of paths is approximately:
$$O(n \cdot d^{m-1})$$
With m = 100, this becomes completely infeasible.
Optimal Dynamic Programming Approach
The key observation is that the problem has overlapping subproblems.
Suppose we define:
dp[i][city]= minimum edit distance for constructing a valid path of lengthi + 1that ends atcity
To compute this state:
- We consider every previous neighboring city
- Transition from the best previous state
- Add the mismatch cost for the current position
This transforms the problem into shortest path style dynamic programming over:
- path position
- current node
We also store parent pointers so we can reconstruct the final path afterward.
This reduces the exponential search into polynomial time.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * d^(m-1)) | O(m) | Enumerates all possible paths |
| Optimal DP | O(m * (n + m_edges)) or O(m * m_edges) | O(m * n) | Dynamic programming over positions and nodes |
Where:
m = len(targetPath)m_edges = roads.length
Algorithm Walkthrough
Step 1: Build the Graph
Convert the road list into an adjacency list.
This allows efficient neighbor traversal during DP transitions.
For every road [u, v]:
- add
vtou's adjacency list - add
utov's adjacency list
Since the graph is undirected, both directions matter.
Step 2: Define the DP State
Let:
$$dp[i][city]$$
represent the minimum mismatch cost for a valid path:
- using the first
i + 1positions - ending at
city
We also maintain:
$$parent[i][city]$$
which stores the previous city used to reach this optimal state.
This is necessary for reconstructing the final path later.
Step 3: Initialize the First Layer
At position 0, we may start from any city.
The cost is:
0ifnames[city] == targetPath[0]- otherwise
1
So:
$$dp[0][city] = \begin{cases} 0 & \text{if names[city] matches targetPath[0]} \ 1 & \text{otherwise} \end{cases}$$
Step 4: Perform State Transitions
For each position i from 1 to m - 1:
For each current city curr:
- Compute mismatch cost:
0if names match1otherwise
- Try every neighboring city
prev - Candidate cost becomes:
$$dp[i-1][prev] + mismatch$$
- Take the minimum among all valid predecessors
- Store both:
- the minimum cost
- the predecessor city
Step 5: Find the Best Final City
After filling the DP table:
- inspect all cities at the last position
- choose the city with minimum cost
This gives the endpoint of the optimal path.
Step 6: Reconstruct the Path
Using the parent table:
- Start from the best ending city
- Move backward position by position
- Recover all previous cities
- Reverse the result
The reconstructed sequence is the optimal path.
Why it works
The DP state guarantees that every state stores the minimum possible mismatch cost for paths ending at a specific city and position.
Every valid path of length i + 1 ending at curr must come from some neighboring city at position i - 1. By considering all such neighbors and choosing the minimum transition cost, we ensure optimal substructure.
Since all states are computed optimally and every transition is considered exactly once, the final reconstructed path must have minimum edit distance.
Python Solution
from typing import List
class Solution:
def mostSimilar(
self,
n: int,
roads: List[List[int]],
names: List[str],
targetPath: List[str]
) -> List[int]:
target_length = len(targetPath)
# Build adjacency list
graph = [[] for _ in range(n)]
for u, v in roads:
graph[u].append(v)
graph[v].append(u)
INF = float('inf')
# dp[i][city] = minimum cost to reach city at position i
dp = [[INF] * n for _ in range(target_length)]
# parent[i][city] = previous city used
parent = [[-1] * n for _ in range(target_length)]
# Initialize first position
for city in range(n):
dp[0][city] = 0 if names[city] == targetPath[0] else 1
# Fill DP table
for i in range(1, target_length):
for curr_city in range(n):
mismatch_cost = (
0
if names[curr_city] == targetPath[i]
else 1
)
for prev_city in graph[curr_city]:
candidate = (
dp[i - 1][prev_city]
+ mismatch_cost
)
if candidate < dp[i][curr_city]:
dp[i][curr_city] = candidate
parent[i][curr_city] = prev_city
# Find best ending city
end_city = 0
for city in range(1, n):
if dp[target_length - 1][city] < dp[target_length - 1][end_city]:
end_city = city
# Reconstruct path
path = [0] * target_length
current = end_city
for i in range(target_length - 1, -1, -1):
path[i] = current
current = parent[i][current]
return path
The implementation begins by constructing an adjacency list representation of the graph. This is important because DP transitions depend on quickly enumerating neighboring cities.
The dp table stores the minimum mismatch cost for each position and ending city. Initially, every value is set to infinity so that any valid candidate becomes an improvement.
The first DP layer is initialized independently because the path may start at any city.
For every later position, the algorithm evaluates all possible predecessor cities reachable through a road. Each transition adds either 0 or 1 depending on whether the current city's name matches the target word.
Whenever a better transition is found, the DP value and corresponding parent pointer are updated.
After all states are processed, the algorithm selects the minimum cost city in the final layer and reconstructs the path using the parent table.
Go Solution
package main
import "math"
func mostSimilar(n int, roads [][]int, names []string, targetPath []string) []int {
targetLength := len(targetPath)
// Build adjacency list
graph := make([][]int, n)
for _, road := range roads {
u := road[0]
v := road[1]
graph[u] = append(graph[u], v)
graph[v] = append(graph[v], u)
}
inf := math.MaxInt32
// dp[i][city] = minimum cost
dp := make([][]int, targetLength)
// parent[i][city] = previous city
parent := make([][]int, targetLength)
for i := 0; i < targetLength; i++ {
dp[i] = make([]int, n)
parent[i] = make([]int, n)
for city := 0; city < n; city++ {
dp[i][city] = inf
parent[i][city] = -1
}
}
// Initialize first row
for city := 0; city < n; city++ {
if names[city] == targetPath[0] {
dp[0][city] = 0
} else {
dp[0][city] = 1
}
}
// Fill DP
for i := 1; i < targetLength; i++ {
for currCity := 0; currCity < n; currCity++ {
mismatchCost := 1
if names[currCity] == targetPath[i] {
mismatchCost = 0
}
for _, prevCity := range graph[currCity] {
candidate := dp[i-1][prevCity] + mismatchCost
if candidate < dp[i][currCity] {
dp[i][currCity] = candidate
parent[i][currCity] = prevCity
}
}
}
}
// Find best ending city
endCity := 0
for city := 1; city < n; city++ {
if dp[targetLength-1][city] < dp[targetLength-1][endCity] {
endCity = city
}
}
// Reconstruct path
path := make([]int, targetLength)
current := endCity
for i := targetLength - 1; i >= 0; i-- {
path[i] = current
current = parent[i][current]
}
return path
}
The Go implementation follows the same DP logic as the Python solution.
Slices are used for adjacency lists and DP tables. Since Go does not have built in infinity values for integers, math.MaxInt32 is used as a large sentinel value.
Parent pointers are initialized to -1 to indicate no predecessor.
Go arrays and slices are zero initialized automatically, so explicit initialization is only necessary for the infinity values and parent states.
Worked Examples
Example 1
Input
n = 5
roads = [[0,2],[0,3],[1,2],[1,3],[1,4],[2,4]]
names = ["ATL","PEK","LAX","DXB","HND"]
targetPath = ["ATL","DXB","HND","LAX"]
Graph
| City | Neighbors |
|---|---|
| 0 | 2, 3 |
| 1 | 2, 3, 4 |
| 2 | 0, 1, 4 |
| 3 | 0, 1 |
| 4 | 1, 2 |
DP Initialization
Target word = "ATL"
| City | Name | Cost |
|---|---|---|
| 0 | ATL | 0 |
| 1 | PEK | 1 |
| 2 | LAX | 1 |
| 3 | DXB | 1 |
| 4 | HND | 1 |
Position 1, target = "DXB"
For city 3:
- name matches
"DXB" - mismatch cost = 0
Possible predecessors:
- city 0 → cost = 0 + 0 = 0
- city 1 → cost = 1 + 0 = 1
Best = 0 from city 0.
Position 2, target = "HND"
For city 4:
- name matches
"HND" - mismatch cost = 0
Possible predecessors:
- city 1
- city 2
Best predecessor gives total cost 0.
Position 3, target = "LAX"
For city 2:
- name matches
"LAX" - mismatch cost = 0
Best predecessor becomes city 4.
Final path reconstruction:
2 <- 4 <- 2 <- 0
Reverse:
[0, 2, 4, 2]
Total edit distance = 1.
Example 2
Input
targetPath = ["ABC","DEF","GHI","JKL","MNO","PQR","STU","VWX"]
No city name matches any target word.
Therefore every position contributes cost 1.
Total cost:
8
Any valid path of length 8 is optimal.
One valid answer:
[0,1,0,1,0,1,0,1]
Example 3
Input
names = ["ATL","PEK","LAX","ATL","DXB","HND"]
targetPath = ["ATL","DXB","HND","DXB","ATL","LAX","PEK"]
The graph forms a chain:
0 - 1 - 2 - 3 - 4 - 5
The only perfect matching sequence is:
3 -> 4 -> 5 -> 4 -> 3 -> 2 -> 1
Corresponding names:
ATL -> DXB -> HND -> DXB -> ATL -> LAX -> PEK
Edit distance:
0
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(m * E) | Each DP layer processes all graph edges |
| Space | O(m * n) | DP and parent tables |
Where:
m = len(targetPath)E = roads.length
The algorithm processes each target position independently. For every position, transitions examine all neighboring edges.
Since the graph contains at most n * (n - 1) / 2 edges and n <= 100, this runs comfortably within limits.
The space complexity comes primarily from storing:
- the DP table
- the parent reconstruction table
Both contain m * n entries.
Test Cases
from typing import List
def is_valid_path(path: List[int], roads: List[List[int]]) -> bool:
edges = set()
for u, v in roads:
edges.add((u, v))
edges.add((v, u))
for i in range(len(path) - 1):
if (path[i], path[i + 1]) not in edges:
return False
return True
sol = Solution()
# Example 1
n = 5
roads = [[0,2],[0,3],[1,2],[1,3],[1,4],[2,4]]
names = ["ATL","PEK","LAX","DXB","HND"]
target = ["ATL","DXB","HND","LAX"]
result = sol.mostSimilar(n, roads, names, target)
assert len(result) == 4 # correct length
assert is_valid_path(result, roads) # valid graph path
# Example 2
n = 4
roads = [[1,0],[2,0],[3,0],[2,1],[3,1],[3,2]]
names = ["ATL","PEK","LAX","DXB"]
target = ["ABC","DEF","GHI","JKL","MNO","PQR","STU","VWX"]
result = sol.mostSimilar(n, roads, names, target)
assert len(result) == 8 # long target path
assert is_valid_path(result, roads) # still valid
# Example 3
n = 6
roads = [[0,1],[1,2],[2,3],[3,4],[4,5]]
names = ["ATL","PEK","LAX","ATL","DXB","HND"]
target = ["ATL","DXB","HND","DXB","ATL","LAX","PEK"]
result = sol.mostSimilar(n, roads, names, target)
assert result == [3,4,5,4,3,2,1] # unique optimal path
# Minimum graph size
n = 2
roads = [[0,1]]
names = ["AAA", "BBB"]
target = ["AAA"]
result = sol.mostSimilar(n, roads, names, target)
assert result == [0] # single node path
# Repeated node usage
n = 2
roads = [[0,1]]
names = ["AAA", "BBB"]
target = ["AAA", "BBB", "AAA", "BBB"]
result = sol.mostSimilar(n, roads, names, target)
assert result == [0,1,0,1] # alternating revisits
# Multiple identical names
n = 3
roads = [[0,1],[1,2]]
names = ["AAA", "AAA", "AAA"]
target = ["AAA", "AAA", "AAA"]
result = sol.mostSimilar(n, roads, names, target)
assert len(result) == 3
assert is_valid_path(result, roads)
# Dense graph
n = 4
roads = [
[0,1],[0,2],[0,3],
[1,2],[1,3],[2,3]
]
names = ["AAB","AAC","AAD","AAE"]
target = ["AAB","AAC","AAD","AAE"]
result = sol.mostSimilar(n, roads, names, target)
assert len(result) == 4
assert is_valid_path(result, roads)
Test Summary
| Test | Why |
|---|---|
| Example 1 | Verifies normal DP behavior |
| Example 2 | Verifies all mismatch scenario |
| Example 3 | Verifies exact match reconstruction |
| Minimum graph size | Tests smallest valid input |
| Repeated node usage | Ensures revisits work correctly |
| Multiple identical names | Verifies duplicate names handling |
| Dense graph | Tests many transition choices |
Edge Cases
One important edge case occurs when no city name matches any target string. In this situation, every position contributes a mismatch cost of 1. A naive implementation might incorrectly assume at least one matching node exists and fail to initialize states properly. This implementation handles the case naturally because mismatch cost is computed independently for every node.
Another tricky scenario is when multiple cities share the same name. The algorithm cannot simply map names to unique nodes. Instead, the DP state is indexed by actual city index, not by city name. This ensures different graph positions remain distinguishable even if their names are identical.
A third important edge case involves repeated node visits. Since revisiting is allowed, the optimal path may bounce between a small number of cities many times. Some graph algorithms mistakenly mark nodes as visited globally, which would incorrectly eliminate valid solutions. This implementation never uses a visited set, so repeated traversal is fully supported.
A fourth subtle edge case is path reconstruction. If parent pointers are not stored correctly during DP transitions, the algorithm may compute the correct minimum cost but fail to recover the actual path. The dedicated parent table ensures every optimal transition is remembered and can later be reconstructed exactly.