LeetCode 1923 - Longest Common Subpath

The problem asks us to find the maximum length of a contiguous sequence of cities that appears in every friend's travel path. Each friend has a path represented as an array of city IDs. A subpath is simply a contiguous segment of that array.

LeetCode Problem 1923

Difficulty: 🔴 Hard
Topics: Array, Binary Search, Rolling Hash, Suffix Array, Hash Function

Solution

Problem Understanding

The problem asks us to find the maximum length of a contiguous sequence of cities that appears in every friend's travel path.

Each friend has a path represented as an array of city IDs. A subpath is simply a contiguous segment of that array. We need to determine the longest subpath that is shared across all paths.

For example, if one path is:

[0,1,2,3,4]

then valid subpaths include:

[0]
[1,2]
[2,3,4]
[0,1,2,3]

but not:

[0,2,4]

because subpaths must remain contiguous.

The input contains:

  • n, the total number of cities
  • paths, a list of paths, one for each friend

The output is a single integer representing the maximum possible length of a common subpath that appears in every path.

The constraints are extremely important:

  • Total length of all paths is at most 10^5
  • Number of paths can also be large
  • Individual paths may be long

These limits immediately rule out naive substring comparison approaches. Any algorithm with quadratic or cubic behavior over path lengths will time out.

Another important observation is that the answer must lie between 0 and the length of the shortest path. A common subpath cannot be longer than the smallest path.

There are several edge cases worth considering:

  • Paths with no common city at all, answer is 0
  • Paths where only single cities overlap, answer is 1
  • Multiple repeated cities inside paths
  • Very large paths where efficient hashing becomes necessary
  • Situations where many candidate subpaths exist, requiring fast comparison

The problem guarantees that no city appears consecutively in the same path, but this detail does not significantly change the core algorithm.

Approaches

Brute Force Approach

A straightforward approach is to generate every possible subpath from one path, then check whether it exists in every other path.

Suppose the first path has length L.

The number of subpaths is:

O(L^2)

For each candidate subpath, we would need to search all other paths to determine whether it appears.

A naive substring search itself could take linear time, making the total complexity enormous.

This approach is correct because it explicitly checks every possible common subpath, but it is far too slow for the given constraints.

Key Insight

The critical observation is that the problem asks for the maximum possible length, which strongly suggests binary search on the answer.

If there exists a common subpath of length k, then every smaller length is also possible.

This creates a monotonic property:

possible(k) => possible(k-1)

That allows binary search over the subpath length.

The remaining challenge is efficiently checking whether a common subpath of a given length exists.

To do that, we use rolling hash.

For a fixed length k:

  • Compute hashes of all subpaths of length k in each path
  • Store them in sets
  • Intersect the sets across all paths
  • If the intersection is non-empty, then a common subpath exists

Rolling hash allows us to compute subpath hashes in constant time after preprocessing.

Using double hashing significantly reduces collision probability.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(L³ × m) O(L²) Generates and compares all subpaths directly
Optimal O(T log M) O(T) Binary search with rolling hash
T = total path length, M = shortest path length

Algorithm Walkthrough

Step 1: Determine Search Range

The answer cannot exceed the shortest path length.

So we binary search between:

0 and min(len(path))

This minimizes the search space.

Step 2: Precompute Powers for Rolling Hash

Rolling hash requires powers of the chosen base.

We precompute:

base^0, base^1, ..., base^maxLength

modulo large prime numbers.

We use two mod values for double hashing:

mod1 = 1_000_000_007
mod2 = 1_000_000_009

Double hashing greatly reduces collision risk.

Step 3: Binary Search on Subpath Length

For each midpoint mid:

  • Check whether a common subpath of length mid exists
  • If yes, try larger lengths
  • Otherwise try smaller lengths

This works because of the monotonic property.

Step 4: Compute Rolling Hashes

For a path:

[a,b,c,d,e]

we compute prefix hashes:

H[i]

so any subpath hash can be extracted in constant time.

For a subpath [l:r]:

$hash(l,r)=H[r]-H[l]\cdot base^{(r-l)}$

All operations are performed modulo the chosen prime.

Step 5: Build Hash Sets

For the first path:

  • Compute all hashes of length mid
  • Store them in a set

For every other path:

  • Compute all hashes of length mid
  • Intersect with the current candidate set

If the set becomes empty, no common subpath exists.

Step 6: Return Final Answer

After binary search completes, return the largest valid length found.

Why it works

Binary search is valid because if a common subpath of length k exists, then every smaller length also exists.

Rolling hash guarantees efficient subpath comparison. Instead of comparing arrays directly, we compare compact hash representations.

By intersecting hash sets across all paths, we ensure that only subpaths appearing in every path survive.

Therefore, when the intersection is non-empty, a common subpath exists for that length.

Python Solution

from typing import List

class Solution:
    def longestCommonSubpath(self, n: int, paths: List[List[int]]) -> int:
        min_len = min(len(path) for path in paths)

        base = 100001
        mod1 = 1_000_000_007
        mod2 = 1_000_000_009

        max_len = max(len(path) for path in paths)

        pow1 = [1] * (max_len + 1)
        pow2 = [1] * (max_len + 1)

        for i in range(1, max_len + 1):
            pow1[i] = (pow1[i - 1] * base) % mod1
            pow2[i] = (pow2[i - 1] * base) % mod2

        def get_hashes(path: List[int], length: int):
            if length == 0:
                return {(0, 0)}

            n_path = len(path)

            prefix1 = [0] * (n_path + 1)
            prefix2 = [0] * (n_path + 1)

            for i in range(n_path):
                prefix1[i + 1] = (prefix1[i] * base + path[i] + 1) % mod1
                prefix2[i + 1] = (prefix2[i] * base + path[i] + 1) % mod2

            hashes = set()

            for i in range(n_path - length + 1):
                j = i + length

                hash1 = (
                    prefix1[j]
                    - prefix1[i] * pow1[length]
                ) % mod1

                hash2 = (
                    prefix2[j]
                    - prefix2[i] * pow2[length]
                ) % mod2

                hashes.add((hash1, hash2))

            return hashes

        def exists(length: int) -> bool:
            common = get_hashes(paths[0], length)

            for i in range(1, len(paths)):
                current_hashes = get_hashes(paths[i], length)
                common &= current_hashes

                if not common:
                    return False

            return True

        left = 0
        right = min_len
        answer = 0

        while left <= right:
            mid = (left + right) // 2

            if exists(mid):
                answer = mid
                left = mid + 1
            else:
                right = mid - 1

        return answer

The implementation begins by finding the minimum path length, because the answer cannot exceed it.

Next, the code precomputes powers of the rolling hash base for both moduli. This allows constant time substring hash extraction later.

The get_hashes function computes all rolling hashes for subpaths of a fixed length within a single path. It uses prefix hashes so each subpath hash is extracted in constant time.

The exists function determines whether a common subpath of a particular length exists. It begins with all hashes from the first path, then repeatedly intersects with hashes from the remaining paths. If the intersection ever becomes empty, no common subpath exists for that length.

Finally, binary search finds the largest valid length.

Go Solution

package main

func longestCommonSubpath(n int, paths [][]int) int {
	minLen := len(paths[0])

	maxLen := 0

	for _, path := range paths {
		if len(path) < minLen {
			minLen = len(path)
		}
		if len(path) > maxLen {
			maxLen = len(path)
		}
	}

	const base int64 = 100001
	const mod1 int64 = 1_000_000_007
	const mod2 int64 = 1_000_000_009

	pow1 := make([]int64, maxLen+1)
	pow2 := make([]int64, maxLen+1)

	pow1[0] = 1
	pow2[0] = 1

	for i := 1; i <= maxLen; i++ {
		pow1[i] = (pow1[i-1] * base) % mod1
		pow2[i] = (pow2[i-1] * base) % mod2
	}

	type Pair struct {
		a int64
		b int64
	}

	getHashes := func(path []int, length int) map[Pair]struct{} {
		hashes := make(map[Pair]struct{})

		if length == 0 {
			hashes[Pair{0, 0}] = struct{}{}
			return hashes
		}

		nPath := len(path)

		prefix1 := make([]int64, nPath+1)
		prefix2 := make([]int64, nPath+1)

		for i := 0; i < nPath; i++ {
			prefix1[i+1] = (prefix1[i]*base + int64(path[i]+1)) % mod1
			prefix2[i+1] = (prefix2[i]*base + int64(path[i]+1)) % mod2
		}

		for i := 0; i+length <= nPath; i++ {
			j := i + length

			hash1 := (prefix1[j] - prefix1[i]*pow1[length]) % mod1
			hash2 := (prefix2[j] - prefix2[i]*pow2[length]) % mod2

			if hash1 < 0 {
				hash1 += mod1
			}

			if hash2 < 0 {
				hash2 += mod2
			}

			hashes[Pair{hash1, hash2}] = struct{}{}
		}

		return hashes
	}

	exists := func(length int) bool {
		common := getHashes(paths[0], length)

		for i := 1; i < len(paths); i++ {
			current := getHashes(paths[i], length)

			nextCommon := make(map[Pair]struct{})

			for key := range common {
				if _, ok := current[key]; ok {
					nextCommon[key] = struct{}{}
				}
			}

			common = nextCommon

			if len(common) == 0 {
				return false
			}
		}

		return true
	}

	left := 0
	right := minLen
	answer := 0

	for left <= right {
		mid := (left + right) / 2

		if exists(mid) {
			answer = mid
			left = mid + 1
		} else {
			right = mid - 1
		}
	}

	return answer
}

The Go version follows the same logic as the Python implementation.

One important difference is handling modular subtraction. In Go, % may produce negative values, so we explicitly add the modulus back when necessary.

The Go solution also uses a custom Pair struct to store double hashes inside maps.

Worked Examples

Example 1

n = 5
paths = [
  [0,1,2,3,4],
  [2,3,4],
  [4,0,1,2,3]
]

Shortest path length is:

3

Binary search range:

[0,3]

Check length = 1

Hashes correspond to:

[0], [1], [2], [3], [4]

Common intersection becomes:

[2], [3], [4]

Valid.

Check length = 2

First path subpaths:

[0,1]
[1,2]
[2,3]
[3,4]

Second path subpaths:

[2,3]
[3,4]

Intersection:

[2,3]
[3,4]

Third path subpaths:

[4,0]
[0,1]
[1,2]
[2,3]

Final intersection:

[2,3]

Valid.

Check length = 3

Second path only contains:

[2,3,4]

Third path does not contain it.

Invalid.

Answer:

2

Example 2

paths = [[0],[1],[2]]

Length 1 candidates:

[0]
[1]
[2]

No intersection exists.

Answer:

0

Example 3

paths = [
  [0,1,2,3,4],
  [4,3,2,1,0]
]

Length 1 works because all single cities appear.

Length 2 fails because ordering differs.

Answer:

1

Complexity Analysis

Measure Complexity Explanation
Time O(T log M) Binary search over lengths, each check scans all paths
Space O(T) Hash sets and prefix arrays
T = total path length, M = shortest path length

For each binary search iteration, every path is processed once to compute rolling hashes. Since binary search performs log M iterations, the overall complexity becomes O(T log M).

The space usage comes primarily from storing hash sets and prefix arrays.

Test Cases

from typing import List

s = Solution()

assert s.longestCommonSubpath(
    5,
    [[0,1,2,3,4],[2,3,4],[4,0,1,2,3]]
) == 2  # basic shared subpath

assert s.longestCommonSubpath(
    3,
    [[0],[1],[2]]
) == 0  # no overlap at all

assert s.longestCommonSubpath(
    5,
    [[0,1,2,3,4],[4,3,2,1,0]]
) == 1  # only single elements overlap

assert s.longestCommonSubpath(
    5,
    [[1,2,3],[1,2,3],[1,2,3]]
) == 3  # identical paths

assert s.longestCommonSubpath(
    5,
    [[1],[1],[1]]
) == 1  # smallest non-zero case

assert s.longestCommonSubpath(
    10,
    [[1,2,1,2,1],[1,2,1,2],[2,1,2,1]]
) == 3  # repeated patterns

assert s.longestCommonSubpath(
    10,
    [[0,1,2,3],[2,3,4,5],[1,2,3,6]]
) == 2  # shared middle segment

assert s.longestCommonSubpath(
    10,
    [[7,8,9],[7,8,9],[7,8]]
) == 2  # shortest path limits answer

assert s.longestCommonSubpath(
    10,
    [[1,2,3,4,5],[6,7,8,9,0]]
) == 0  # completely disjoint

assert s.longestCommonSubpath(
    10,
    [[0,1,0,1,0],[1,0,1,0,1],[0,1,0,1]]
) == 4  # overlapping repeated sequences
Test Why
Example 1 Standard shared subpath
Example 2 No common subpath
Example 3 Only length 1 possible
Identical paths Maximum possible answer
Single-element paths Smallest valid input
Repeated patterns Tests rolling hash correctness
Shared middle segment Verifies internal matching
Shortest path limit Ensures upper bound correctness
Disjoint paths Validates zero result
Repeated overlap Stress test for duplicate hashes

Edge Cases

One important edge case occurs when there is no shared city at all. In this situation, even subpaths of length 1 fail. A buggy implementation might incorrectly assume some overlap exists due to hash collisions or improper set handling. The double hashing approach and explicit set intersection guarantee correctness.

Another critical edge case involves repeated patterns such as:

[1,2,1,2,1]

Rolling hash implementations can accidentally mishandle overlapping windows or compute incorrect prefix ranges. The implementation carefully computes every fixed-length subpath independently using prefix hashes, ensuring overlapping sequences are processed correctly.

A third important case occurs when all paths are identical. The correct answer should equal the full path length. Binary search must therefore allow the upper boundary itself to be tested. Using:

while left <= right

ensures the maximum possible value is considered.

Finally, paths of very different lengths can create subtle bugs. Since a common subpath cannot exceed the shortest path, the algorithm explicitly limits binary search to the minimum path length. This prevents invalid window computations and unnecessary work.