LeetCode 2573 - Find the String with LCP

Difficulty: 🔴 Hard
Topics: Array, String, Dynamic Programming, Greedy, Union-Find, Matrix

Solution

Problem Understanding

The problem gives us an n x n matrix called lcp, where:

lcp[i][j] represents the length of the longest common prefix between:
the suffix starting at index i
the suffix starting at index j

If the unknown string is word, then:

suffix i is word[i:]
suffix j is word[j:]

The task is to reconstruct the lexicographically smallest valid string word that exactly matches the given lcp matrix. If no such string exists, we must return an empty string.

For example, suppose:

word = "abab"

Then:

word[0:] = "abab"
word[2:] = "ab"

Their common prefix is "ab", whose length is 2, so:

lcp[0][2] = 2

The matrix completely describes how all suffixes relate to each other.

The constraints are important:

n <= 1000
The matrix has n² entries

This immediately tells us that:

An O(n²) solution is acceptable
Anything close to O(n³) will likely time out

Another important observation is that the answer only uses lowercase English letters, meaning we have at most 26 distinct character groups available. If constructing the required equality relationships needs more than 26 different characters, the answer is impossible.

Several edge cases can easily break naive implementations:

Diagonal values must satisfy:

lcp[i][i] = n - i

because a suffix always fully matches itself.

The matrix must be symmetric:

lcp[i][j] = lcp[j][i]

If lcp[i][j] > 0, then:

word[i] == word[j]

If lcp[i][j] = k > 0, then:

lcp[i+1][j+1] = k - 1

unless indices go out of bounds.

The problem is essentially asking us to reconstruct a string from pairwise suffix-prefix relationships while ensuring all matrix entries remain consistent.

Approaches

Brute Force Approach

A brute force solution would attempt to generate all possible strings of length n using lowercase letters, compute their lcp matrices, and compare them with the input matrix.

To compute the matrix for one candidate string:

For every pair (i, j)
Compare suffixes character by character
Store the longest common prefix length

This approach is obviously correct because it checks every possible string. If a valid string exists, brute force will eventually find it.

However, it is completely infeasible.

Even if we restricted ourselves to only two letters, there would already be:

2^1000

possible strings in the worst case.

Using all 26 lowercase letters makes the search astronomically large.

Key Insight

The important observation is:

lcp[i][j] > 0

immediately implies:

word[i] == word[j]

because the suffixes share at least one matching character.

This converts the problem into a grouping problem:

All positions that must contain the same character belong to the same equivalence class.
We can build these groups using Union-Find.

After grouping equal positions:

Assign characters greedily from left to right
Use 'a', then 'b', then 'c', etc.
This guarantees lexicographically smallest construction

Finally, we must verify that the constructed string reproduces the exact lcp matrix.

The verification step is critical because equality constraints alone are insufficient. A malformed matrix may still produce consistent equality groups but violate deeper suffix relationships.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	Exponential	O(n²)	Tries all possible strings
Optimal	O(n² α(n))	O(n²)	Uses Union-Find and dynamic verification

Algorithm Walkthrough

Step 1: Validate Basic Diagonal Constraints

For every index i, verify:

lcp[i][i] == n - i

A suffix always matches itself completely.

If any diagonal value is incorrect, return "".

Step 2: Build Equality Groups Using Union-Find

If:

lcp[i][j] > 0

then the first characters of both suffixes must match:

word[i] == word[j]

We union indices i and j.

Union-Find is ideal here because:

It efficiently merges equivalence classes
It supports near constant time queries
It naturally models equality relationships

After processing all matrix entries, each connected component represents positions that must share the same character.

Step 3: Assign Characters Greedily

We now construct the lexicographically smallest valid string.

Traverse indices from left to right:

If a component has not yet received a character:
Assign the next unused character
Reuse that character for all positions in the same component

This greedy assignment works because earlier positions dominate lexicographic ordering.

If more than 26 distinct components are required, return "".

Step 4: Recompute the LCP Matrix

Now we verify the candidate string.

Define:

dp[i][j]

as the longest common prefix length between suffixes starting at i and j.

We compute it bottom up:

If word[i] != word[j]

dp[i][j] = 0

Otherwise:

dp[i][j] = 1 + dp[i+1][j+1]

This recurrence works because matching prefixes extend one step forward.

We process indices from bottom right toward top left so that:

dp[i+1][j+1]

is already known.

Step 5: Compare With Input Matrix

If any computed value differs from the given matrix:

dp[i][j] != lcp[i][j]

the matrix is inconsistent, so return "".

Otherwise, return the constructed string.

Why it works

The algorithm works because the lcp matrix imposes two kinds of constraints:

Equality constraints on characters
Recursive consistency constraints on suffix matches

Union-Find correctly captures all mandatory equalities. Greedy character assignment guarantees the lexicographically smallest valid string. The final dynamic programming verification ensures every lcp value matches exactly, eliminating invalid constructions caused by inconsistent matrices.

Python Solution

from typing import List

class Solution:
    def findTheString(self, lcp: List[List[int]]) -> str:
        n = len(lcp)

        parent = list(range(n))

        def find(x: int) -> int:
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        def union(a: int, b: int) -> None:
            pa = find(a)
            pb = find(b)

            if pa != pb:
                parent[pb] = pa

        # Step 1: validate diagonals
        for i in range(n):
            if lcp[i][i] != n - i:
                return ""

        # Step 2: union positions that must match
        for i in range(n):
            for j in range(i + 1, n):
                if lcp[i][j] > 0:
                    union(i, j)

        # Step 3: build lexicographically smallest string
        result = [''] * n
        group_to_char = {}

        next_char = ord('a')

        for i in range(n):
            root = find(i)

            if root not in group_to_char:
                if next_char > ord('z'):
                    return ""

                group_to_char[root] = chr(next_char)
                next_char += 1

            result[i] = group_to_char[root]

        word = ''.join(result)

        # Step 4: recompute lcp
        dp = [[0] * (n + 1) for _ in range(n + 1)]

        for i in range(n - 1, -1, -1):
            for j in range(n - 1, -1, -1):
                if word[i] == word[j]:
                    dp[i][j] = 1 + dp[i + 1][j + 1]

        # Step 5: verify
        for i in range(n):
            for j in range(n):
                if dp[i][j] != lcp[i][j]:
                    return ""

        return word

The implementation begins by initializing a Union-Find structure where each position initially belongs to its own group.

The diagonal validation immediately rejects impossible matrices. This catches invalid self-match lengths early.

Next, every pair with positive lcp is merged into the same component because their first characters must match.

The string is then constructed greedily. Each newly encountered component receives the smallest unused character. Since we process indices from left to right, this produces the lexicographically smallest valid answer.

After construction, the algorithm recomputes the entire lcp matrix using bottom-up dynamic programming. The recurrence:

dp[i][j] = 1 + dp[i+1][j+1]

naturally models matching suffix prefixes.

Finally, the computed matrix is compared against the input. Any mismatch proves the matrix is inconsistent.

Go Solution

package main

func findTheString(lcp [][]int) string {
	n := len(lcp)

	parent := make([]int, n)
	for i := 0; i < n; i++ {
		parent[i] = i
	}

	var find func(int) int
	find = func(x int) int {
		for parent[x] != x {
			parent[x] = parent[parent[x]]
			x = parent[x]
		}
		return x
	}

	union := func(a, b int) {
		pa := find(a)
		pb := find(b)

		if pa != pb {
			parent[pb] = pa
		}
	}

	// Step 1: validate diagonals
	for i := 0; i < n; i++ {
		if lcp[i][i] != n-i {
			return ""
		}
	}

	// Step 2: union matching positions
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			if lcp[i][j] > 0 {
				union(i, j)
			}
		}
	}

	// Step 3: construct smallest string
	result := make([]byte, n)

	groupToChar := make(map[int]byte)

	nextChar := byte('a')

	for i := 0; i < n; i++ {
		root := find(i)

		if _, exists := groupToChar[root]; !exists {
			if nextChar > 'z' {
				return ""
			}

			groupToChar[root] = nextChar
			nextChar++
		}

		result[i] = groupToChar[root]
	}

	word := string(result)

	// Step 4: recompute lcp
	dp := make([][]int, n+1)
	for i := range dp {
		dp[i] = make([]int, n+1)
	}

	for i := n - 1; i >= 0; i-- {
		for j := n - 1; j >= 0; j-- {
			if word[i] == word[j] {
				dp[i][j] = 1 + dp[i+1][j+1]
			}
		}
	}

	// Step 5: verify
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			if dp[i][j] != lcp[i][j] {
				return ""
			}
		}
	}

	return word
}

The Go implementation closely mirrors the Python version. The primary difference is explicit slice management and byte handling.

The constructed string uses a []byte slice because characters are ASCII lowercase letters. This is more efficient than repeatedly concatenating strings.

The dynamic programming matrix is allocated as a two-dimensional slice. Since Go initializes integers to zero automatically, no additional setup is required.

No integer overflow concerns exist because all values are bounded by n <= 1000.

Worked Examples

Example 1

lcp =
[
 [4,0,2,0],
 [0,3,0,1],
 [2,0,2,0],
 [0,1,0,1]
]

Step 1: Diagonal Validation

Index	Expected	Actual
0	4	4
1	3	3
2	2	2
3	1	1

All valid.

Step 2: Build Union Groups

Positive entries:

Pair	Action
(0,2)	union
(1,3)	union

Groups become:

{0,2}
{1,3}

Step 3: Assign Characters

Index	Group	Character
0	{0,2}	a
1	{1,3}	b
2	{0,2}	a
3	{1,3}	b

Constructed string:

"abab"

Step 4: Recompute LCP

i	j	Value
0	0	4
0	2	2
1	3	1

All entries match the input matrix.

Final answer:

"abab"

Example 2

lcp =
[
 [4,3,2,1],
 [3,3,2,1],
 [2,2,2,1],
 [1,1,1,1]
]

Union Construction

Every pair has positive overlap, so all indices merge into one group:

{0,1,2,3}

Character Assignment

All positions receive 'a'.

Constructed string:

"aaaa"

Verification succeeds.

Example 3

lcp =
[
 [4,3,2,1],
 [3,3,2,1],
 [2,2,2,1],
 [1,1,1,3]
]

Diagonal Check

At index 3:

Expected: 1
Actual: 3

Impossible because the suffix length is only 1.

Return:

""

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n² α(n))	Union operations plus DP verification
Space	O(n²)	DP matrix dominates memory usage

The Union-Find operations are nearly constant time because of path compression, giving amortized complexity α(n), where α is the inverse Ackermann function.

The dominant work comes from:

Processing all matrix entries
Recomputing the full lcp matrix

Both require O(n²) time.

The DP table stores (n+1)² integers, so space complexity is also O(n²).

Test Cases

from typing import List

class Solution:
    def findTheString(self, lcp: List[List[int]]) -> str:
        n = len(lcp)

        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        def union(a, b):
            pa = find(a)
            pb = find(b)

            if pa != pb:
                parent[pb] = pa

        for i in range(n):
            if lcp[i][i] != n - i:
                return ""

        for i in range(n):
            for j in range(i + 1, n):
                if lcp[i][j] > 0:
                    union(i, j)

        result = [''] * n
        group_to_char = {}

        next_char = ord('a')

        for i in range(n):
            root = find(i)

            if root not in group_to_char:
                if next_char > ord('z'):
                    return ""

                group_to_char[root] = chr(next_char)
                next_char += 1

            result[i] = group_to_char[root]

        word = ''.join(result)

        dp = [[0] * (n + 1) for _ in range(n + 1)]

        for i in range(n - 1, -1, -1):
            for j in range(n - 1, -1, -1):
                if word[i] == word[j]:
                    dp[i][j] = 1 + dp[i + 1][j + 1]

        for i in range(n):
            for j in range(n):
                if dp[i][j] != lcp[i][j]:
                    return ""

        return word

sol = Solution()

# Example 1
assert sol.findTheString([
    [4,0,2,0],
    [0,3,0,1],
    [2,0,2,0],
    [0,1,0,1]
]) == "abab"

# Example 2
assert sol.findTheString([
    [4,3,2,1],
    [3,3,2,1],
    [2,2,2,1],
    [1,1,1,1]
]) == "aaaa"

# Example 3, invalid diagonal
assert sol.findTheString([
    [4,3,2,1],
    [3,3,2,1],
    [2,2,2,1],
    [1,1,1,3]
]) == ""

# Single character
assert sol.findTheString([
    [1]
]) == "a"

# Two different characters
assert sol.findTheString([
    [2,0],
    [0,1]
]) == "ab"

# Contradictory matrix
assert sol.findTheString([
    [3,1,1],
    [1,2,0],
    [1,0,1]
]) == ""

# Requires many distinct groups but still valid
assert sol.findTheString([
    [1,0,0],
    [0,1,0],
    [0,0,1]
]) == "abc"

# Impossible because more than 26 groups required
n = 27
matrix = [[0] * n for _ in range(n)]

for i in range(n):
    matrix[i][i] = 1

assert sol.findTheString(matrix) == ""

Test	Why
Alternating pattern	Validates separated equality groups
All same letters	Validates full merging
Invalid diagonal	Ensures self-match validation
Single character	Smallest possible input
Two distinct letters	Simple independent groups
Contradictory matrix	Detects recursive inconsistency
Fully distinct characters	Validates greedy assignment
More than 26 groups	Ensures alphabet limit handling

Edge Cases

Invalid Diagonal Values

A common source of bugs is forgetting that:

lcp[i][i] = n - i

must always hold.

For example:

lcp[3][3] = 3

is impossible when only one character remains in the suffix.

The implementation immediately validates all diagonal entries before doing any further processing.

Equality Constraints That Cause Contradictions

It is possible for pairwise equality relationships to appear valid while deeper suffix relationships remain inconsistent.

For example:

lcp[i][j] = 1

means the first characters match, but the next characters must differ. If later matrix entries imply they also match, the matrix becomes contradictory.

The final DP verification step catches all such inconsistencies.

More Than 26 Distinct Components

The problem restricts characters to lowercase English letters.

If the Union-Find structure produces more than 26 disconnected components, no valid string exists.

The greedy assignment step explicitly checks this condition and returns "" when necessary.