LeetCode 2573 - Find the String with LCP
The problem gives us an n x n matrix called lcp, where: - lcp[i][j] represents the length of the longest common prefix between: - the suffix starting at index i - the suffix starting at index j If the unknown string is word, then: - suffix i is word[i:] - suffix j is word[j:]…
Difficulty: 🔴 Hard
Topics: Array, String, Dynamic Programming, Greedy, Union-Find, Matrix
Solution
Problem Understanding
The problem gives us an n x n matrix called lcp, where:
-
lcp[i][j]represents the length of the longest common prefix between: -
the suffix starting at index
i -
the suffix starting at index
j
If the unknown string is word, then:
- suffix
iisword[i:] - suffix
jisword[j:]
The task is to reconstruct the lexicographically smallest valid string word that exactly matches the given lcp matrix. If no such string exists, we must return an empty string.
For example, suppose:
word = "abab"
Then:
word[0:] = "abab"
word[2:] = "ab"
Their common prefix is "ab", whose length is 2, so:
lcp[0][2] = 2
The matrix completely describes how all suffixes relate to each other.
The constraints are important:
n <= 1000- The matrix has
n²entries
This immediately tells us that:
- An
O(n²)solution is acceptable - Anything close to
O(n³)will likely time out
Another important observation is that the answer only uses lowercase English letters, meaning we have at most 26 distinct character groups available. If constructing the required equality relationships needs more than 26 different characters, the answer is impossible.
Several edge cases can easily break naive implementations:
- Diagonal values must satisfy:
lcp[i][i] = n - i
because a suffix always fully matches itself.
- The matrix must be symmetric:
lcp[i][j] = lcp[j][i]
- If
lcp[i][j] > 0, then:
word[i] == word[j]
- If
lcp[i][j] = k > 0, then:
lcp[i+1][j+1] = k - 1
unless indices go out of bounds.
The problem is essentially asking us to reconstruct a string from pairwise suffix-prefix relationships while ensuring all matrix entries remain consistent.
Approaches
Brute Force Approach
A brute force solution would attempt to generate all possible strings of length n using lowercase letters, compute their lcp matrices, and compare them with the input matrix.
To compute the matrix for one candidate string:
- For every pair
(i, j) - Compare suffixes character by character
- Store the longest common prefix length
This approach is obviously correct because it checks every possible string. If a valid string exists, brute force will eventually find it.
However, it is completely infeasible.
Even if we restricted ourselves to only two letters, there would already be:
2^1000
possible strings in the worst case.
Using all 26 lowercase letters makes the search astronomically large.
Key Insight
The important observation is:
lcp[i][j] > 0
immediately implies:
word[i] == word[j]
because the suffixes share at least one matching character.
This converts the problem into a grouping problem:
- All positions that must contain the same character belong to the same equivalence class.
- We can build these groups using Union-Find.
After grouping equal positions:
- Assign characters greedily from left to right
- Use
'a', then'b', then'c', etc. - This guarantees lexicographically smallest construction
Finally, we must verify that the constructed string reproduces the exact lcp matrix.
The verification step is critical because equality constraints alone are insufficient. A malformed matrix may still produce consistent equality groups but violate deeper suffix relationships.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | Exponential | O(n²) | Tries all possible strings |
| Optimal | O(n² α(n)) | O(n²) | Uses Union-Find and dynamic verification |
Algorithm Walkthrough
Step 1: Validate Basic Diagonal Constraints
For every index i, verify:
lcp[i][i] == n - i
A suffix always matches itself completely.
If any diagonal value is incorrect, return "".
Step 2: Build Equality Groups Using Union-Find
If:
lcp[i][j] > 0
then the first characters of both suffixes must match:
word[i] == word[j]
We union indices i and j.
Union-Find is ideal here because:
- It efficiently merges equivalence classes
- It supports near constant time queries
- It naturally models equality relationships
After processing all matrix entries, each connected component represents positions that must share the same character.
Step 3: Assign Characters Greedily
We now construct the lexicographically smallest valid string.
Traverse indices from left to right:
-
If a component has not yet received a character:
-
Assign the next unused character
-
Reuse that character for all positions in the same component
This greedy assignment works because earlier positions dominate lexicographic ordering.
If more than 26 distinct components are required, return "".
Step 4: Recompute the LCP Matrix
Now we verify the candidate string.
Define:
dp[i][j]
as the longest common prefix length between suffixes starting at i and j.
We compute it bottom up:
- If
word[i] != word[j]
dp[i][j] = 0
- Otherwise:
dp[i][j] = 1 + dp[i+1][j+1]
This recurrence works because matching prefixes extend one step forward.
We process indices from bottom right toward top left so that:
dp[i+1][j+1]
is already known.
Step 5: Compare With Input Matrix
If any computed value differs from the given matrix:
dp[i][j] != lcp[i][j]
the matrix is inconsistent, so return "".
Otherwise, return the constructed string.
Why it works
The algorithm works because the lcp matrix imposes two kinds of constraints:
- Equality constraints on characters
- Recursive consistency constraints on suffix matches
Union-Find correctly captures all mandatory equalities. Greedy character assignment guarantees the lexicographically smallest valid string. The final dynamic programming verification ensures every lcp value matches exactly, eliminating invalid constructions caused by inconsistent matrices.
Python Solution
from typing import List
class Solution:
def findTheString(self, lcp: List[List[int]]) -> str:
n = len(lcp)
parent = list(range(n))
def find(x: int) -> int:
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a: int, b: int) -> None:
pa = find(a)
pb = find(b)
if pa != pb:
parent[pb] = pa
# Step 1: validate diagonals
for i in range(n):
if lcp[i][i] != n - i:
return ""
# Step 2: union positions that must match
for i in range(n):
for j in range(i + 1, n):
if lcp[i][j] > 0:
union(i, j)
# Step 3: build lexicographically smallest string
result = [''] * n
group_to_char = {}
next_char = ord('a')
for i in range(n):
root = find(i)
if root not in group_to_char:
if next_char > ord('z'):
return ""
group_to_char[root] = chr(next_char)
next_char += 1
result[i] = group_to_char[root]
word = ''.join(result)
# Step 4: recompute lcp
dp = [[0] * (n + 1) for _ in range(n + 1)]
for i in range(n - 1, -1, -1):
for j in range(n - 1, -1, -1):
if word[i] == word[j]:
dp[i][j] = 1 + dp[i + 1][j + 1]
# Step 5: verify
for i in range(n):
for j in range(n):
if dp[i][j] != lcp[i][j]:
return ""
return word
The implementation begins by initializing a Union-Find structure where each position initially belongs to its own group.
The diagonal validation immediately rejects impossible matrices. This catches invalid self-match lengths early.
Next, every pair with positive lcp is merged into the same component because their first characters must match.
The string is then constructed greedily. Each newly encountered component receives the smallest unused character. Since we process indices from left to right, this produces the lexicographically smallest valid answer.
After construction, the algorithm recomputes the entire lcp matrix using bottom-up dynamic programming. The recurrence:
dp[i][j] = 1 + dp[i+1][j+1]
naturally models matching suffix prefixes.
Finally, the computed matrix is compared against the input. Any mismatch proves the matrix is inconsistent.
Go Solution
package main
func findTheString(lcp [][]int) string {
n := len(lcp)
parent := make([]int, n)
for i := 0; i < n; i++ {
parent[i] = i
}
var find func(int) int
find = func(x int) int {
for parent[x] != x {
parent[x] = parent[parent[x]]
x = parent[x]
}
return x
}
union := func(a, b int) {
pa := find(a)
pb := find(b)
if pa != pb {
parent[pb] = pa
}
}
// Step 1: validate diagonals
for i := 0; i < n; i++ {
if lcp[i][i] != n-i {
return ""
}
}
// Step 2: union matching positions
for i := 0; i < n; i++ {
for j := i + 1; j < n; j++ {
if lcp[i][j] > 0 {
union(i, j)
}
}
}
// Step 3: construct smallest string
result := make([]byte, n)
groupToChar := make(map[int]byte)
nextChar := byte('a')
for i := 0; i < n; i++ {
root := find(i)
if _, exists := groupToChar[root]; !exists {
if nextChar > 'z' {
return ""
}
groupToChar[root] = nextChar
nextChar++
}
result[i] = groupToChar[root]
}
word := string(result)
// Step 4: recompute lcp
dp := make([][]int, n+1)
for i := range dp {
dp[i] = make([]int, n+1)
}
for i := n - 1; i >= 0; i-- {
for j := n - 1; j >= 0; j-- {
if word[i] == word[j] {
dp[i][j] = 1 + dp[i+1][j+1]
}
}
}
// Step 5: verify
for i := 0; i < n; i++ {
for j := 0; j < n; j++ {
if dp[i][j] != lcp[i][j] {
return ""
}
}
}
return word
}
The Go implementation closely mirrors the Python version. The primary difference is explicit slice management and byte handling.
The constructed string uses a []byte slice because characters are ASCII lowercase letters. This is more efficient than repeatedly concatenating strings.
The dynamic programming matrix is allocated as a two-dimensional slice. Since Go initializes integers to zero automatically, no additional setup is required.
No integer overflow concerns exist because all values are bounded by n <= 1000.
Worked Examples
Example 1
lcp =
[
[4,0,2,0],
[0,3,0,1],
[2,0,2,0],
[0,1,0,1]
]
Step 1: Diagonal Validation
| Index | Expected | Actual |
|---|---|---|
| 0 | 4 | 4 |
| 1 | 3 | 3 |
| 2 | 2 | 2 |
| 3 | 1 | 1 |
All valid.
Step 2: Build Union Groups
Positive entries:
| Pair | Action |
|---|---|
| (0,2) | union |
| (1,3) | union |
Groups become:
{0,2}
{1,3}
Step 3: Assign Characters
| Index | Group | Character |
|---|---|---|
| 0 | {0,2} | a |
| 1 | {1,3} | b |
| 2 | {0,2} | a |
| 3 | {1,3} | b |
Constructed string:
"abab"
Step 4: Recompute LCP
| i | j | Value |
|---|---|---|
| 0 | 0 | 4 |
| 0 | 2 | 2 |
| 1 | 3 | 1 |
All entries match the input matrix.
Final answer:
"abab"
Example 2
lcp =
[
[4,3,2,1],
[3,3,2,1],
[2,2,2,1],
[1,1,1,1]
]
Union Construction
Every pair has positive overlap, so all indices merge into one group:
{0,1,2,3}
Character Assignment
All positions receive 'a'.
Constructed string:
"aaaa"
Verification succeeds.
Example 3
lcp =
[
[4,3,2,1],
[3,3,2,1],
[2,2,2,1],
[1,1,1,3]
]
Diagonal Check
At index 3:
Expected: 1
Actual: 3
Impossible because the suffix length is only 1.
Return:
""
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n² α(n)) | Union operations plus DP verification |
| Space | O(n²) | DP matrix dominates memory usage |
The Union-Find operations are nearly constant time because of path compression, giving amortized complexity α(n), where α is the inverse Ackermann function.
The dominant work comes from:
- Processing all matrix entries
- Recomputing the full
lcpmatrix
Both require O(n²) time.
The DP table stores (n+1)² integers, so space complexity is also O(n²).
Test Cases
from typing import List
class Solution:
def findTheString(self, lcp: List[List[int]]) -> str:
n = len(lcp)
parent = list(range(n))
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a, b):
pa = find(a)
pb = find(b)
if pa != pb:
parent[pb] = pa
for i in range(n):
if lcp[i][i] != n - i:
return ""
for i in range(n):
for j in range(i + 1, n):
if lcp[i][j] > 0:
union(i, j)
result = [''] * n
group_to_char = {}
next_char = ord('a')
for i in range(n):
root = find(i)
if root not in group_to_char:
if next_char > ord('z'):
return ""
group_to_char[root] = chr(next_char)
next_char += 1
result[i] = group_to_char[root]
word = ''.join(result)
dp = [[0] * (n + 1) for _ in range(n + 1)]
for i in range(n - 1, -1, -1):
for j in range(n - 1, -1, -1):
if word[i] == word[j]:
dp[i][j] = 1 + dp[i + 1][j + 1]
for i in range(n):
for j in range(n):
if dp[i][j] != lcp[i][j]:
return ""
return word
sol = Solution()
# Example 1
assert sol.findTheString([
[4,0,2,0],
[0,3,0,1],
[2,0,2,0],
[0,1,0,1]
]) == "abab"
# Example 2
assert sol.findTheString([
[4,3,2,1],
[3,3,2,1],
[2,2,2,1],
[1,1,1,1]
]) == "aaaa"
# Example 3, invalid diagonal
assert sol.findTheString([
[4,3,2,1],
[3,3,2,1],
[2,2,2,1],
[1,1,1,3]
]) == ""
# Single character
assert sol.findTheString([
[1]
]) == "a"
# Two different characters
assert sol.findTheString([
[2,0],
[0,1]
]) == "ab"
# Contradictory matrix
assert sol.findTheString([
[3,1,1],
[1,2,0],
[1,0,1]
]) == ""
# Requires many distinct groups but still valid
assert sol.findTheString([
[1,0,0],
[0,1,0],
[0,0,1]
]) == "abc"
# Impossible because more than 26 groups required
n = 27
matrix = [[0] * n for _ in range(n)]
for i in range(n):
matrix[i][i] = 1
assert sol.findTheString(matrix) == ""
| Test | Why |
|---|---|
| Alternating pattern | Validates separated equality groups |
| All same letters | Validates full merging |
| Invalid diagonal | Ensures self-match validation |
| Single character | Smallest possible input |
| Two distinct letters | Simple independent groups |
| Contradictory matrix | Detects recursive inconsistency |
| Fully distinct characters | Validates greedy assignment |
| More than 26 groups | Ensures alphabet limit handling |
Edge Cases
Invalid Diagonal Values
A common source of bugs is forgetting that:
lcp[i][i] = n - i
must always hold.
For example:
lcp[3][3] = 3
is impossible when only one character remains in the suffix.
The implementation immediately validates all diagonal entries before doing any further processing.
Equality Constraints That Cause Contradictions
It is possible for pairwise equality relationships to appear valid while deeper suffix relationships remain inconsistent.
For example:
lcp[i][j] = 1
means the first characters match, but the next characters must differ. If later matrix entries imply they also match, the matrix becomes contradictory.
The final DP verification step catches all such inconsistencies.
More Than 26 Distinct Components
The problem restricts characters to lowercase English letters.
If the Union-Find structure produces more than 26 disconnected components, no valid string exists.
The greedy assignment step explicitly checks this condition and returns "" when necessary.