LeetCode 3076 - Shortest Uncommon Substring in an Array
The problem asks us to find, for each string in a given array, the shortest substring that does not appear in any other string in the array. If multiple shortest substrings exist, we must choose the lexicographically smallest one.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Trie
Solution
Problem Understanding
The problem asks us to find, for each string in a given array, the shortest substring that does not appear in any other string in the array. If multiple shortest substrings exist, we must choose the lexicographically smallest one. If no such substring exists, the answer for that string is an empty string.
In other words, for an array arr of strings:
- Input:
arr = ["cab", "ad", "bad", "c"] - Output:
["ab", "", "ba", ""]
The output represents, for each string, the minimal unique substring that differentiates it from all other strings.
The constraints tell us that:
ncan be up to 100, so the number of strings is relatively small.- Each string has a maximum length of 20, so enumerating all substrings of a single string is feasible (
O(L^2)substrings per string). - Strings consist only of lowercase letters, so lexicographic comparison is straightforward.
Important edge cases include:
- Strings with complete overlap, e.g.,
["abc","bcd"]where short substrings may appear in multiple strings. - Strings of length 1.
- Strings where all possible substrings occur in other strings.
The problem guarantees non-empty strings, which simplifies substring generation.
Approaches
The brute-force approach is to generate all substrings for each string and check whether each substring appears in any other string. For each string, we would:
- Generate all substrings.
- Check if the substring occurs in any other string.
- Track the shortest substrings that are unique and pick the lexicographically smallest.
This approach is correct but inefficient because the substring search across other strings is O(n * L^3) in the worst case (since checking a substring against a string takes O(L)).
The optimal approach is to use a Trie data structure to track the occurrence of substrings across all strings. By inserting every substring into a Trie along with the set of string indices in which it occurs, we can efficiently query the shortest unique substring for each string:
- Generate all substrings for each string.
- Insert each substring into a Trie while recording the indices of the strings where it appears.
- For each string, traverse the Trie to find the shortest substring that occurs only in this string's index set.
- Among substrings of the same length, choose the lexicographically smallest.
This reduces repeated substring searches and allows us to efficiently find unique substrings.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^2 * L^3) | O(L^2) per string | Generate all substrings and check against other strings |
| Trie-Based Optimal | O(n * L^3) | O(n * L^3) | Insert all substrings into a Trie and track string indices |
Algorithm Walkthrough
- Initialize a Trie where each node stores a dictionary of children and a set of string indices.
- For each string
arr[i]in the array, generate all possible substrings. - Insert each substring into the Trie. While inserting, update the set of indices at the node to include
i. - After building the Trie, for each string
arr[i], traverse all paths from the root and identify nodes where the index set contains onlyi. This identifies substrings unique to this string. - Keep track of the shortest unique substring for each string. If multiple exist of the same length, select the lexicographically smallest.
- If no unique substring exists for a string, return an empty string for that position.
Why it works: By storing the set of string indices at each Trie node, we can immediately identify which substrings are unique. Since all substrings are generated and inserted, the search is exhaustive, and using the shortest-length-first traversal ensures minimal length and lexicographic order.
Python Solution
from typing import List
class TrieNode:
def __init__(self):
self.children = {}
self.indices = set()
class Solution:
def shortestSubstrings(self, arr: List[str]) -> List[str]:
root = TrieNode()
# Insert all substrings into the Trie
for idx, word in enumerate(arr):
L = len(word)
for start in range(L):
node = root
for end in range(start, L):
char = word[end]
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.indices.add(idx)
result = []
# Find shortest unique substring for each word
for idx, word in enumerate(arr):
shortest = None
L = len(word)
for start in range(L):
node = root
substring = ""
for end in range(start, L):
char = word[end]
node = node.children[char]
substring += char
if node.indices == {idx}: # unique substring
if shortest is None or len(substring) < len(shortest) or (len(substring) == len(shortest) and substring < shortest):
shortest = substring
break
result.append(shortest if shortest else "")
return result
Explanation: The TrieNode stores children and the indices of strings passing through. Substrings are inserted into the Trie, and each node collects the indices of strings containing it. While retrieving the shortest unique substring, we traverse the Trie per string and find the first node where the set of indices contains only that string's index.
Go Solution
package main
func shortestSubstrings(arr []string) []string {
type TrieNode struct {
children map[rune]*TrieNode
indices map[int]struct{}
}
root := &TrieNode{children: make(map[rune]*TrieNode), indices: make(map[int]struct{})}
// Insert all substrings
for idx, word := range arr {
for start := 0; start < len(word); start++ {
node := root
for _, char := range word[start:] {
if node.children[char] == nil {
node.children[char] = &TrieNode{children: make(map[rune]*TrieNode), indices: make(map[int]struct{})}
}
node = node.children[char]
node.indices[idx] = struct{}{}
}
}
}
result := make([]string, len(arr))
for idx, word := range arr {
shortest := ""
L := len(word)
for start := 0; start < L; start++ {
node := root
substring := ""
for _, char := range word[start:] {
node = node.children[char]
substring += string(char)
if len(node.indices) == 1 {
for k := range node.indices {
if k == idx {
if shortest == "" || len(substring) < len(shortest) || (len(substring) == len(shortest) && substring < shortest) {
shortest = substring
}
break
}
}
break
}
}
}
result[idx] = shortest
}
return result
}
Explanation: Go uses map[rune]*TrieNode for children and map[int]struct{} for indices. The logic mirrors Python: build the Trie with substring indices and search for the shortest unique substring.
Worked Examples
Example 1: ["cab","ad","bad","c"]
| String | Substrings Checked | Unique Substring | Selected |
|---|---|---|---|
| "cab" | "c","ca","cab","a","ab","b" | "ab","ca" | "ab" |
| "ad" | "a","ad","d" | none | "" |
| "bad" | "b","ba","bad","a","ad","d" | "ba" | "ba" |
| "c" | "c" | none | "" |
Example 2: ["abc","bcd","abcd"]
| String | Substrings Checked | Unique Substring | Selected |
|---|---|---|---|
| "abc" | all | none | "" |
| "bcd" | all | none | "" |
| "abcd" | all | "abcd" | "abcd" |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * L^3) | Generating all substrings (L^2 per string) and inserting/traversing Trie (up to L per substring) |
| Space | O(n * L^3) | Trie stores all substrings with index sets |
The cubic factor in time comes from generating substrings of length up to L and processing them per string. Space is proportional to the total number of substrings across all strings.
Test Cases
# Provided examples
assert Solution().shortestSubstrings(["cab","ad","bad","c"]) == ["ab","","ba",""] # Example 1
assert Solution().shortestSubstrings(["abc","bcd","abcd"]) == ["","","abcd"] # Example 2
# Edge cases
assert Solution().shortestSubstrings(["a","b"]) == ["a","b"] # Single-letter strings
assert Solution().shortestSubstrings(["aa","aa","aa"]) == ["","",""] # All identical strings
assert Solution