LeetCode 1239 - Maximum Length of a Concatenated String with Unique Characters
This problem asks us to select a subsequence of strings from the input array arr and concatenate them together such that the resulting string contains only unique characters. Among all valid concatenations, we must return the maximum possible length.
Difficulty: 🟡 Medium
Topics: Array, String, Backtracking, Bit Manipulation
Solution
Problem Understanding
This problem asks us to select a subsequence of strings from the input array arr and concatenate them together such that the resulting string contains only unique characters. Among all valid concatenations, we must return the maximum possible length.
A subsequence means we can either include or skip each string while preserving the original order. We are not allowed to rearrange strings or characters. The important condition is that every character in the final concatenated string must appear exactly once.
For example, if we concatenate "un" and "iq", we get "uniq", which contains the characters u, n, i, and q, all unique. However, concatenating "ab" and "bc" would produce "abbc", which repeats the character b, making it invalid.
The input size is relatively small:
arr.length <= 16- Each string length is at most
26 - Only lowercase English letters are used
These constraints strongly suggest that exponential exploration is acceptable because there are at most 2^16 = 65,536 subsequences. That is small enough for a carefully implemented backtracking or bitmasking solution.
Several edge cases are important:
- A string may already contain duplicate characters internally, such as
"aa". Such strings can never be part of a valid answer. - Multiple strings may overlap in characters, preventing them from being combined.
- The optimal answer may consist of only one string.
- The empty subsequence is always valid and has length
0. - Since only lowercase English letters exist, we can efficiently represent character sets using a 26-bit integer mask.
Approaches
Brute Force Approach
The most straightforward solution is to generate every possible subsequence of arr. For each subsequence, we concatenate all selected strings and check whether the final result contains unique characters.
The brute force process works as follows:
- Generate all
2^nsubsequences. - Concatenate the selected strings.
- Check whether the resulting string has duplicate characters.
- Track the maximum valid length.
This approach is correct because it examines every possible combination, guaranteeing that the optimal answer will be found.
However, repeatedly concatenating strings and checking uniqueness is inefficient. Every validation step may require scanning many characters repeatedly. While the constraints are small enough that brute force could pass, it performs unnecessary work.
Key Insight
The main optimization comes from representing character sets with bitmasks.
Since there are only 26 lowercase letters, we can use a 26-bit integer where each bit represents whether a character exists in the current concatenation.
For example:
- Bit
0represents'a' - Bit
1represents'b' - Bit
25represents'z'
This gives two major benefits:
- We can quickly determine whether a string contains duplicate characters internally.
- We can check whether two strings overlap using a single bitwise AND operation.
If two masks have no overlapping bits:
mask1 & mask2 == 0
then the strings can be safely combined.
This transforms repeated character scanning into constant-time bit operations, making backtracking much more efficient.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(2^n × n × 26) | O(n × 26) | Generates every subsequence and repeatedly validates strings |
| Optimal | O(2^n) | O(2^n) | Uses bitmask backtracking for fast overlap checks |
Algorithm Walkthrough
Optimal Backtracking with Bitmasking
- First, preprocess every string in
arrinto a bitmask.
For each character in a string, compute its corresponding bit position and set that bit in the mask.
If we encounter a character that is already set in the same string, then the string itself contains duplicates and can never be used. We discard it immediately. 2. Store all valid string masks in a list.
Each mask represents the set of characters used by that string. 3. Start backtracking through the list of masks.
At each position, we have two choices:
- Skip the current string
- Include the current string if it does not overlap with the existing characters
- Maintain a
current_maskrepresenting all characters currently used.
To check whether a new string can be added:
current_mask & next_mask == 0
If this condition is true, the strings do not share characters.
- When adding a string, combine masks using bitwise OR:
new_mask = current_mask | next_mask
- At every recursive call, update the maximum answer using the number of set bits in the current mask.
- Continue exploring all valid combinations until all masks are processed.
Why it works
The algorithm explores every valid subsequence of strings. The bitmask representation guarantees that no duplicate characters exist in the current concatenation because overlapping masks are never combined.
Since every valid combination is considered exactly once and invalid combinations are pruned immediately, the algorithm always finds the maximum possible length.
Python Solution
from typing import List
class Solution:
def maxLength(self, arr: List[str]) -> int:
valid_masks = []
# Convert each string into a bitmask
for word in arr:
mask = 0
valid = True
for ch in word:
bit = 1 << (ord(ch) - ord('a'))
# Duplicate character inside the same word
if mask & bit:
valid = False
break
mask |= bit
if valid:
valid_masks.append(mask)
self.answer = 0
def backtrack(index: int, current_mask: int) -> None:
self.answer = max(self.answer, current_mask.bit_count())
for i in range(index, len(valid_masks)):
next_mask = valid_masks[i]
# Skip if characters overlap
if current_mask & next_mask:
continue
backtrack(i + 1, current_mask | next_mask)
backtrack(0, 0)
return self.answer
The implementation begins by preprocessing each string into a bitmask. During this step, strings containing duplicate characters are discarded because they can never contribute to a valid solution.
The valid_masks list stores only usable strings.
The recursive backtrack function explores all valid subsequences. The current_mask parameter tracks which characters are already used.
At every recursive call, we update the answer using:
current_mask.bit_count()
This efficiently counts the number of unique characters currently selected.
Before adding a new string, we check for overlap:
if current_mask & next_mask:
If overlap exists, adding that string would violate the uniqueness condition, so we skip it.
Otherwise, we combine masks using bitwise OR and continue recursion.
Go Solution
func maxLength(arr []string) int {
validMasks := []int{}
// Convert strings into bitmasks
for _, word := range arr {
mask := 0
valid := true
for _, ch := range word {
bit := 1 << (ch - 'a')
// Duplicate character inside the same word
if mask&bit != 0 {
valid = false
break
}
mask |= bit
}
if valid {
validMasks = append(validMasks, mask)
}
}
answer := 0
var backtrack func(index int, currentMask int)
backtrack = func(index int, currentMask int) {
length := 0
// Count set bits
temp := currentMask
for temp > 0 {
length += temp & 1
temp >>= 1
}
if length > answer {
answer = length
}
for i := index; i < len(validMasks); i++ {
nextMask := validMasks[i]
// Skip overlapping masks
if currentMask&nextMask != 0 {
continue
}
backtrack(i+1, currentMask|nextMask)
}
}
backtrack(0, 0)
return answer
}
The Go implementation follows the same overall strategy as the Python version. Since Go does not provide a built-in bit_count() method for integers in the same way Python does, we manually count set bits using a loop.
Slices are used to store valid masks, and recursion is implemented using a closure assigned to backtrack.
No special handling for nil versus empty slices is required because iterating over an empty slice is naturally safe in Go.
Worked Examples
Example 1
arr = ["un","iq","ue"]
Step 1: Convert to Bitmasks
| String | Characters | Bitmask |
|---|---|---|
| "un" | u, n | valid |
| "iq" | i, q | valid |
| "ue" | u, e | valid |
Step 2: Backtracking
| Current Combination | Current Mask Characters | Length |
|---|---|---|
| "" | {} | 0 |
| "un" | {u,n} | 2 |
| "uniq" | {u,n,i,q} | 4 |
| "ue" | {u,e} | 2 |
| "iq" | {i,q} | 2 |
| "ique" | {i,q,u,e} | 4 |
Maximum length becomes 4.
Example 2
arr = ["cha","r","act","ers"]
Valid Combinations
| Combination | Valid | Length |
|---|---|---|
| "cha" | Yes | 3 |
| "char" | Yes | 4 |
| "chaers" | Yes | 6 |
| "acters" | Yes | 6 |
| "chaact" | No, repeated a/c | Invalid |
Maximum length is 6.
Example 3
arr = ["abcdefghijklmnopqrstuvwxyz"]
Processing
The single string already contains all 26 unique letters.
| Combination | Length |
|---|---|
| "abcdefghijklmnopqrstuvwxyz" | 26 |
Answer is 26.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(2^n) | Each valid subset is explored once |
| Space | O(2^n) | Recursive exploration and stored masks |
The preprocessing step takes at most O(n × 26) because each string length is at most 26.
The dominant factor is the recursive exploration of subsets. Since each string can either be included or excluded, the number of combinations is bounded by 2^n.
Because n <= 16, this is efficient enough.
Test Cases
from typing import List
def test():
solution = Solution()
assert solution.maxLength(["un", "iq", "ue"]) == 4
# Basic example with multiple valid combinations
assert solution.maxLength(["cha", "r", "act", "ers"]) == 6
# Requires choosing the optimal subset
assert solution.maxLength(["abcdefghijklmnopqrstuvwxyz"]) == 26
# Entire alphabet in one string
assert solution.maxLength(["aa", "bb"]) == 0
# All strings invalid due to internal duplicates
assert solution.maxLength(["ab", "cd", "ef"]) == 6
# All strings can be combined
assert solution.maxLength(["ab", "bc", "cd"]) == 4
# Overlapping characters prevent full combination
assert solution.maxLength(["a", "abc", "d", "de", "def"]) == 6
# Multiple branching possibilities
assert solution.maxLength([""]) == 0
# Empty string only
assert solution.maxLength(["a"]) == 1
# Single-character string
assert solution.maxLength(["abc", "ade", "xyz"]) == 6
# Must avoid overlapping characters
assert solution.maxLength(["jnfbyktlrqumowxd", "mvhgcpxnjzrdei"]) == 16
# Large overlapping strings
test()
| Test | Why |
|---|---|
["un","iq","ue"] |
Basic example with branching |
["cha","r","act","ers"] |
Optimal subset selection |
["abcdefghijklmnopqrstuvwxyz"] |
Maximum possible answer |
["aa","bb"] |
Invalid strings with duplicates |
["ab","cd","ef"] |
All strings combine successfully |
["ab","bc","cd"] |
Overlapping character conflicts |
["a","abc","d","de","def"] |
Complex recursion paths |
[""] |
Empty string handling |
["a"] |
Minimum non-empty input |
["abc","ade","xyz"] |
Partial overlap behavior |
| Large overlapping strings | Stress test for bitmask logic |
Edge Cases
One important edge case occurs when a string contains duplicate characters internally, such as "aa" or "abca". A naive implementation might still attempt to use these strings during recursion, creating invalid concatenations. The preprocessing phase eliminates such strings immediately by detecting repeated bits while building the mask.
Another important case happens when every string overlaps with every other string. For example:
["ab", "bc", "cd"]
In this situation, the algorithm must correctly determine that combining all strings is impossible. The bitmask overlap check ensures that invalid combinations are skipped before recursion continues.
A third edge case involves the empty subsequence. If every string is invalid, such as:
["aa", "bb"]
the correct answer is 0, representing the empty concatenation. Since the recursion starts with an empty mask and only updates the answer using valid combinations, the implementation naturally handles this case correctly.
Another subtle edge case occurs when the optimal solution is not formed by the longest individual string. For example:
["abc", "de", "fgh"]
The best answer is "defgh" with length 5, not "abc" with length 3. The backtracking approach explores all valid combinations, ensuring that locally optimal choices do not incorrectly terminate the search.