LeetCode 1239 - Maximum Length of a Concatenated String with Unique Characters

This problem asks us to select a subsequence of strings from the input array arr and concatenate them together such that the resulting string contains only unique characters. Among all valid concatenations, we must return the maximum possible length.

LeetCode Problem 1239

Difficulty: 🟡 Medium
Topics: Array, String, Backtracking, Bit Manipulation

Solution

Problem Understanding

This problem asks us to select a subsequence of strings from the input array arr and concatenate them together such that the resulting string contains only unique characters. Among all valid concatenations, we must return the maximum possible length.

A subsequence means we can either include or skip each string while preserving the original order. We are not allowed to rearrange strings or characters. The important condition is that every character in the final concatenated string must appear exactly once.

For example, if we concatenate "un" and "iq", we get "uniq", which contains the characters u, n, i, and q, all unique. However, concatenating "ab" and "bc" would produce "abbc", which repeats the character b, making it invalid.

The input size is relatively small:

  • arr.length <= 16
  • Each string length is at most 26
  • Only lowercase English letters are used

These constraints strongly suggest that exponential exploration is acceptable because there are at most 2^16 = 65,536 subsequences. That is small enough for a carefully implemented backtracking or bitmasking solution.

Several edge cases are important:

  • A string may already contain duplicate characters internally, such as "aa". Such strings can never be part of a valid answer.
  • Multiple strings may overlap in characters, preventing them from being combined.
  • The optimal answer may consist of only one string.
  • The empty subsequence is always valid and has length 0.
  • Since only lowercase English letters exist, we can efficiently represent character sets using a 26-bit integer mask.

Approaches

Brute Force Approach

The most straightforward solution is to generate every possible subsequence of arr. For each subsequence, we concatenate all selected strings and check whether the final result contains unique characters.

The brute force process works as follows:

  1. Generate all 2^n subsequences.
  2. Concatenate the selected strings.
  3. Check whether the resulting string has duplicate characters.
  4. Track the maximum valid length.

This approach is correct because it examines every possible combination, guaranteeing that the optimal answer will be found.

However, repeatedly concatenating strings and checking uniqueness is inefficient. Every validation step may require scanning many characters repeatedly. While the constraints are small enough that brute force could pass, it performs unnecessary work.

Key Insight

The main optimization comes from representing character sets with bitmasks.

Since there are only 26 lowercase letters, we can use a 26-bit integer where each bit represents whether a character exists in the current concatenation.

For example:

  • Bit 0 represents 'a'
  • Bit 1 represents 'b'
  • Bit 25 represents 'z'

This gives two major benefits:

  1. We can quickly determine whether a string contains duplicate characters internally.
  2. We can check whether two strings overlap using a single bitwise AND operation.

If two masks have no overlapping bits:

mask1 & mask2 == 0

then the strings can be safely combined.

This transforms repeated character scanning into constant-time bit operations, making backtracking much more efficient.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^n × n × 26) O(n × 26) Generates every subsequence and repeatedly validates strings
Optimal O(2^n) O(2^n) Uses bitmask backtracking for fast overlap checks

Algorithm Walkthrough

Optimal Backtracking with Bitmasking

  1. First, preprocess every string in arr into a bitmask.

For each character in a string, compute its corresponding bit position and set that bit in the mask.

If we encounter a character that is already set in the same string, then the string itself contains duplicates and can never be used. We discard it immediately. 2. Store all valid string masks in a list.

Each mask represents the set of characters used by that string. 3. Start backtracking through the list of masks.

At each position, we have two choices:

  • Skip the current string
  • Include the current string if it does not overlap with the existing characters
  1. Maintain a current_mask representing all characters currently used.

To check whether a new string can be added:

current_mask & next_mask == 0

If this condition is true, the strings do not share characters.

  1. When adding a string, combine masks using bitwise OR:
new_mask = current_mask | next_mask
  1. At every recursive call, update the maximum answer using the number of set bits in the current mask.
  2. Continue exploring all valid combinations until all masks are processed.

Why it works

The algorithm explores every valid subsequence of strings. The bitmask representation guarantees that no duplicate characters exist in the current concatenation because overlapping masks are never combined.

Since every valid combination is considered exactly once and invalid combinations are pruned immediately, the algorithm always finds the maximum possible length.

Python Solution

from typing import List

class Solution:
    def maxLength(self, arr: List[str]) -> int:
        valid_masks = []

        # Convert each string into a bitmask
        for word in arr:
            mask = 0
            valid = True

            for ch in word:
                bit = 1 << (ord(ch) - ord('a'))

                # Duplicate character inside the same word
                if mask & bit:
                    valid = False
                    break

                mask |= bit

            if valid:
                valid_masks.append(mask)

        self.answer = 0

        def backtrack(index: int, current_mask: int) -> None:
            self.answer = max(self.answer, current_mask.bit_count())

            for i in range(index, len(valid_masks)):
                next_mask = valid_masks[i]

                # Skip if characters overlap
                if current_mask & next_mask:
                    continue

                backtrack(i + 1, current_mask | next_mask)

        backtrack(0, 0)

        return self.answer

The implementation begins by preprocessing each string into a bitmask. During this step, strings containing duplicate characters are discarded because they can never contribute to a valid solution.

The valid_masks list stores only usable strings.

The recursive backtrack function explores all valid subsequences. The current_mask parameter tracks which characters are already used.

At every recursive call, we update the answer using:

current_mask.bit_count()

This efficiently counts the number of unique characters currently selected.

Before adding a new string, we check for overlap:

if current_mask & next_mask:

If overlap exists, adding that string would violate the uniqueness condition, so we skip it.

Otherwise, we combine masks using bitwise OR and continue recursion.

Go Solution

func maxLength(arr []string) int {
	validMasks := []int{}

	// Convert strings into bitmasks
	for _, word := range arr {
		mask := 0
		valid := true

		for _, ch := range word {
			bit := 1 << (ch - 'a')

			// Duplicate character inside the same word
			if mask&bit != 0 {
				valid = false
				break
			}

			mask |= bit
		}

		if valid {
			validMasks = append(validMasks, mask)
		}
	}

	answer := 0

	var backtrack func(index int, currentMask int)

	backtrack = func(index int, currentMask int) {
		length := 0

		// Count set bits
		temp := currentMask
		for temp > 0 {
			length += temp & 1
			temp >>= 1
		}

		if length > answer {
			answer = length
		}

		for i := index; i < len(validMasks); i++ {
			nextMask := validMasks[i]

			// Skip overlapping masks
			if currentMask&nextMask != 0 {
				continue
			}

			backtrack(i+1, currentMask|nextMask)
		}
	}

	backtrack(0, 0)

	return answer
}

The Go implementation follows the same overall strategy as the Python version. Since Go does not provide a built-in bit_count() method for integers in the same way Python does, we manually count set bits using a loop.

Slices are used to store valid masks, and recursion is implemented using a closure assigned to backtrack.

No special handling for nil versus empty slices is required because iterating over an empty slice is naturally safe in Go.

Worked Examples

Example 1

arr = ["un","iq","ue"]

Step 1: Convert to Bitmasks

String Characters Bitmask
"un" u, n valid
"iq" i, q valid
"ue" u, e valid

Step 2: Backtracking

Current Combination Current Mask Characters Length
"" {} 0
"un" {u,n} 2
"uniq" {u,n,i,q} 4
"ue" {u,e} 2
"iq" {i,q} 2
"ique" {i,q,u,e} 4

Maximum length becomes 4.

Example 2

arr = ["cha","r","act","ers"]

Valid Combinations

Combination Valid Length
"cha" Yes 3
"char" Yes 4
"chaers" Yes 6
"acters" Yes 6
"chaact" No, repeated a/c Invalid

Maximum length is 6.

Example 3

arr = ["abcdefghijklmnopqrstuvwxyz"]

Processing

The single string already contains all 26 unique letters.

Combination Length
"abcdefghijklmnopqrstuvwxyz" 26

Answer is 26.

Complexity Analysis

Measure Complexity Explanation
Time O(2^n) Each valid subset is explored once
Space O(2^n) Recursive exploration and stored masks

The preprocessing step takes at most O(n × 26) because each string length is at most 26.

The dominant factor is the recursive exploration of subsets. Since each string can either be included or excluded, the number of combinations is bounded by 2^n.

Because n <= 16, this is efficient enough.

Test Cases

from typing import List

def test():
    solution = Solution()

    assert solution.maxLength(["un", "iq", "ue"]) == 4
    # Basic example with multiple valid combinations

    assert solution.maxLength(["cha", "r", "act", "ers"]) == 6
    # Requires choosing the optimal subset

    assert solution.maxLength(["abcdefghijklmnopqrstuvwxyz"]) == 26
    # Entire alphabet in one string

    assert solution.maxLength(["aa", "bb"]) == 0
    # All strings invalid due to internal duplicates

    assert solution.maxLength(["ab", "cd", "ef"]) == 6
    # All strings can be combined

    assert solution.maxLength(["ab", "bc", "cd"]) == 4
    # Overlapping characters prevent full combination

    assert solution.maxLength(["a", "abc", "d", "de", "def"]) == 6
    # Multiple branching possibilities

    assert solution.maxLength([""]) == 0
    # Empty string only

    assert solution.maxLength(["a"]) == 1
    # Single-character string

    assert solution.maxLength(["abc", "ade", "xyz"]) == 6
    # Must avoid overlapping characters

    assert solution.maxLength(["jnfbyktlrqumowxd", "mvhgcpxnjzrdei"]) == 16
    # Large overlapping strings

test()
Test Why
["un","iq","ue"] Basic example with branching
["cha","r","act","ers"] Optimal subset selection
["abcdefghijklmnopqrstuvwxyz"] Maximum possible answer
["aa","bb"] Invalid strings with duplicates
["ab","cd","ef"] All strings combine successfully
["ab","bc","cd"] Overlapping character conflicts
["a","abc","d","de","def"] Complex recursion paths
[""] Empty string handling
["a"] Minimum non-empty input
["abc","ade","xyz"] Partial overlap behavior
Large overlapping strings Stress test for bitmask logic

Edge Cases

One important edge case occurs when a string contains duplicate characters internally, such as "aa" or "abca". A naive implementation might still attempt to use these strings during recursion, creating invalid concatenations. The preprocessing phase eliminates such strings immediately by detecting repeated bits while building the mask.

Another important case happens when every string overlaps with every other string. For example:

["ab", "bc", "cd"]

In this situation, the algorithm must correctly determine that combining all strings is impossible. The bitmask overlap check ensures that invalid combinations are skipped before recursion continues.

A third edge case involves the empty subsequence. If every string is invalid, such as:

["aa", "bb"]

the correct answer is 0, representing the empty concatenation. Since the recursion starts with an empty mask and only updates the answer using valid combinations, the implementation naturally handles this case correctly.

Another subtle edge case occurs when the optimal solution is not formed by the longest individual string. For example:

["abc", "de", "fgh"]

The best answer is "defgh" with length 5, not "abc" with length 3. The backtracking approach explores all valid combinations, ensuring that locally optimal choices do not incorrectly terminate the search.