LeetCode 1657 - Determine if Two Strings Are Close

Difficulty: 🟡 Medium
Topics: Hash Table, String, Sorting, Counting

Solution

Problem Understanding

The problem asks whether two strings, word1 and word2, are "close" according to two allowed operations. Operation 1 allows swapping any two existing characters in the string, which means the relative order of characters is flexible. Operation 2 allows transforming all occurrences of one character into another existing character, and vice versa, which effectively lets you reassign frequencies between characters as long as both characters exist in the string.

The input consists of two strings containing only lowercase English letters, with lengths up to 10^5. The expected output is a boolean value: true if the strings can be transformed into one another using any number of these operations, and false otherwise.

The constraints imply that we cannot simulate all possible sequences of operations, as that would be too slow for long strings. Important edge cases include strings of different lengths, strings with mismatched sets of characters, or strings where characters exist with different frequencies that cannot be matched even after swapping.

The main insight is that the order of characters does not matter due to Operation 1, so only the set of characters and their frequency distributions matter. Operation 2 allows us to permute frequencies between characters, so the frequency counts themselves only need to match as multisets.

Approaches

A brute-force approach would attempt to simulate all possible sequences of operations, swapping characters and transforming frequencies iteratively to check if the strings can become identical. This is theoretically correct, but impractical for input lengths up to 10^5 because the number of possible operations grows factorially with string length.

The optimal approach relies on two key observations: first, the strings must have the same set of unique characters; second, the sorted frequency counts of characters must match. Sorting the frequencies accounts for the fact that Operation 2 allows frequency reassignment between characters. This reduces the problem to simple set and frequency comparison, which is efficient.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n!)	O(n)	Simulate all operations, impractical for large strings
Optimal	O(n log n)	O(n)	Compare character sets and sorted frequency counts

Algorithm Walkthrough

Check length: If word1 and word2 have different lengths, return false. This is because neither operation can change the total number of characters.
Count character frequencies: Build a frequency map for each string. This lets us track how many times each character occurs.
Compare character sets: Extract the set of unique characters from each string. If these sets differ, return false because we cannot introduce new characters using the allowed operations.
Compare frequency distributions: Sort the frequency values of each string and compare them. If they match, the strings can be transformed into each other using Operation 2; otherwise, return false.

Why it works: Operation 1 allows reordering characters arbitrarily, and Operation 2 allows any permutation of frequencies between existing characters. Therefore, two strings are close if they have the same character set and the same multiset of frequencies. This guarantees that any sequence of operations can convert one string into the other.

Python Solution

from collections import Counter

class Solution:
    def closeStrings(self, word1: str, word2: str) -> bool:
        if len(word1) != len(word2):
            return False
        
        freq1 = Counter(word1)
        freq2 = Counter(word2)
        
        if set(freq1.keys()) != set(freq2.keys()):
            return False
        
        return sorted(freq1.values()) == sorted(freq2.values())

The Python implementation first checks if the strings are the same length. It then uses Counter to count character occurrences. The character sets are compared using set(freq1.keys()) to ensure the same characters exist. Finally, the sorted frequency lists are compared to verify that frequency distributions match, allowing for any swaps or transformations.

Go Solution

import "sort"

func closeStrings(word1 string, word2 string) bool {
    if len(word1) != len(word2) {
        return false
    }
    
    freq1 := make(map[rune]int)
    freq2 := make(map[rune]int)
    
    for _, ch := range word1 {
        freq1[ch]++
    }
    for _, ch := range word2 {
        freq2[ch]++
    }
    
    if len(freq1) != len(freq2) {
        return false
    }
    
    keys1 := make([]rune, 0, len(freq1))
    keys2 := make([]rune, 0, len(freq2))
    
    for k := range freq1 {
        keys1 = append(keys1, k)
    }
    for k := range freq2 {
        keys2 = append(keys2, k)
    }
    
    keySet1 := make(map[rune]struct{})
    keySet2 := make(map[rune]struct{})
    
    for _, k := range keys1 {
        keySet1[k] = struct{}{}
    }
    for _, k := range keys2 {
        keySet2[k] = struct{}{}
    }
    
    for k := range keySet1 {
        if _, ok := keySet2[k]; !ok {
            return false
        }
    }
    
    values1 := make([]int, 0, len(freq1))
    values2 := make([]int, 0, len(freq2))
    
    for _, v := range freq1 {
        values1 = append(values1, v)
    }
    for _, v := range freq2 {
        values2 = append(values2, v)
    }
    
    sort.Ints(values1)
    sort.Ints(values2)
    
    for i := range values1 {
        if values1[i] != values2[i] {
            return false
        }
    }
    
    return true
}

In Go, the implementation uses maps to count character frequencies. Sets of keys are constructed using empty structs to compare character sets. Frequencies are extracted, sorted, and compared element by element. The Go version is more verbose due to lack of built-in Counter and set operations.

Worked Examples

Example 1: word1 = "abc", word2 = "bca"

Step	word1 Counter	word2 Counter	Key Set Comparison	Sorted Values
Initial	{a:1, b:1, c:1}	{b:1, c:1, a:1}	Equal	[1,1,1] == [1,1,1]

Result: true

Example 2: word1 = "a", word2 = "aa"

Lengths differ: return false.

Example 3: word1 = "cabbba", word2 = "abbccc"

Step	word1 Counter	word2 Counter	Key Set Comparison	Sorted Values
Initial	{a:2,b:3,c:1}	{a:1,b:2,c:3}	Equal	[1,2,3] == [1,2,3]

Result: true

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Counting frequencies is O(n), sorting frequency lists is O(k log k) where k <= 26, so overall dominated by O(n) practically, sorting negligible
Space	O(n)	Frequency maps store up to n entries, but practically limited to 26 letters

Because the frequency maps only store lowercase letters, space usage is effectively O(1), but for formal analysis we consider input size.

Test Cases

# provided examples
assert Solution().closeStrings("abc", "bca") == True  # rearrangement possible
assert Solution().closeStrings("a", "aa") == False  # different lengths
assert Solution().closeStrings("cabbba", "abbccc") == True  # frequency transformation

# edge cases
assert Solution().closeStrings("aabbcc", "ccbbaa") == True  # reversed order
assert Solution().closeStrings("aabbcc", "aabbcd") == False  # different character sets
assert Solution().closeStrings("aaa", "aaa") == True  # identical strings
assert Solution().closeStrings("abc", "def") == False  # completely different characters
assert Solution().closeStrings("a"*100000, "a"*100000) == True  # large identical strings

Test	Why
"abc", "bca"	Tests simple permutation
"a", "aa"	Tests length mismatch
"cabbba", "abbccc"	Tests frequency swapping
"aabbcc", "ccbbaa"	Tests reordering does not affect outcome
"aabbcc", "aabbcd"	Tests different character sets
"aaa", "aaa"	Tests identical strings
"abc", "def"	Tests completely disjoint characters
100000 'a's	Tests performance on large inputs

Edge Cases

One important edge case is when the two strings have different lengths. Operation 1 and 2 cannot change the total number of characters, so the algorithm immediately returns false. Another edge case is when the strings contain different sets of characters. Even if the frequencies match, if a character is missing from one string, it cannot be created or swapped in, which is why comparing the character sets is critical. A third edge case is when the