LeetCode 1657 - Determine if Two Strings Are Close
The problem asks whether two strings, word1 and word2, are "close" according to two allowed operations. Operation 1 allows swapping any two existing characters in the string, which means the relative order of characters is flexible.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Sorting, Counting
Solution
Problem Understanding
The problem asks whether two strings, word1 and word2, are "close" according to two allowed operations. Operation 1 allows swapping any two existing characters in the string, which means the relative order of characters is flexible. Operation 2 allows transforming all occurrences of one character into another existing character, and vice versa, which effectively lets you reassign frequencies between characters as long as both characters exist in the string.
The input consists of two strings containing only lowercase English letters, with lengths up to 10^5. The expected output is a boolean value: true if the strings can be transformed into one another using any number of these operations, and false otherwise.
The constraints imply that we cannot simulate all possible sequences of operations, as that would be too slow for long strings. Important edge cases include strings of different lengths, strings with mismatched sets of characters, or strings where characters exist with different frequencies that cannot be matched even after swapping.
The main insight is that the order of characters does not matter due to Operation 1, so only the set of characters and their frequency distributions matter. Operation 2 allows us to permute frequencies between characters, so the frequency counts themselves only need to match as multisets.
Approaches
A brute-force approach would attempt to simulate all possible sequences of operations, swapping characters and transforming frequencies iteratively to check if the strings can become identical. This is theoretically correct, but impractical for input lengths up to 10^5 because the number of possible operations grows factorially with string length.
The optimal approach relies on two key observations: first, the strings must have the same set of unique characters; second, the sorted frequency counts of characters must match. Sorting the frequencies accounts for the fact that Operation 2 allows frequency reassignment between characters. This reduces the problem to simple set and frequency comparison, which is efficient.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n!) | O(n) | Simulate all operations, impractical for large strings |
| Optimal | O(n log n) | O(n) | Compare character sets and sorted frequency counts |
Algorithm Walkthrough
- Check length: If
word1andword2have different lengths, returnfalse. This is because neither operation can change the total number of characters. - Count character frequencies: Build a frequency map for each string. This lets us track how many times each character occurs.
- Compare character sets: Extract the set of unique characters from each string. If these sets differ, return
falsebecause we cannot introduce new characters using the allowed operations. - Compare frequency distributions: Sort the frequency values of each string and compare them. If they match, the strings can be transformed into each other using Operation 2; otherwise, return
false.
Why it works: Operation 1 allows reordering characters arbitrarily, and Operation 2 allows any permutation of frequencies between existing characters. Therefore, two strings are close if they have the same character set and the same multiset of frequencies. This guarantees that any sequence of operations can convert one string into the other.
Python Solution
from collections import Counter
class Solution:
def closeStrings(self, word1: str, word2: str) -> bool:
if len(word1) != len(word2):
return False
freq1 = Counter(word1)
freq2 = Counter(word2)
if set(freq1.keys()) != set(freq2.keys()):
return False
return sorted(freq1.values()) == sorted(freq2.values())
The Python implementation first checks if the strings are the same length. It then uses Counter to count character occurrences. The character sets are compared using set(freq1.keys()) to ensure the same characters exist. Finally, the sorted frequency lists are compared to verify that frequency distributions match, allowing for any swaps or transformations.
Go Solution
import "sort"
func closeStrings(word1 string, word2 string) bool {
if len(word1) != len(word2) {
return false
}
freq1 := make(map[rune]int)
freq2 := make(map[rune]int)
for _, ch := range word1 {
freq1[ch]++
}
for _, ch := range word2 {
freq2[ch]++
}
if len(freq1) != len(freq2) {
return false
}
keys1 := make([]rune, 0, len(freq1))
keys2 := make([]rune, 0, len(freq2))
for k := range freq1 {
keys1 = append(keys1, k)
}
for k := range freq2 {
keys2 = append(keys2, k)
}
keySet1 := make(map[rune]struct{})
keySet2 := make(map[rune]struct{})
for _, k := range keys1 {
keySet1[k] = struct{}{}
}
for _, k := range keys2 {
keySet2[k] = struct{}{}
}
for k := range keySet1 {
if _, ok := keySet2[k]; !ok {
return false
}
}
values1 := make([]int, 0, len(freq1))
values2 := make([]int, 0, len(freq2))
for _, v := range freq1 {
values1 = append(values1, v)
}
for _, v := range freq2 {
values2 = append(values2, v)
}
sort.Ints(values1)
sort.Ints(values2)
for i := range values1 {
if values1[i] != values2[i] {
return false
}
}
return true
}
In Go, the implementation uses maps to count character frequencies. Sets of keys are constructed using empty structs to compare character sets. Frequencies are extracted, sorted, and compared element by element. The Go version is more verbose due to lack of built-in Counter and set operations.
Worked Examples
Example 1: word1 = "abc", word2 = "bca"
| Step | word1 Counter | word2 Counter | Key Set Comparison | Sorted Values |
|---|---|---|---|---|
| Initial | {a:1, b:1, c:1} | {b:1, c:1, a:1} | Equal | [1,1,1] == [1,1,1] |
Result: true
Example 2: word1 = "a", word2 = "aa"
Lengths differ: return false.
Example 3: word1 = "cabbba", word2 = "abbccc"
| Step | word1 Counter | word2 Counter | Key Set Comparison | Sorted Values |
|---|---|---|---|---|
| Initial | {a:2,b:3,c:1} | {a:1,b:2,c:3} | Equal | [1,2,3] == [1,2,3] |
Result: true
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Counting frequencies is O(n), sorting frequency lists is O(k log k) where k <= 26, so overall dominated by O(n) practically, sorting negligible |
| Space | O(n) | Frequency maps store up to n entries, but practically limited to 26 letters |
Because the frequency maps only store lowercase letters, space usage is effectively O(1), but for formal analysis we consider input size.
Test Cases
# provided examples
assert Solution().closeStrings("abc", "bca") == True # rearrangement possible
assert Solution().closeStrings("a", "aa") == False # different lengths
assert Solution().closeStrings("cabbba", "abbccc") == True # frequency transformation
# edge cases
assert Solution().closeStrings("aabbcc", "ccbbaa") == True # reversed order
assert Solution().closeStrings("aabbcc", "aabbcd") == False # different character sets
assert Solution().closeStrings("aaa", "aaa") == True # identical strings
assert Solution().closeStrings("abc", "def") == False # completely different characters
assert Solution().closeStrings("a"*100000, "a"*100000) == True # large identical strings
| Test | Why |
|---|---|
| "abc", "bca" | Tests simple permutation |
| "a", "aa" | Tests length mismatch |
| "cabbba", "abbccc" | Tests frequency swapping |
| "aabbcc", "ccbbaa" | Tests reordering does not affect outcome |
| "aabbcc", "aabbcd" | Tests different character sets |
| "aaa", "aaa" | Tests identical strings |
| "abc", "def" | Tests completely disjoint characters |
| 100000 'a's | Tests performance on large inputs |
Edge Cases
One important edge case is when the two strings have different lengths. Operation 1 and 2 cannot change the total number of characters, so the algorithm immediately returns false. Another edge case is when the strings contain different sets of characters. Even if the frequencies match, if a character is missing from one string, it cannot be created or swapped in, which is why comparing the character sets is critical. A third edge case is when the