LeetCode 249 - Group Shifted Strings
The problem asks us to group strings that belong to the same cyclic shifting sequence. Two strings belong to the same group if one can be transformed into the other by repeatedly shifting every character forward or backward in the alphabet, with wraparound between 'z' and 'a'.
Difficulty: š” Medium
Topics: Array, Hash Table, String
Solution
Problem Understanding
The problem asks us to group strings that belong to the same cyclic shifting sequence. Two strings belong to the same group if one can be transformed into the other by repeatedly shifting every character forward or backward in the alphabet, with wraparound between 'z' and 'a'.
For example, "abc" becomes "bcd" after one right shift, and "xyz" also becomes "yza" after one right shift. Because shifting preserves the relative distance between characters, "abc", "bcd", and "xyz" all belong to the same sequence.
The input is an array of lowercase strings. The output should be a list of groups, where every string inside a group belongs to the same shifting sequence. The order of the groups does not matter, and the order of strings within each group also does not matter.
The constraints are relatively small:
- At most 200 strings
- Each string has length at most 50
These limits mean even moderately expensive operations are acceptable, but we still want a clean and scalable solution.
The most important observation is that shifting preserves the pattern of differences between adjacent characters. For example:
"abc"has differences[1, 1]"bcd"also has differences[1, 1]"xyz"also has differences[1, 1]
Therefore, all three belong to the same group.
There are several important edge cases:
- Single-character strings like
"a"and"z"should always belong to the same group because any single character can shift into any other single character. - Wraparound cases such as
"az"and"ba"are valid matches because shifting'z'forward produces'a'. - Strings of different lengths can never belong to the same group because shifting does not change length.
- A naive implementation can easily fail on wraparound arithmetic if modulo handling is incorrect.
Approaches
Brute Force Approach
A brute-force solution would compare every pair of strings and determine whether they belong to the same shifting sequence.
To compare two strings, we would:
- Check whether they have the same length.
- Compute the character difference between corresponding letters.
- Verify that every position has the same cyclic difference.
For example:
"abc"to"bcd"gives shifts[1,1,1]"abc"to"xyz"gives shifts[23,23,23]
If the shift amount is consistent across all characters, the strings belong to the same group.
We could repeatedly scan all strings and merge compatible ones together.
This approach is correct because it explicitly checks the defining property of shifted strings. However, it becomes inefficient because every string may need to be compared with every other string.
Key Insight for the Optimal Approach
The important observation is that strings in the same shifting sequence share the same normalized difference pattern.
Instead of comparing strings pairwise, we can compute a canonical representation for each string.
For example:
"abc"ā differences(1,1)"bcd"ā differences(1,1)"xyz"ā differences(1,1)
Similarly:
"az"ā(25)"ba"ā(25)
We can use this difference pattern as a hash map key. Every string with the same key belongs in the same group.
This converts the problem from repeated comparisons into a straightforward grouping problem using a hash table.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n² à m) | O(n) | Compares every pair of strings |
| Optimal | O(n Ć m) | O(n Ć m) | Uses normalized difference pattern as hash key |
Here:
nis the number of stringsmis the maximum string length
Algorithm Walkthrough
Optimal Algorithm
- Create a hash map where:
- The key is a normalized shift pattern
- The value is a list of strings belonging to that pattern
- For each string, compute its signature.
The signature is formed by calculating the cyclic difference between consecutive characters.
For example:
-
"abc" -
'b' - 'a' = 1 -
'c' - 'b' = 1 -
Signature:
(1,1) -
"xyz" -
'y' - 'x' = 1 -
'z' - 'y' = 1 -
Signature:
(1,1)
- Use modulo arithmetic to handle wraparound cases.
For example:
-
"az" -
'z' - 'a' = 25 -
Signature:
(25) -
"ba" -
'a' - 'b' = -1 -
(-1 + 26) % 26 = 25 -
Signature:
(25)
- Insert the string into the hash map bucket corresponding to its signature.
- After processing all strings, return all hash map values.
Why it works
The algorithm works because shifting every character in a string by the same amount preserves the relative differences between adjacent characters. Therefore, all strings in the same shifting sequence produce identical difference signatures. Conversely, strings with different signatures cannot belong to the same shifting sequence.
Python Solution
from collections import defaultdict
from typing import List, Tuple
class Solution:
def groupStrings(self, strings: List[str]) -> List[List[str]]:
def get_signature(s: str) -> Tuple[int, ...]:
signature = []
for i in range(1, len(s)):
diff = (ord(s[i]) - ord(s[i - 1])) % 26
signature.append(diff)
return tuple(signature)
groups = defaultdict(list)
for s in strings:
key = get_signature(s)
groups[key].append(s)
return list(groups.values())
The implementation begins by defining a helper function called get_signature. This function computes the normalized difference pattern for a string.
For each adjacent pair of characters, we calculate the cyclic difference using modulo 26 arithmetic. Using modulo ensures wraparound cases work correctly. For example, moving from 'z' to 'a' correctly produces a difference of 1.
The signature is stored as a tuple because tuples are immutable and can be used as dictionary keys.
We then create a defaultdict(list) to group strings sharing the same signature.
As we iterate through every string:
- Compute its signature
- Insert the string into the corresponding hash map bucket
Finally, we return all grouped values from the dictionary.
Single-character strings naturally produce an empty tuple (), which correctly groups all one-letter strings together.
Go Solution
package main
func groupStrings(strings []string) [][]string {
groups := make(map[string][]string)
for _, s := range strings {
key := getSignature(s)
groups[key] = append(groups[key], s)
}
result := [][]string{}
for _, group := range groups {
result = append(result, group)
}
return result
}
func getSignature(s string) string {
if len(s) == 1 {
return ""
}
signature := make([]byte, 0)
for i := 1; i < len(s); i++ {
diff := (int(s[i]) - int(s[i-1]) + 26) % 26
signature = append(signature, byte(diff)+'a')
}
return string(signature)
}
The Go solution follows the same overall strategy as the Python version but differs slightly in implementation details.
Since Go slices cannot directly serve as map keys, we convert the signature into a string representation. Each cyclic difference is encoded as a character.
For example:
- Difference
0becomes'a' - Difference
1becomes'b'
This creates a compact and hashable signature string.
Go maps automatically return zero values for missing keys, so appending to a nil slice works naturally without extra initialization.
There are no integer overflow concerns because all arithmetic stays within small alphabet-size bounds.
Worked Examples
Example 1
Input:
["abc","bcd","acef","xyz","az","ba","a","z"]
We compute signatures one string at a time.
| String | Differences | Signature | Group State |
|---|---|---|---|
"abc" |
(1,1) |
(1,1) |
{(1,1): ["abc"]} |
"bcd" |
(1,1) |
(1,1) |
{(1,1): ["abc","bcd"]} |
"acef" |
(2,2,1) |
(2,2,1) |
new group |
"xyz" |
(1,1) |
(1,1) |
{(1,1): ["abc","bcd","xyz"]} |
"az" |
(25) |
(25) |
new group |
"ba" |
(25) |
(25) |
{(25): ["az","ba"]} |
"a" |
() |
() |
new group |
"z" |
() |
() |
{(): ["a","z"]} |
Final grouping:
[
["abc","bcd","xyz"],
["acef"],
["az","ba"],
["a","z"]
]
Example 2
Input:
["a"]
Processing:
| String | Signature | Group |
|---|---|---|
"a" |
() |
{(): ["a"]} |
Output:
[["a"]]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n Ć m) | Each string is scanned once |
| Space | O(n Ć m) | Hash map stores all strings and signatures |
The algorithm processes every character of every string exactly once while building signatures. Hash map insertions are average-case O(1), so the total running time is linear in the total input size.
The space complexity comes from storing signatures and grouped strings in the dictionary.
Test Cases
from typing import List
class Solution:
def groupStrings(self, strings: List[str]) -> List[List[str]]:
from collections import defaultdict
def get_signature(s):
signature = []
for i in range(1, len(s)):
diff = (ord(s[i]) - ord(s[i - 1])) % 26
signature.append(diff)
return tuple(signature)
groups = defaultdict(list)
for s in strings:
groups[get_signature(s)].append(s)
return list(groups.values())
def normalize(result):
return sorted([sorted(group) for group in result])
sol = Solution()
# Provided example
assert normalize(
sol.groupStrings(["abc","bcd","acef","xyz","az","ba","a","z"])
) == normalize([
["abc","bcd","xyz"],
["acef"],
["az","ba"],
["a","z"]
]) # standard mixed example
# Single string
assert normalize(
sol.groupStrings(["a"])
) == normalize([["a"]]) # minimum input
# Wraparound shift
assert normalize(
sol.groupStrings(["az","ba"])
) == normalize([["az","ba"]]) # verifies modulo arithmetic
# Multiple independent groups
assert normalize(
sol.groupStrings(["abc","def","ghi","az","yx"])
) == normalize([
["abc","def","ghi"],
["az","yx"]
]) # multiple valid groups
# All single-character strings
assert normalize(
sol.groupStrings(["a","b","c","z"])
) == normalize([
["a","b","c","z"]
]) # all single letters belong together
# Different lengths
assert normalize(
sol.groupStrings(["ab","abc","de","fgh"])
) == normalize([
["ab","de"],
["abc","fgh"]
]) # length-sensitive grouping
# Duplicate strings
assert normalize(
sol.groupStrings(["abc","abc","bcd"])
) == normalize([
["abc","abc","bcd"]
]) # duplicate handling
# No matching groups
assert normalize(
sol.groupStrings(["ab","cd","ace"])
) == normalize([
["ab","cd"],
["ace"]
]) # isolated string
| Test | Why |
|---|---|
| Standard mixed example | Verifies normal grouping behavior |
| Single string | Tests minimum input size |
| Wraparound shift | Ensures modulo arithmetic works |
| Multiple independent groups | Confirms separate buckets form correctly |
| All single-character strings | Validates empty-signature handling |
| Different lengths | Ensures incompatible lengths are separated |
| Duplicate strings | Confirms duplicates remain grouped |
| No matching groups | Tests isolated patterns |
Edge Cases
Single-character strings
A single-character string has no adjacent character differences, so its signature becomes an empty tuple. This means all one-letter strings belong to the same group.
This case can easily cause bugs if the implementation assumes every string has at least one difference value. Our implementation handles it naturally because the loop over adjacent characters simply never runs.
Alphabet wraparound
Strings like "az" and "ba" are equivalent because shifting "az" forward once produces "ba".
A naive subtraction would produce -1 for 'a' - 'b', which would incorrectly separate these strings. The implementation fixes this by using modulo arithmetic:
(ord(current) - ord(previous)) % 26
This converts negative differences into their correct cyclic representation.
Strings with different lengths
Strings of different lengths can never belong to the same shifting sequence because shifting preserves string length.
For example:
"ab"and"abc"cannot match
Our signature construction inherently handles this because signatures contain one fewer element than the string length. Different lengths therefore produce signatures of different sizes.