LeetCode 249 - Group Shifted Strings

The problem asks us to group strings that belong to the same cyclic shifting sequence. Two strings belong to the same group if one can be transformed into the other by repeatedly shifting every character forward or backward in the alphabet, with wraparound between 'z' and 'a'.

LeetCode Problem 249

Difficulty: 🟔 Medium
Topics: Array, Hash Table, String

Solution

Problem Understanding

The problem asks us to group strings that belong to the same cyclic shifting sequence. Two strings belong to the same group if one can be transformed into the other by repeatedly shifting every character forward or backward in the alphabet, with wraparound between 'z' and 'a'.

For example, "abc" becomes "bcd" after one right shift, and "xyz" also becomes "yza" after one right shift. Because shifting preserves the relative distance between characters, "abc", "bcd", and "xyz" all belong to the same sequence.

The input is an array of lowercase strings. The output should be a list of groups, where every string inside a group belongs to the same shifting sequence. The order of the groups does not matter, and the order of strings within each group also does not matter.

The constraints are relatively small:

  • At most 200 strings
  • Each string has length at most 50

These limits mean even moderately expensive operations are acceptable, but we still want a clean and scalable solution.

The most important observation is that shifting preserves the pattern of differences between adjacent characters. For example:

  • "abc" has differences [1, 1]
  • "bcd" also has differences [1, 1]
  • "xyz" also has differences [1, 1]

Therefore, all three belong to the same group.

There are several important edge cases:

  • Single-character strings like "a" and "z" should always belong to the same group because any single character can shift into any other single character.
  • Wraparound cases such as "az" and "ba" are valid matches because shifting 'z' forward produces 'a'.
  • Strings of different lengths can never belong to the same group because shifting does not change length.
  • A naive implementation can easily fail on wraparound arithmetic if modulo handling is incorrect.

Approaches

Brute Force Approach

A brute-force solution would compare every pair of strings and determine whether they belong to the same shifting sequence.

To compare two strings, we would:

  1. Check whether they have the same length.
  2. Compute the character difference between corresponding letters.
  3. Verify that every position has the same cyclic difference.

For example:

  • "abc" to "bcd" gives shifts [1,1,1]
  • "abc" to "xyz" gives shifts [23,23,23]

If the shift amount is consistent across all characters, the strings belong to the same group.

We could repeatedly scan all strings and merge compatible ones together.

This approach is correct because it explicitly checks the defining property of shifted strings. However, it becomes inefficient because every string may need to be compared with every other string.

Key Insight for the Optimal Approach

The important observation is that strings in the same shifting sequence share the same normalized difference pattern.

Instead of comparing strings pairwise, we can compute a canonical representation for each string.

For example:

  • "abc" → differences (1,1)
  • "bcd" → differences (1,1)
  • "xyz" → differences (1,1)

Similarly:

  • "az" → (25)
  • "ba" → (25)

We can use this difference pattern as a hash map key. Every string with the same key belongs in the same group.

This converts the problem from repeated comparisons into a straightforward grouping problem using a hash table.

Approach Time Complexity Space Complexity Notes
Brute Force O(n² Ɨ m) O(n) Compares every pair of strings
Optimal O(n Ɨ m) O(n Ɨ m) Uses normalized difference pattern as hash key

Here:

  • n is the number of strings
  • m is the maximum string length

Algorithm Walkthrough

Optimal Algorithm

  1. Create a hash map where:
  • The key is a normalized shift pattern
  • The value is a list of strings belonging to that pattern
  1. For each string, compute its signature.

The signature is formed by calculating the cyclic difference between consecutive characters.

For example:

  • "abc"

  • 'b' - 'a' = 1

  • 'c' - 'b' = 1

  • Signature: (1,1)

  • "xyz"

  • 'y' - 'x' = 1

  • 'z' - 'y' = 1

  • Signature: (1,1)

  1. Use modulo arithmetic to handle wraparound cases.

For example:

  • "az"

  • 'z' - 'a' = 25

  • Signature: (25)

  • "ba"

  • 'a' - 'b' = -1

  • (-1 + 26) % 26 = 25

  • Signature: (25)

  1. Insert the string into the hash map bucket corresponding to its signature.
  2. After processing all strings, return all hash map values.

Why it works

The algorithm works because shifting every character in a string by the same amount preserves the relative differences between adjacent characters. Therefore, all strings in the same shifting sequence produce identical difference signatures. Conversely, strings with different signatures cannot belong to the same shifting sequence.

Python Solution

from collections import defaultdict
from typing import List, Tuple

class Solution:
    def groupStrings(self, strings: List[str]) -> List[List[str]]:
        
        def get_signature(s: str) -> Tuple[int, ...]:
            signature = []
            
            for i in range(1, len(s)):
                diff = (ord(s[i]) - ord(s[i - 1])) % 26
                signature.append(diff)
            
            return tuple(signature)
        
        groups = defaultdict(list)
        
        for s in strings:
            key = get_signature(s)
            groups[key].append(s)
        
        return list(groups.values())

The implementation begins by defining a helper function called get_signature. This function computes the normalized difference pattern for a string.

For each adjacent pair of characters, we calculate the cyclic difference using modulo 26 arithmetic. Using modulo ensures wraparound cases work correctly. For example, moving from 'z' to 'a' correctly produces a difference of 1.

The signature is stored as a tuple because tuples are immutable and can be used as dictionary keys.

We then create a defaultdict(list) to group strings sharing the same signature.

As we iterate through every string:

  • Compute its signature
  • Insert the string into the corresponding hash map bucket

Finally, we return all grouped values from the dictionary.

Single-character strings naturally produce an empty tuple (), which correctly groups all one-letter strings together.

Go Solution

package main

func groupStrings(strings []string) [][]string {
    
    groups := make(map[string][]string)

    for _, s := range strings {
        key := getSignature(s)
        groups[key] = append(groups[key], s)
    }

    result := [][]string{}

    for _, group := range groups {
        result = append(result, group)
    }

    return result
}

func getSignature(s string) string {
    if len(s) == 1 {
        return ""
    }

    signature := make([]byte, 0)

    for i := 1; i < len(s); i++ {
        diff := (int(s[i]) - int(s[i-1]) + 26) % 26

        signature = append(signature, byte(diff)+'a')
    }

    return string(signature)
}

The Go solution follows the same overall strategy as the Python version but differs slightly in implementation details.

Since Go slices cannot directly serve as map keys, we convert the signature into a string representation. Each cyclic difference is encoded as a character.

For example:

  • Difference 0 becomes 'a'
  • Difference 1 becomes 'b'

This creates a compact and hashable signature string.

Go maps automatically return zero values for missing keys, so appending to a nil slice works naturally without extra initialization.

There are no integer overflow concerns because all arithmetic stays within small alphabet-size bounds.

Worked Examples

Example 1

Input:

["abc","bcd","acef","xyz","az","ba","a","z"]

We compute signatures one string at a time.

String Differences Signature Group State
"abc" (1,1) (1,1) {(1,1): ["abc"]}
"bcd" (1,1) (1,1) {(1,1): ["abc","bcd"]}
"acef" (2,2,1) (2,2,1) new group
"xyz" (1,1) (1,1) {(1,1): ["abc","bcd","xyz"]}
"az" (25) (25) new group
"ba" (25) (25) {(25): ["az","ba"]}
"a" () () new group
"z" () () {(): ["a","z"]}

Final grouping:

[
  ["abc","bcd","xyz"],
  ["acef"],
  ["az","ba"],
  ["a","z"]
]

Example 2

Input:

["a"]

Processing:

String Signature Group
"a" () {(): ["a"]}

Output:

[["a"]]

Complexity Analysis

Measure Complexity Explanation
Time O(n Ɨ m) Each string is scanned once
Space O(n Ɨ m) Hash map stores all strings and signatures

The algorithm processes every character of every string exactly once while building signatures. Hash map insertions are average-case O(1), so the total running time is linear in the total input size.

The space complexity comes from storing signatures and grouped strings in the dictionary.

Test Cases

from typing import List

class Solution:
    def groupStrings(self, strings: List[str]) -> List[List[str]]:
        from collections import defaultdict

        def get_signature(s):
            signature = []

            for i in range(1, len(s)):
                diff = (ord(s[i]) - ord(s[i - 1])) % 26
                signature.append(diff)

            return tuple(signature)

        groups = defaultdict(list)

        for s in strings:
            groups[get_signature(s)].append(s)

        return list(groups.values())

def normalize(result):
    return sorted([sorted(group) for group in result])

sol = Solution()

# Provided example
assert normalize(
    sol.groupStrings(["abc","bcd","acef","xyz","az","ba","a","z"])
) == normalize([
    ["abc","bcd","xyz"],
    ["acef"],
    ["az","ba"],
    ["a","z"]
])  # standard mixed example

# Single string
assert normalize(
    sol.groupStrings(["a"])
) == normalize([["a"]])  # minimum input

# Wraparound shift
assert normalize(
    sol.groupStrings(["az","ba"])
) == normalize([["az","ba"]])  # verifies modulo arithmetic

# Multiple independent groups
assert normalize(
    sol.groupStrings(["abc","def","ghi","az","yx"])
) == normalize([
    ["abc","def","ghi"],
    ["az","yx"]
])  # multiple valid groups

# All single-character strings
assert normalize(
    sol.groupStrings(["a","b","c","z"])
) == normalize([
    ["a","b","c","z"]
])  # all single letters belong together

# Different lengths
assert normalize(
    sol.groupStrings(["ab","abc","de","fgh"])
) == normalize([
    ["ab","de"],
    ["abc","fgh"]
])  # length-sensitive grouping

# Duplicate strings
assert normalize(
    sol.groupStrings(["abc","abc","bcd"])
) == normalize([
    ["abc","abc","bcd"]
])  # duplicate handling

# No matching groups
assert normalize(
    sol.groupStrings(["ab","cd","ace"])
) == normalize([
    ["ab","cd"],
    ["ace"]
])  # isolated string
Test Why
Standard mixed example Verifies normal grouping behavior
Single string Tests minimum input size
Wraparound shift Ensures modulo arithmetic works
Multiple independent groups Confirms separate buckets form correctly
All single-character strings Validates empty-signature handling
Different lengths Ensures incompatible lengths are separated
Duplicate strings Confirms duplicates remain grouped
No matching groups Tests isolated patterns

Edge Cases

Single-character strings

A single-character string has no adjacent character differences, so its signature becomes an empty tuple. This means all one-letter strings belong to the same group.

This case can easily cause bugs if the implementation assumes every string has at least one difference value. Our implementation handles it naturally because the loop over adjacent characters simply never runs.

Alphabet wraparound

Strings like "az" and "ba" are equivalent because shifting "az" forward once produces "ba".

A naive subtraction would produce -1 for 'a' - 'b', which would incorrectly separate these strings. The implementation fixes this by using modulo arithmetic:

(ord(current) - ord(previous)) % 26

This converts negative differences into their correct cyclic representation.

Strings with different lengths

Strings of different lengths can never belong to the same shifting sequence because shifting preserves string length.

For example:

  • "ab" and "abc" cannot match

Our signature construction inherently handles this because signatures contain one fewer element than the string length. Different lengths therefore produce signatures of different sizes.