LeetCode 1554 - Strings Differ by One Character

This problem gives us an array of unique strings called dict, where every string has the same length. We must determine whether there exists at least one pair of strings that differ by exactly one character at the same position.

LeetCode Problem 1554

Difficulty: 🟡 Medium
Topics: Hash Table, String, Rolling Hash, Hash Function

Solution

Problem Understanding

This problem gives us an array of unique strings called dict, where every string has the same length. We must determine whether there exists at least one pair of strings that differ by exactly one character at the same position.

In other words, we are looking for two strings where:

  • Their lengths are equal, which is already guaranteed.
  • All characters match except one.
  • The differing character must occur at the same index.

For example:

  • "abcd" and "aacd" differ only at index 1.
  • "abcd" and "abyd" differ only at index 2.

The expected output is:

  • true if such a pair exists.
  • false otherwise.

The constraints are important:

  • The total number of characters across all strings is at most 10^5.
  • All strings are the same length.
  • Strings are unique.
  • Only lowercase English letters are used.

These constraints immediately tell us that a naive comparison of every pair may become too slow. If there are n strings of length m, comparing every pair costs O(n^2 * m), which is not feasible when n is large.

Several edge cases are important:

  • Very small inputs, such as one string only.
  • Strings differing in multiple positions.
  • Strings that become identical after removing one character at the same index.
  • Cases where many strings share large common prefixes or suffixes.
  • Inputs with maximum allowed size, where efficiency matters significantly.

The uniqueness guarantee simplifies the problem because we never need to worry about identical strings being mistakenly counted.

Approaches

Brute Force Approach

The most direct solution is to compare every pair of strings.

For each pair:

  1. Compare characters position by position.
  2. Count how many indices differ.
  3. If exactly one index differs, return true.

If we finish checking all pairs without finding such a match, return false.

This works because it explicitly verifies the condition for every possible pair. However, it is too slow for large inputs.

Suppose:

  • n is the number of strings
  • m is the string length

There are O(n^2) pairs, and each comparison costs O(m). Therefore, total complexity becomes O(n^2 * m).

With up to 10^5 total characters, this approach can easily time out.

Optimal Approach

The key observation is this:

If two strings differ by exactly one character at index i, then removing the character at index i from both strings produces the same result.

Example:

abcd  -> remove index 1 -> acd
aacd  -> remove index 1 -> acd

This transforms the problem into:

Does any generated "masked" string appear more than once?

We iterate through each character position:

  1. Remove the character at that position from every string.
  2. Store the resulting string in a hash set.
  3. If we ever generate the same masked string twice, then two original strings differed only at that position.

Hash sets provide O(1) average lookup time, making the solution efficient.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n^2 * m) O(1) Compares every pair directly
Optimal O(n * m^2) O(n) Uses masked strings and hashing

Algorithm Walkthrough

Optimal Algorithm

  1. Let word_length be the length of each string.

Since all strings have the same length, we only need to iterate through positions from 0 to word_length - 1. 2. For each character position i, create an empty hash set called seen.

This set stores all masked versions generated by removing the character at index i. 3. Iterate through every string in dict.

For each string:

  • Create a new string by removing the character at index i.
  • Example:
"abcd" with i = 2 becomes "abd"
  1. Check whether the masked string already exists in seen.
  • If it does, then another string produced the same masked result.
  • This means the two original strings differ only at index i.
  • Return true.
  1. Otherwise, insert the masked string into seen.
  2. Continue processing all strings for the current index.
  3. If no duplicates are found for any index, return false.

Why it works

Two strings differ by exactly one character at the same index if and only if removing that index from both strings produces identical remaining characters.

The algorithm systematically checks this property for every possible index. Because hash sets efficiently detect duplicates, any valid pair is discovered immediately.

Python Solution

from typing import List

class Solution:
    def differByOne(self, dict: List[str]) -> bool:
        word_length = len(dict[0])

        for index in range(word_length):
            seen = set()

            for word in dict:
                masked = word[:index] + word[index + 1:]

                if masked in seen:
                    return True

                seen.add(masked)

        return False

The implementation follows the algorithm directly.

First, we determine the common string length. Then, for every index position, we create a fresh hash set named seen.

For each word, we construct a masked version by excluding the current character. Python slicing makes this concise:

word[:index] + word[index + 1:]

If the masked string already exists in the set, then another word matched everywhere except at the removed index. We immediately return True.

Otherwise, we insert the masked string and continue.

If all positions are processed without duplicates, no valid pair exists, so the function returns False.

Go Solution

func differByOne(dict []string) bool {
    wordLength := len(dict[0])

    for index := 0; index < wordLength; index++ {
        seen := make(map[string]bool)

        for _, word := range dict {
            masked := word[:index] + word[index+1:]

            if seen[masked] {
                return true
            }

            seen[masked] = true
        }
    }

    return false
}

The Go implementation closely mirrors the Python version.

Instead of a Python set, Go uses map[string]bool for constant time membership checks.

String slicing in Go works similarly to Python:

word[:index] + word[index+1:]

Because all strings contain only lowercase English letters, byte indexing is safe and efficient.

No integer overflow concerns exist here because we only use indices and string operations.

Worked Examples

Example 1

Input: ["abcd","acbd","aacd"]

Step-by-step trace

We process each index separately.

Index = 0

Word Masked String Seen Before Action
abcd bcd No Add
acbd cbd No Add
aacd acd No Add

No duplicates found.

Index = 1

Word Masked String Seen Before Action
abcd acd No Add
acbd abd No Add
aacd acd Yes Return true

The algorithm returns true.

The matching pair is:

abcd
aacd

They differ only at index 1.

Example 2

Input: ["ab","cd","yz"]

Index = 0

Word Masked String
ab b
cd d
yz z

No duplicates.

Index = 1

Word Masked String
ab a
cd c
yz y

No duplicates.

Result:

false

Example 3

Input: ["abcd","cccc","abyd","abab"]

Index = 2

Word Masked String
abcd abd
cccc ccc
abyd abd

Duplicate "abd" appears.

Return true.

The matching pair is:

abcd
abyd

They differ only at index 2.

Complexity Analysis

Measure Complexity Explanation
Time O(n * m^2) There are m positions, and for each position we process n strings while creating masked strings of length m
Space O(n) The hash set stores up to n masked strings

The dominant cost comes from string creation. Removing one character creates a new string of length m - 1, which costs O(m) time. Since we do this for every string and every position, total complexity becomes O(n * m^2).

Given the problem constraint that total characters are at most 10^5, this solution is efficient enough in practice.

Test Cases

solution = Solution()

assert solution.differByOne(["abcd", "acbd", "aacd"]) is True  # basic positive case
assert solution.differByOne(["ab", "cd", "yz"]) is False  # no matching pair
assert solution.differByOne(["abcd", "cccc", "abyd", "abab"]) is True  # difference in middle

assert solution.differByOne(["a", "b"]) is True  # single-character strings
assert solution.differByOne(["ab", "ac"]) is True  # differ at last position
assert solution.differByOne(["ab", "cb"]) is True  # differ at first position

assert solution.differByOne(["abc", "def", "ghi"]) is False  # all completely different
assert solution.differByOne(["aaaa", "aaab", "aaba"]) is True  # multiple near matches
assert solution.differByOne(["xyz", "xzz", "yyy"]) is True  # one valid pair exists

assert solution.differByOne(["abcd"]) is False  # single string only

assert solution.differByOne([
    "abcde",
    "bbcde",
    "cbcde",
    "dbcde"
]) is True  # many strings sharing suffix

assert solution.differByOne([
    "abc",
    "abd",
    "acd"
]) is True  # overlapping relationships

assert solution.differByOne([
    "mnop",
    "qrst",
    "uvwx",
    "yzab"
]) is False  # no valid pair

Test Summary

Test Why
["abcd", "acbd", "aacd"] Standard positive example
["ab", "cd", "yz"] No valid pair exists
["abcd", "cccc", "abyd", "abab"] Match occurs in middle position
["a", "b"] Smallest possible string length
["ab", "ac"] Difference at last index
["ab", "cb"] Difference at first index
["abc", "def", "ghi"] Completely unrelated strings
["aaaa", "aaab", "aaba"] Multiple close strings
["xyz", "xzz", "yyy"] One valid pair among unrelated strings
["abcd"] Single-element input
Shared suffix example Tests repeated masked forms
["abc", "abd", "acd"] Multiple candidate pairs
Completely distinct 4-string case Ensures false negatives are avoided

Edge Cases

Single String Input

If the input contains only one string, there cannot possibly be a pair differing by one character. A careless implementation might assume at least two strings exist and accidentally produce incorrect results.

This implementation handles it naturally. The loops execute normally, but no duplicate masked strings can ever appear, so the function correctly returns False.

Single Character Strings

Inputs like:

["a", "b"]

are important because removing the only character produces an empty string.

Both strings generate:

""

at index 0, which correctly indicates they differ by exactly one character.

The implementation handles this correctly because empty strings are valid hash set keys.

Multiple Differences Between Strings

Consider:

["abcd", "wxyz"]

These strings differ in every position.

Removing any single character still produces different masked strings:

bcd vs xyz
acd vs wyz
abd vs wxz
abc vs wxy

Therefore, no duplicate appears in the hash set, and the algorithm correctly returns False.

Shared Prefixes or Suffixes

Inputs with many similar strings can sometimes expose hashing or masking mistakes.

Example:

["aaaa", "aaab", "aaba"]

The implementation correctly checks every index independently. Even though many masked strings are similar, duplicates only occur when the original strings differ at exactly one position.