LeetCode 1554 - Strings Differ by One Character
This problem gives us an array of unique strings called dict, where every string has the same length. We must determine whether there exists at least one pair of strings that differ by exactly one character at the same position.
Difficulty: 🟡 Medium
Topics: Hash Table, String, Rolling Hash, Hash Function
Solution
Problem Understanding
This problem gives us an array of unique strings called dict, where every string has the same length. We must determine whether there exists at least one pair of strings that differ by exactly one character at the same position.
In other words, we are looking for two strings where:
- Their lengths are equal, which is already guaranteed.
- All characters match except one.
- The differing character must occur at the same index.
For example:
"abcd"and"aacd"differ only at index1."abcd"and"abyd"differ only at index2.
The expected output is:
trueif such a pair exists.falseotherwise.
The constraints are important:
- The total number of characters across all strings is at most
10^5. - All strings are the same length.
- Strings are unique.
- Only lowercase English letters are used.
These constraints immediately tell us that a naive comparison of every pair may become too slow. If there are n strings of length m, comparing every pair costs O(n^2 * m), which is not feasible when n is large.
Several edge cases are important:
- Very small inputs, such as one string only.
- Strings differing in multiple positions.
- Strings that become identical after removing one character at the same index.
- Cases where many strings share large common prefixes or suffixes.
- Inputs with maximum allowed size, where efficiency matters significantly.
The uniqueness guarantee simplifies the problem because we never need to worry about identical strings being mistakenly counted.
Approaches
Brute Force Approach
The most direct solution is to compare every pair of strings.
For each pair:
- Compare characters position by position.
- Count how many indices differ.
- If exactly one index differs, return
true.
If we finish checking all pairs without finding such a match, return false.
This works because it explicitly verifies the condition for every possible pair. However, it is too slow for large inputs.
Suppose:
nis the number of stringsmis the string length
There are O(n^2) pairs, and each comparison costs O(m). Therefore, total complexity becomes O(n^2 * m).
With up to 10^5 total characters, this approach can easily time out.
Optimal Approach
The key observation is this:
If two strings differ by exactly one character at index i, then removing the character at index i from both strings produces the same result.
Example:
abcd -> remove index 1 -> acd
aacd -> remove index 1 -> acd
This transforms the problem into:
Does any generated "masked" string appear more than once?
We iterate through each character position:
- Remove the character at that position from every string.
- Store the resulting string in a hash set.
- If we ever generate the same masked string twice, then two original strings differed only at that position.
Hash sets provide O(1) average lookup time, making the solution efficient.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^2 * m) |
O(1) |
Compares every pair directly |
| Optimal | O(n * m^2) |
O(n) |
Uses masked strings and hashing |
Algorithm Walkthrough
Optimal Algorithm
- Let
word_lengthbe the length of each string.
Since all strings have the same length, we only need to iterate through positions from 0 to word_length - 1.
2. For each character position i, create an empty hash set called seen.
This set stores all masked versions generated by removing the character at index i.
3. Iterate through every string in dict.
For each string:
- Create a new string by removing the character at index
i. - Example:
"abcd" with i = 2 becomes "abd"
- Check whether the masked string already exists in
seen.
- If it does, then another string produced the same masked result.
- This means the two original strings differ only at index
i. - Return
true.
- Otherwise, insert the masked string into
seen. - Continue processing all strings for the current index.
- If no duplicates are found for any index, return
false.
Why it works
Two strings differ by exactly one character at the same index if and only if removing that index from both strings produces identical remaining characters.
The algorithm systematically checks this property for every possible index. Because hash sets efficiently detect duplicates, any valid pair is discovered immediately.
Python Solution
from typing import List
class Solution:
def differByOne(self, dict: List[str]) -> bool:
word_length = len(dict[0])
for index in range(word_length):
seen = set()
for word in dict:
masked = word[:index] + word[index + 1:]
if masked in seen:
return True
seen.add(masked)
return False
The implementation follows the algorithm directly.
First, we determine the common string length. Then, for every index position, we create a fresh hash set named seen.
For each word, we construct a masked version by excluding the current character. Python slicing makes this concise:
word[:index] + word[index + 1:]
If the masked string already exists in the set, then another word matched everywhere except at the removed index. We immediately return True.
Otherwise, we insert the masked string and continue.
If all positions are processed without duplicates, no valid pair exists, so the function returns False.
Go Solution
func differByOne(dict []string) bool {
wordLength := len(dict[0])
for index := 0; index < wordLength; index++ {
seen := make(map[string]bool)
for _, word := range dict {
masked := word[:index] + word[index+1:]
if seen[masked] {
return true
}
seen[masked] = true
}
}
return false
}
The Go implementation closely mirrors the Python version.
Instead of a Python set, Go uses map[string]bool for constant time membership checks.
String slicing in Go works similarly to Python:
word[:index] + word[index+1:]
Because all strings contain only lowercase English letters, byte indexing is safe and efficient.
No integer overflow concerns exist here because we only use indices and string operations.
Worked Examples
Example 1
Input: ["abcd","acbd","aacd"]
Step-by-step trace
We process each index separately.
Index = 0
| Word | Masked String | Seen Before | Action |
|---|---|---|---|
| abcd | bcd | No | Add |
| acbd | cbd | No | Add |
| aacd | acd | No | Add |
No duplicates found.
Index = 1
| Word | Masked String | Seen Before | Action |
|---|---|---|---|
| abcd | acd | No | Add |
| acbd | abd | No | Add |
| aacd | acd | Yes | Return true |
The algorithm returns true.
The matching pair is:
abcd
aacd
They differ only at index 1.
Example 2
Input: ["ab","cd","yz"]
Index = 0
| Word | Masked String |
|---|---|
| ab | b |
| cd | d |
| yz | z |
No duplicates.
Index = 1
| Word | Masked String |
|---|---|
| ab | a |
| cd | c |
| yz | y |
No duplicates.
Result:
false
Example 3
Input: ["abcd","cccc","abyd","abab"]
Index = 2
| Word | Masked String |
|---|---|
| abcd | abd |
| cccc | ccc |
| abyd | abd |
Duplicate "abd" appears.
Return true.
The matching pair is:
abcd
abyd
They differ only at index 2.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n * m^2) |
There are m positions, and for each position we process n strings while creating masked strings of length m |
| Space | O(n) |
The hash set stores up to n masked strings |
The dominant cost comes from string creation. Removing one character creates a new string of length m - 1, which costs O(m) time. Since we do this for every string and every position, total complexity becomes O(n * m^2).
Given the problem constraint that total characters are at most 10^5, this solution is efficient enough in practice.
Test Cases
solution = Solution()
assert solution.differByOne(["abcd", "acbd", "aacd"]) is True # basic positive case
assert solution.differByOne(["ab", "cd", "yz"]) is False # no matching pair
assert solution.differByOne(["abcd", "cccc", "abyd", "abab"]) is True # difference in middle
assert solution.differByOne(["a", "b"]) is True # single-character strings
assert solution.differByOne(["ab", "ac"]) is True # differ at last position
assert solution.differByOne(["ab", "cb"]) is True # differ at first position
assert solution.differByOne(["abc", "def", "ghi"]) is False # all completely different
assert solution.differByOne(["aaaa", "aaab", "aaba"]) is True # multiple near matches
assert solution.differByOne(["xyz", "xzz", "yyy"]) is True # one valid pair exists
assert solution.differByOne(["abcd"]) is False # single string only
assert solution.differByOne([
"abcde",
"bbcde",
"cbcde",
"dbcde"
]) is True # many strings sharing suffix
assert solution.differByOne([
"abc",
"abd",
"acd"
]) is True # overlapping relationships
assert solution.differByOne([
"mnop",
"qrst",
"uvwx",
"yzab"
]) is False # no valid pair
Test Summary
| Test | Why |
|---|---|
["abcd", "acbd", "aacd"] |
Standard positive example |
["ab", "cd", "yz"] |
No valid pair exists |
["abcd", "cccc", "abyd", "abab"] |
Match occurs in middle position |
["a", "b"] |
Smallest possible string length |
["ab", "ac"] |
Difference at last index |
["ab", "cb"] |
Difference at first index |
["abc", "def", "ghi"] |
Completely unrelated strings |
["aaaa", "aaab", "aaba"] |
Multiple close strings |
["xyz", "xzz", "yyy"] |
One valid pair among unrelated strings |
["abcd"] |
Single-element input |
| Shared suffix example | Tests repeated masked forms |
["abc", "abd", "acd"] |
Multiple candidate pairs |
| Completely distinct 4-string case | Ensures false negatives are avoided |
Edge Cases
Single String Input
If the input contains only one string, there cannot possibly be a pair differing by one character. A careless implementation might assume at least two strings exist and accidentally produce incorrect results.
This implementation handles it naturally. The loops execute normally, but no duplicate masked strings can ever appear, so the function correctly returns False.
Single Character Strings
Inputs like:
["a", "b"]
are important because removing the only character produces an empty string.
Both strings generate:
""
at index 0, which correctly indicates they differ by exactly one character.
The implementation handles this correctly because empty strings are valid hash set keys.
Multiple Differences Between Strings
Consider:
["abcd", "wxyz"]
These strings differ in every position.
Removing any single character still produces different masked strings:
bcd vs xyz
acd vs wyz
abd vs wxz
abc vs wxy
Therefore, no duplicate appears in the hash set, and the algorithm correctly returns False.
Shared Prefixes or Suffixes
Inputs with many similar strings can sometimes expose hashing or masking mistakes.
Example:
["aaaa", "aaab", "aaba"]
The implementation correctly checks every index independently. Even though many masked strings are similar, duplicates only occur when the original strings differ at exactly one position.