LeetCode 389 - Find the Difference
This problem gives us two strings, s and t. The string t is created by taking all characters from s, shuffling them into a different order, and then inserting exactly one additional character somewhere in the string. Our task is to identify and return that extra character.
Difficulty: 🟢 Easy
Topics: Hash Table, String, Bit Manipulation, Sorting
Solution
Problem Understanding
This problem gives us two strings, s and t. The string t is created by taking all characters from s, shuffling them into a different order, and then inserting exactly one additional character somewhere in the string.
Our task is to identify and return that extra character.
In other words, every character in s appears in t the same number of times, except for one additional letter that appears exactly once more in t. We must determine which character that is.
For example:
- If
s = "abcd"andt = "abcde", the only extra character is'e'. - If
s = ""andt = "y", thentcontains only the added character, so the answer is'y'.
The constraints are relatively small:
0 <= s.length <= 1000t.length == s.length + 1- Both strings contain only lowercase English letters.
These constraints tell us several important things:
- The input size is small enough that even less efficient approaches could work.
- Since only lowercase English letters are used, there are at most 26 distinct characters.
- The problem guarantees there is exactly one extra character in
t. - We never need to handle invalid input or multiple added characters.
Important edge cases include:
- An empty
s, wheretcontains only one character. - Repeated characters, such as
s = "aabb"andt = "ababb". - The extra character being the same as an existing character already present in
s. - The extra character appearing at the beginning, middle, or end of
t, although shuffling means position should not matter.
A naive implementation that only checks positions directly would fail because the characters are shuffled. The solution must compare character frequencies or use a property that is independent of ordering.
Approaches
Brute Force Approach
A straightforward way to solve this problem is to count how many times each character appears in both strings.
We can:
- Count the frequency of every character in
s. - Count the frequency of every character in
t. - Compare the counts character by character.
- Return the character whose count is larger in
t.
This works because the problem guarantees exactly one additional character exists.
One possible brute-force implementation is:
- For every character in
t, count how many times it appears in both strings using repeated scans. - The first character whose counts differ is the answer.
However, repeatedly scanning the strings for each character is inefficient. If we check frequencies using repeated counting operations, the time complexity can become quadratic.
Optimal Approach
The optimal solution uses the XOR bitwise operation.
The key observation is:
- XOR of a number with itself is
0. - XOR of a number with
0is the number itself. - XOR is commutative and associative, so order does not matter.
If we XOR together all characters from s and all characters from t, every matching character cancels out, leaving only the extra character.
For example:
a ^ b ^ c ^ d ^ a ^ b ^ c ^ d ^ e
All matching pairs disappear:
0 ^ 0 ^ 0 ^ 0 ^ e = e
This gives us a very elegant and efficient solution with constant extra space.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Repeatedly counts characters by scanning strings |
| Optimal | O(n) | O(1) | Uses XOR cancellation property to isolate extra character |
Algorithm Walkthrough
Optimal XOR Algorithm
- Initialize a variable called
xor_resultto0.
This variable will store the cumulative XOR of all characters from both strings.
2. Iterate through every character in s.
Convert each character to its ASCII integer value using ord() in Python, then XOR it into xor_result.
3. Iterate through every character in t.
Again, convert each character to its ASCII value and XOR it into xor_result.
4. Every matching character from s and t cancels out.
Since XORing the same value twice produces 0, all shared characters disappear regardless of order.
5. The only remaining value in xor_result is the ASCII value of the extra character.
6. Convert the final integer back into a character using chr() and return it.
Why it works
The correctness relies on the cancellation property of XOR:
x ^ x = 0
Since every character from s appears in t, each matching pair cancels out during XOR accumulation. XOR is independent of ordering, so shuffling does not matter. The only character without a matching pair is the added character, which remains after all cancellations.
Python Solution
class Solution:
def findTheDifference(self, s: str, t: str) -> str:
xor_result = 0
for char in s:
xor_result ^= ord(char)
for char in t:
xor_result ^= ord(char)
return chr(xor_result)
The implementation starts by initializing xor_result to zero. This serves as the running XOR accumulator.
The first loop processes every character in s. Each character is converted into its ASCII integer representation using ord(), then XORed into the accumulator.
The second loop does the same for every character in t.
Because matching characters appear once in each string, they cancel each other out. After both loops finish, the accumulator contains only the ASCII value of the extra character.
Finally, chr() converts that integer back into a character, which is returned as the answer.
Go Solution
func findTheDifference(s string, t string) byte {
var xorResult byte = 0
for i := 0; i < len(s); i++ {
xorResult ^= s[i]
}
for i := 0; i < len(t); i++ {
xorResult ^= t[i]
}
return xorResult
}
The Go implementation follows the same XOR strategy as the Python version.
Since Go strings are byte slices internally for ASCII characters, we can directly XOR the bytes without converting them through functions like ord() or chr().
The result is stored in a byte variable because the problem only contains lowercase English letters, which fit naturally into a single byte.
Empty strings are handled automatically because iterating over an empty string simply performs zero iterations.
Worked Examples
Example 1
Input:
s = "abcd"
t = "abcde"
We process all characters using XOR.
| Step | Character | XOR Result |
|---|---|---|
| Start | 0 | |
| 1 | 'a' | 97 |
| 2 | 'b' | 3 |
| 3 | 'c' | 96 |
| 4 | 'd' | 4 |
| 5 | 'a' | 101 |
| 6 | 'b' | 7 |
| 7 | 'c' | 100 |
| 8 | 'd' | 0 |
| 9 | 'e' | 101 |
ASCII value 101 corresponds to 'e'.
Output: "e"
Example 2
Input:
s = ""
t = "y"
| Step | Character | XOR Result |
|---|---|---|
| Start | 0 | |
| 1 | 'y' | 121 |
ASCII value 121 corresponds to 'y'.
Output: "y"
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character from both strings is processed exactly once |
| Space | O(1) | Only a single XOR accumulator variable is used |
The algorithm performs one pass through s and one pass through t. Since t has length n + 1, the total work is linear in the size of the input.
The space complexity is constant because the algorithm does not allocate any additional data structures proportional to the input size.
Test Cases
solution = Solution()
assert solution.findTheDifference("abcd", "abcde") == "e" # basic example
assert solution.findTheDifference("", "y") == "y" # empty s
assert solution.findTheDifference("a", "aa") == "a" # repeated same character
assert solution.findTheDifference("ae", "aea") == "a" # extra existing character
assert solution.findTheDifference("xyz", "xayz") == "a" # extra character in middle
assert solution.findTheDifference("hello", "ohellol") == "l" # shuffled input
assert solution.findTheDifference("aabbcc", "abcbacdc") == "d" # repeated characters
assert solution.findTheDifference("mno", "onmp") == "p" # shuffled with new char
assert solution.findTheDifference("z", "zz") == "z" # smallest repeated boundary
assert solution.findTheDifference("abcdefghijklmnopqrstuvwxyz",
"abcdefghijklmnopqrstuvwxyzq") == "q" # long input
| Test | Why |
|---|---|
"abcd", "abcde" |
Validates the standard example |
"", "y" |
Verifies handling of empty input |
"a", "aa" |
Tests repeated identical characters |
"ae", "aea" |
Ensures extra character can already exist in s |
"xyz", "xayz" |
Tests insertion in the middle |
"hello", "ohellol" |
Confirms shuffling does not matter |
"aabbcc", "abcbacdc" |
Validates repeated frequency handling |
"mno", "onmp" |
Tests unordered matching |
"z", "zz" |
Small boundary case |
| alphabet example | Tests larger valid input |
Edge Cases
Empty Original String
When s is empty, t contains exactly one character. A poorly designed solution might assume both strings contain matching characters and fail on empty input.
The XOR solution handles this naturally. Since no characters are processed from s, the accumulator remains 0 until the single character from t is XORed in. That character becomes the final answer directly.
Repeated Characters
Cases like:
s = "aabb"
t = "ababb"
can cause bugs in implementations that only check for existence instead of frequency.
The XOR solution works correctly because every matching occurrence cancels independently. Only the unmatched occurrence remains.
Extra Character Already Exists in s
A common mistake is assuming the added character must be new.
For example:
s = "abc"
t = "abcc"
The extra character is 'c', even though 'c' already existed in s.
The XOR approach correctly handles this because three 'c' values reduce to one unmatched 'c' after pairwise cancellation.
Shuffled Order
Since t is shuffled, characters do not appear in corresponding positions.
A naive position-by-position comparison would fail for inputs like:
s = "abcd"
t = "dacbe"
The XOR operation is order independent because XOR is commutative and associative. This guarantees correctness regardless of character arrangement.