LeetCode 2351 - First Letter to Appear Twice
The problem gives us a string s containing only lowercase English letters. We need to return the first letter whose second occurrence appears earliest in the string. This detail is extremely important.
Difficulty: 🟢 Easy
Topics: Hash Table, String, Bit Manipulation, Counting
Solution
Problem Understanding
The problem gives us a string s containing only lowercase English letters. We need to return the first letter whose second occurrence appears earliest in the string.
This detail is extremely important. The problem is not asking for the first character that repeats somewhere later. Instead, it asks which character reaches its second appearance first as we scan from left to right.
For example, in "abccbaacz":
'a'appears at indices0,5, and6'b'appears at indices1and4'c'appears at indices2,3, and7
Although 'a' appears earlier than 'c', the second occurrence of 'c' happens at index 3, which is earlier than the second occurrence of 'b' at index 4 and 'a' at index 5. Therefore, the answer is 'c'.
The input size is very small:
2 <= s.length <= 100
This tells us that even inefficient solutions would work in practice. However, the goal is to design a clean and optimal algorithm.
The string contains only lowercase English letters, which means there are at most 26 distinct characters. This constraint opens the door for efficient counting or bit manipulation solutions.
The problem also guarantees that at least one letter appears more than once. That means we never need to worry about returning a default value or handling a case with no duplicates.
Some important edge cases include:
- A duplicate appearing immediately, such as
"aabc". - A string where only the final character repeats, such as
"abcdd". - A character appearing more than twice, such as
"abcaad". - Multiple repeating characters where the earliest second occurrence determines the answer.
A naive implementation might incorrectly return the first character that appears multiple times in total, instead of the character whose second occurrence appears earliest.
Approaches
Brute Force Approach
A straightforward solution is to check every character and count how many times it has appeared up to the current position.
For each index i, we can scan all previous indices 0 through i - 1 and check whether the current character has appeared before. The first time we find a match, we return that character.
This works because the first repeated character encountered while scanning from left to right is exactly the character whose second occurrence appears earliest.
However, this approach uses nested loops. For every character, we may scan all previous characters, leading to quadratic time complexity.
Even though the input size is small enough for this to pass, it is not the most efficient or elegant solution.
Optimal Approach
The key observation is that we only need to know whether we have seen a character before.
As we scan the string from left to right:
- If the current character has not been seen, we record it.
- If the current character has already been seen, then this is its second occurrence, and since we are scanning left to right, it must be the earliest second occurrence encountered so far.
A hash set is a natural choice because it provides average O(1) lookup and insertion operations.
Since there are only lowercase English letters, we could also use a fixed-size boolean array or a bitmask. However, a set keeps the implementation simple and readable.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Compare each character with all previous characters |
| Optimal | O(n) | O(1) | Use a set to track previously seen characters |
Although the optimal solution technically uses extra space, the space is bounded by 26 lowercase letters, so it is effectively constant.
Algorithm Walkthrough
- Create an empty set called
seen.
This set will store all characters we have encountered so far while scanning the string. 2. Iterate through the string one character at a time from left to right.
The left to right traversal is critical because the first repeated character encountered during this scan is guaranteed to have the earliest second occurrence.
3. For each character, check whether it already exists in seen.
- If it does, immediately return that character.
- If it does not, add it to the set.
- Continue until a repeated character is found.
The problem guarantees that at least one repeated character exists, so the algorithm will always return a value before the loop ends.
Why it works
The algorithm maintains the invariant that seen contains exactly the characters encountered before the current index.
When we encounter a character already present in seen, we know this is its second occurrence. Since we scan from left to right, this is the earliest second occurrence among all characters processed so far. Therefore, the first repeated character we encounter is the correct answer.
Python Solution
class Solution:
def repeatedCharacter(self, s: str) -> str:
seen = set()
for char in s:
if char in seen:
return char
seen.add(char)
return ""
The solution starts by initializing an empty set named seen. This set keeps track of all characters encountered so far.
The loop iterates through each character in the string. For every character, the code first checks whether it already exists in the set.
If the character is already present, that means we have found its second occurrence, so we immediately return it.
Otherwise, we add the character to the set and continue scanning.
The final return "" statement is included only to satisfy function completeness. According to the problem constraints, execution will always return from inside the loop because the input is guaranteed to contain a repeated character.
Go Solution
func repeatedCharacter(s string) byte {
seen := make(map[byte]bool)
for i := 0; i < len(s); i++ {
if seen[s[i]] {
return s[i]
}
seen[s[i]] = true
}
return 0
}
The Go implementation follows the same logic as the Python version.
Instead of a Python set, Go uses a map[byte]bool to track previously seen characters. Since the input consists only of lowercase English letters, treating characters as byte values is efficient and straightforward.
The function returns a byte because that is the required signature in the problem statement.
The final return 0 is technically unreachable due to the problem guarantee that at least one repeated character exists.
Worked Examples
Example 1
Input:
s = "abccbaacz"
| Step | Current Character | Seen Set Before | Action | Seen Set After |
|---|---|---|---|---|
| 1 | 'a' |
{} |
Add 'a' |
{'a'} |
| 2 | 'b' |
{'a'} |
Add 'b' |
{'a', 'b'} |
| 3 | 'c' |
{'a', 'b'} |
Add 'c' |
{'a', 'b', 'c'} |
| 4 | 'c' |
{'a', 'b', 'c'} |
Already seen, return 'c' |
- |
The algorithm stops immediately when the second 'c' is encountered.
Output:
"c"
Example 2
Input:
s = "abcdd"
| Step | Current Character | Seen Set Before | Action | Seen Set After |
|---|---|---|---|---|
| 1 | 'a' |
{} |
Add 'a' |
{'a'} |
| 2 | 'b' |
{'a'} |
Add 'b' |
{'a', 'b'} |
| 3 | 'c' |
{'a', 'b'} |
Add 'c' |
{'a', 'b', 'c'} |
| 4 | 'd' |
{'a', 'b', 'c'} |
Add 'd' |
{'a', 'b', 'c', 'd'} |
| 5 | 'd' |
{'a', 'b', 'c', 'd'} |
Already seen, return 'd' |
- |
Output:
"d"
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each character is processed once |
| Space | O(1) | At most 26 lowercase letters are stored |
The algorithm scans the string exactly once, so the running time grows linearly with the input length.
The auxiliary space remains constant because the set can contain at most 26 lowercase English letters, regardless of input size.
Test Cases
solution = Solution()
assert solution.repeatedCharacter("abccbaacz") == "c" # provided example
assert solution.repeatedCharacter("abcdd") == "d" # duplicate at end
assert solution.repeatedCharacter("aabc") == "a" # immediate repetition
assert solution.repeatedCharacter("abca") == "a" # repeat after gap
assert solution.repeatedCharacter("abcdefgg") == "g" # only last character repeats
assert solution.repeatedCharacter("zz") == "z" # minimum length input
assert solution.repeatedCharacter("abcbad") == "b" # second occurrence earliest
assert solution.repeatedCharacter("aaab") == "a" # multiple repetitions of same character
assert solution.repeatedCharacter("abcdedcba") == "d" # middle character repeats first
| Test | Why |
|---|---|
"abccbaacz" |
Validates the main example |
"abcdd" |
Tests repetition at the end |
"aabc" |
Tests immediate duplicate |
"abca" |
Tests delayed repetition |
"abcdefgg" |
Ensures late duplicate works correctly |
"zz" |
Tests smallest valid input |
"abcbad" |
Verifies earliest second occurrence logic |
"aaab" |
Tests repeated occurrences beyond two |
"abcdedcba" |
Confirms correct left to right behavior |
Edge Cases
One important edge case is when the repeated character appears immediately, such as "aabc". A buggy implementation might incorrectly continue scanning after finding the first duplicate. This solution returns immediately upon detecting the second occurrence, so it correctly returns 'a'.
Another important case is when multiple characters repeat, but their second occurrences happen at different positions. For example, in "abccbaacz", both 'a' and 'b' repeat, but 'c' reaches its second occurrence first. The left to right scan guarantees that the earliest second occurrence is returned correctly.
A third edge case involves characters appearing more than twice, such as "aaab" or "abcaad". The algorithm does not care how many total occurrences exist. It only checks whether the current character has already appeared before. Therefore, the second occurrence is detected correctly regardless of additional repetitions later in the string.
Another subtle case is the minimum valid input size, such as "zz". The first character is added to the set, and the second character immediately triggers the duplicate detection logic. This confirms the implementation handles the smallest possible input correctly.