LeetCode 387 - First Unique Character in a String

The problem gives us a string s consisting only of lowercase English letters. We must find the index of the first character that appears exactly once in the entire string. If every character appears more than once, we return -1.

LeetCode Problem 387

Difficulty: 🟢 Easy
Topics: Hash Table, String, Queue, Counting

Solution

Problem Understanding

The problem gives us a string s consisting only of lowercase English letters. We must find the index of the first character that appears exactly once in the entire string. If every character appears more than once, we return -1.

The key detail is that we are looking for the first non-repeating character based on its position in the string, not based on alphabetical order. For example, in "loveleetcode", the character 'v' appears only once and occurs earlier than the other unique characters, so the answer is index 2.

The input size can be as large as 10^5, which means we need an efficient solution. A naive solution that repeatedly scans the string for each character could become too slow because it may require quadratic time in the worst case.

Since the string contains only lowercase English letters, there are at most 26 distinct characters. This constraint allows us to use counting structures very efficiently.

Several edge cases are important to recognize immediately. A string may contain only repeated characters, such as "aabb", in which case the answer is -1. A string may contain exactly one character, such as "z", which should return 0. Another important case is when the first unique character appears near the end of the string, meaning we cannot stop early during counting.

Approaches

Brute Force Approach

A straightforward solution is to examine every character one by one and count how many times it appears in the string.

For each index i, we scan the entire string again and count occurrences of s[i]. If the count is exactly 1, then we immediately return i because we are scanning from left to right.

This approach is correct because every character is checked thoroughly, and the first character with frequency 1 is returned. However, it is inefficient because for each character we may scan the entire string again.

If the string length is n, then for every one of the n characters we may perform another O(n) scan. This produces O(n^2) time complexity, which is too slow for n = 10^5.

Optimal Approach

The key observation is that the problem depends entirely on character frequencies. Instead of repeatedly recounting characters, we can count all character frequencies once and store them in a hash map or frequency array.

After computing frequencies, we perform a second pass through the string from left to right. The first character whose frequency is 1 is the answer.

This works efficiently because counting frequencies allows constant-time lookup for each character during the second pass.

Since lowercase English letters are limited to 26 characters, we can use either a hash map or a fixed-size array. A fixed-size array is slightly more efficient because array indexing is faster than hash map lookups.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Recounts occurrences for every character
Optimal O(n) O(1) Uses frequency counting with two passes

Algorithm Walkthrough

  1. Create a frequency array of size 26 initialized with zeros. Each position represents a lowercase English letter from 'a' to 'z'.
  2. Traverse the string once and count occurrences of each character. For a character c, compute its index using ord(c) - ord('a') in Python or c - 'a' in Go. Increment the corresponding frequency.
  3. Traverse the string a second time from left to right. This preserves the original order of characters.
  4. For each character, check its frequency in the array. If the frequency is exactly 1, return the current index immediately because this is the first non-repeating character.
  5. If the traversal finishes without finding a unique character, return -1.

The frequency array is chosen because the problem guarantees only lowercase English letters. This gives constant-size storage and very fast lookups.

Why it works

The algorithm works because the first pass computes the exact number of occurrences for every character in the string. During the second pass, characters are examined in their original order. Therefore, the first character whose frequency equals 1 must be the first non-repeating character in the string.

Python Solution

class Solution:
    def firstUniqChar(self, s: str) -> int:
        frequency = [0] * 26

        # Count character frequencies
        for char in s:
            index = ord(char) - ord('a')
            frequency[index] += 1

        # Find the first unique character
        for i, char in enumerate(s):
            index = ord(char) - ord('a')

            if frequency[index] == 1:
                return i

        return -1

The implementation follows the exact two-pass strategy described earlier.

The first loop builds the frequency array. Each character is mapped to an integer index between 0 and 25. For example, 'a' maps to 0, 'b' maps to 1, and so on.

The second loop scans the string in its original order. As soon as a character with frequency 1 is found, its index is returned immediately.

If no such character exists, the function returns -1 after the loop completes.

Using a fixed-size array instead of a dictionary keeps the implementation efficient and simple because the problem guarantees lowercase English letters only.

Go Solution

func firstUniqChar(s string) int {
    frequency := make([]int, 26)

    // Count character frequencies
    for _, char := range s {
        index := char - 'a'
        frequency[index]++
    }

    // Find the first unique character
    for i, char := range s {
        index := char - 'a'

        if frequency[index] == 1 {
            return i
        }
    }

    return -1
}

The Go solution is nearly identical to the Python version. A slice of length 26 stores character frequencies.

One Go-specific detail is that iterating over a string with range produces Unicode runes. Since the problem guarantees lowercase English letters, rune handling is completely safe here.

The solution does not need to worry about integer overflow because frequencies are at most 10^5, which easily fits within Go's integer type.

Worked Examples

Example 1

Input: s = "leetcode"

First Pass, Count Frequencies

Character Frequency After Update
l 1
e 1
e 2
t 1
c 1
o 1
d 1
e 3

Final frequencies:

Character Count
l 1
e 3
t 1
c 1
o 1
d 1

Second Pass

Index Character Frequency Action
0 l 1 Return 0

Output: 0

Example 2

Input: s = "loveleetcode"

Final Frequencies

Character Count
l 2
o 2
v 1
e 4
t 1
c 1
d 1

Second Pass

Index Character Frequency Action
0 l 2 Continue
1 o 2 Continue
2 v 1 Return 2

Output: 2

Example 3

Input: s = "aabb"

Final Frequencies

Character Count
a 2
b 2

Second Pass

Index Character Frequency Action
0 a 2 Continue
1 a 2 Continue
2 b 2 Continue
3 b 2 Continue

No unique character exists, so return -1.

Output: -1

Complexity Analysis

Measure Complexity Explanation
Time O(n) Two linear passes through the string
Space O(1) Frequency array size is fixed at 26

The algorithm performs two separate traversals of the string. Each traversal takes linear time relative to the string length n, resulting in overall O(n) complexity.

The space complexity is constant because the frequency array always contains exactly 26 elements regardless of input size.

Test Cases

solution = Solution()

assert solution.firstUniqChar("leetcode") == 0  # first character is unique
assert solution.firstUniqChar("loveleetcode") == 2  # unique character in middle
assert solution.firstUniqChar("aabb") == -1  # no unique characters

assert solution.firstUniqChar("z") == 0  # single character string
assert solution.firstUniqChar("aaab") == 3  # unique character at end
assert solution.firstUniqChar("aabcc") == 2  # unique character in center
assert solution.firstUniqChar("abcabcx") == 6  # last character unique
assert solution.firstUniqChar("ababcdcdx") == 8  # only final character unique
assert solution.firstUniqChar("aaaaaa") == -1  # all characters repeated
assert solution.firstUniqChar("abc") == 0  # all characters unique
assert solution.firstUniqChar("ddbbaac") == 6  # unique character appears last
Test Why
"leetcode" Validates standard example
"loveleetcode" Validates unique character in middle
"aabb" Validates no unique character case
"z" Validates single-character input
"aaab" Validates unique character at end
"aabcc" Validates unique character surrounded by duplicates
"abcabcx" Validates last character unique
"ababcdcdx" Tests many repeated pairs before unique
"aaaaaa" Validates all repeated characters
"abc" Validates all characters unique
"ddbbaac" Validates late unique character

Edge Cases

One important edge case is a string where every character repeats, such as "aabbcc". A careless implementation might incorrectly return the last examined index or fail to handle the absence of unique characters. This implementation correctly finishes the second traversal and returns -1.

Another important edge case is a string containing only one character, such as "x". Since the character appears exactly once, the answer should be 0. The frequency counting approach naturally handles this because the single character receives frequency 1.

A third important edge case occurs when the first unique character appears very late in the string, such as "aabbccd". Some implementations may incorrectly stop too early after encountering repeated characters. This solution continues scanning until it finds the first frequency equal to 1, ensuring the correct index is returned.

A final edge case involves strings where all characters are unique, such as "abcdef". The algorithm must return the first index, which is 0. Since the second pass checks characters in order, the first character immediately satisfies the uniqueness condition.