LeetCode 242 - Valid Anagram

The problem asks us to determine whether two strings are anagrams of each other. Two strings are considered anagrams if they contain exactly the same characters with exactly the same frequencies, but possibly in a different order.

LeetCode Problem 242

Difficulty: 🟢 Easy
Topics: Hash Table, String, Sorting

Solution

Problem Understanding

The problem asks us to determine whether two strings are anagrams of each other.

Two strings are considered anagrams if they contain exactly the same characters with exactly the same frequencies, but possibly in a different order. In other words, one string can be rearranged to form the other.

For example, "anagram" and "nagaram" are anagrams because both strings contain the same characters:

  • a appears 3 times
  • n appears 1 time
  • g appears 1 time
  • r appears 1 time
  • m appears 1 time

The order is different, but the character counts are identical.

On the other hand, "rat" and "car" are not anagrams because the characters differ. The first string contains r, a, and t, while the second contains c, a, and r.

The input consists of two strings, s and t, and the expected output is a boolean value:

  • Return true if t is an anagram of s
  • Return false otherwise

The constraints tell us that both strings can be as long as 5 × 10⁴ characters. This is important because it suggests we should avoid inefficient solutions. Algorithms with quadratic time complexity, such as repeatedly searching for characters, would be too slow in the worst case.

Another important detail is that the strings contain only lowercase English letters. Since there are only 26 possible characters, we can take advantage of this limitation to design a very efficient solution.

There are several edge cases worth considering upfront. If the strings have different lengths, they cannot possibly be anagrams because rearranging characters does not change length. Strings with repeated characters require careful frequency counting, because simply checking whether characters exist is insufficient. For example, "aacc" and "ccac" are not anagrams because the counts differ. The problem also guarantees that strings are non-empty and contain only lowercase English letters, which simplifies implementation.

Approaches

Brute Force Approach

A straightforward way to solve the problem is to sort both strings and compare the results.

The reasoning is simple. If two strings are anagrams, then after sorting their characters alphabetically, both strings will become identical. For example:

  • "anagram""aaagmnr"
  • "nagaram""aaagmnr"

Since the sorted versions match, the strings are anagrams.

This approach is correct because sorting removes the effect of character order and leaves only the multiset of characters. If the sorted strings differ, then at least one character or frequency must be different.

However, sorting is relatively expensive. Sorting a string of length n takes O(n log n) time. Given that strings can contain up to 50,000 characters, we can do better.

Optimal Approach, Character Frequency Counting

The key observation is that anagrams must contain the same character frequencies.

Instead of sorting, we can count how many times each character appears in both strings. If the counts match exactly, the strings are anagrams.

Because the problem guarantees lowercase English letters only, we can use a fixed-size array of length 26, where each index represents a letter:

  • Index 0'a'
  • Index 1'b'
  • ...
  • Index 25'z'

We iterate through both strings simultaneously:

  • Increment the count for characters in s
  • Decrement the count for characters in t

At the end, if all counts are zero, then every character frequency matched perfectly.

This approach is faster because it avoids sorting and only scans the strings once.

Approach Time Complexity Space Complexity Notes
Brute Force O(n log n) O(n) Sort both strings and compare
Optimal O(n) O(1) Count character frequencies using a fixed-size array

Algorithm Walkthrough

  1. First, check whether the lengths of s and t are equal.

This is an important optimization because strings with different lengths can never be anagrams. If the lengths differ, immediately return False. 2. Create a frequency array of size 26.

Since the input contains only lowercase English letters, a fixed-size array is sufficient. Each position stores the count difference between s and t. 3. Traverse both strings simultaneously.

For each index i:

  • Increment the counter for s[i]
  • Decrement the counter for t[i]

This effectively tracks the difference in character frequencies between the two strings. 4. Check the frequency array after traversal.

If every value in the array is zero, then the frequencies match exactly and the strings are anagrams. 5. Return the result.

  • If all counts are zero, return True
  • Otherwise, return False

Why it works

The algorithm maintains the invariant that the frequency array represents the difference in character counts between s and t.

Whenever a character appears in s, its count increases. Whenever the same character appears in t, its count decreases. If the strings are anagrams, every increment will eventually be canceled by a matching decrement, leaving all values equal to zero.

If even one value is non-zero, then some character appeared more times in one string than the other, meaning the strings are not anagrams.

Python Solution

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        if len(s) != len(t):
            return False

        char_counts = [0] * 26

        for i in range(len(s)):
            char_counts[ord(s[i]) - ord('a')] += 1
            char_counts[ord(t[i]) - ord('a')] -= 1

        return all(count == 0 for count in char_counts)

The implementation begins with a length check. If the strings differ in size, we immediately return False because anagrams must contain the same total number of characters.

Next, we initialize an array called char_counts with 26 elements, all set to zero. Each index corresponds to a lowercase English letter.

We then iterate through both strings at the same time. For every character in s, we increment its corresponding counter. For every character in t, we decrement its corresponding counter. By doing both operations in the same loop, we efficiently compute frequency differences in one pass.

Finally, we verify that every counter is zero using Python's all() function. If every value is zero, the strings are anagrams. Otherwise, some character count differs, and we return False.

Go Solution

func isAnagram(s string, t string) bool {
    if len(s) != len(t) {
        return false
    }

    charCounts := [26]int{}

    for i := 0; i < len(s); i++ {
        charCounts[s[i]-'a']++
        charCounts[t[i]-'a']--
    }

    for _, count := range charCounts {
        if count != 0 {
            return false
        }
    }

    return true
}

The Go implementation follows the same logic as the Python version. Instead of a list, we use a fixed-size array of 26 integers.

Go strings are byte slices for ASCII characters, so indexing with s[i] and t[i] works efficiently here because the problem guarantees lowercase English letters. No Unicode handling is needed for this problem.

Unlike Python's all() function, Go requires an explicit loop to verify that all counts are zero.

There are no concerns about integer overflow because the maximum string length is only 50,000, which is well within the range of Go's integer type.

Worked Examples

Example 1

Input: s = "anagram", t = "nagaram"

We begin with:

char_counts = [0, 0, 0, ..., 0]
Step s[i] t[i] Operation Relevant Counts
1 a n a +1, n -1 a=1, n=-1
2 n a n +1, a -1 a=0, n=0
3 a g a +1, g -1 a=1, g=-1
4 g a g +1, a -1 a=0, g=0
5 r r r +1, r -1 r=0
6 a a a +1, a -1 a=0
7 m m m +1, m -1 m=0

Final state:

All counts = 0

Since every character frequency balances perfectly, the result is:

True

Example 2

Input: s = "rat", t = "car"

Initial state:

char_counts = [0, 0, 0, ..., 0]
Step s[i] t[i] Operation Relevant Counts
1 r c r +1, c -1 r=1, c=-1
2 a a a +1, a -1 a=0
3 t r t +1, r -1 t=1, r=0

Final state:

c = -1
t = 1

Since not all counts are zero, the strings are not anagrams.

Result:

False

Complexity Analysis

Measure Complexity Explanation
Time O(n) We traverse both strings once
Space O(1) The frequency array always has size 26

The time complexity is O(n) because we scan the strings exactly once and then check a fixed-size array of 26 elements. Since 26 is constant, that final verification step is effectively constant time.

The space complexity is O(1) because the frequency array size never changes regardless of input size. Even if the strings grow to 50,000 characters, we still only allocate 26 counters.

Test Cases

solution = Solution()

assert solution.isAnagram("anagram", "nagaram") is True  # Provided example, valid anagram
assert solution.isAnagram("rat", "car") is False  # Provided example, different characters

assert solution.isAnagram("a", "a") is True  # Minimum length, same character
assert solution.isAnagram("a", "b") is False  # Minimum length, different character

assert solution.isAnagram("listen", "silent") is True  # Standard anagram case
assert solution.isAnagram("evil", "vile") is True  # Rearranged characters

assert solution.isAnagram("aacc", "ccac") is False  # Same letters, wrong frequency
assert solution.isAnagram("abc", "abcc") is False  # Different lengths

assert solution.isAnagram("aaa", "aaa") is True  # Repeated same character
assert solution.isAnagram("aaa", "aab") is False  # One differing character

assert solution.isAnagram("abcdefghijklmnopqrstuvwxyz",
                          "zyxwvutsrqponmlkjihgfedcba") is True  # All letters reversed

assert solution.isAnagram("x" * 50000, "x" * 50000) is True  # Maximum constraint size
assert solution.isAnagram("x" * 49999 + "y",
                          "x" * 50000) is False  # Large input with one mismatch
Test Why
"anagram", "nagaram" Verifies the provided valid example
"rat", "car" Verifies the provided invalid example
"a", "a" Tests minimum input size
"a", "b" Tests smallest mismatch
"listen", "silent" Standard anagram validation
"evil", "vile" Tests reordered characters
"aacc", "ccac" Ensures character frequencies matter
"abc", "abcc" Verifies length mismatch optimization
"aaa", "aaa" Tests repeated identical characters
"aaa", "aab" Tests repeated character mismatch
Reversed alphabet strings Confirms order does not matter
Large identical strings Stress test at maximum constraint
Large mismatch Stress test with subtle frequency difference

Edge Cases

Different String Lengths

One of the most important edge cases occurs when the two strings have different lengths, such as "abc" and "abcc".

A naive implementation might still attempt to compare character counts unnecessarily, wasting computation or even causing indexing issues. Since anagrams must contain exactly the same characters, differing lengths immediately prove the answer is False.

The implementation handles this correctly with an early return:

if len(s) != len(t):
    return False

Repeated Characters With Different Frequencies

Another common source of bugs is handling repeated letters incorrectly. For example, "aacc" and "ccac" contain the same set of letters but not the same counts.

A simplistic solution that checks only character existence would incorrectly return True. Our frequency-counting method avoids this issue because every occurrence contributes to the total count. Any mismatch leaves a non-zero value in the frequency array.

Strings With Identical Characters

Cases like "aaa" and "aaa" may seem trivial, but they verify that repeated identical characters are handled correctly.

Some incorrect implementations accidentally overwrite counts rather than accumulating them. Our approach increments and decrements frequencies properly, so all repeated occurrences are tracked accurately.

Unicode Follow Up

The problem follow up asks how to adapt the solution for Unicode characters.

Since Unicode contains far more than 26 characters, a fixed-size array is no longer practical. Instead, we would use a hash map (dictionary in Python, map in Go) to count frequencies dynamically.

For example, in Python:

from collections import Counter

def isAnagram(s: str, t: str) -> bool:
    return Counter(s) == Counter(t)

This works because a hash map can efficiently store counts for any Unicode character, regardless of alphabet size. The time complexity remains O(n), although space complexity becomes O(k), where k is the number of distinct characters.