LeetCode 2068 - Check Whether Two Strings are Almost Equivalent

The problem asks us to determine whether two strings are "almost equivalent" based on the frequency of each lowercase English letter. We are given two strings, word1 and word2, both of the same length.

LeetCode Problem 2068

Difficulty: 🟢 Easy
Topics: Hash Table, String, Counting

Solution

Problem Understanding

The problem asks us to determine whether two strings are "almost equivalent" based on the frequency of each lowercase English letter.

We are given two strings, word1 and word2, both of the same length. For every character from 'a' to 'z', we compare how many times that character appears in each string. If the absolute difference in frequency for every letter is at most 3, the strings are considered almost equivalent. Otherwise, they are not.

For example, suppose one string contains four occurrences of 'a' while the other contains zero occurrences of 'a'. The difference is 4, which exceeds the allowed limit of 3, so the answer would be false.

The input constraints are small:

  • Each string length is between 1 and 100
  • Only lowercase English letters are used

These constraints tell us several important things:

  • The alphabet size is fixed at 26 letters
  • We do not need sophisticated optimizations
  • Frequency counting is the natural solution
  • Even an inefficient approach would likely pass, but we still want the cleanest and most scalable implementation

An important detail is that we must check all letters from 'a' to 'z', even letters that do not appear in one or both strings. Missing letters still contribute a frequency difference. For example:

  • "aaaa" vs "bccb"
  • Frequency of 'a' is 4 vs 0
  • Difference is 4
  • Result is false

Some edge cases worth considering are:

  • Strings of length 1
  • Completely identical strings
  • Strings where a letter appears only in one string
  • Differences exactly equal to 3, which should still return true
  • Differences greater than 3, which should return false

Because the alphabet is fixed and small, frequency arrays are an ideal fit.

Approaches

Brute Force Approach

A brute force solution would repeatedly count character frequencies for each letter separately.

For every character from 'a' to 'z', we could scan both strings and count how many times that character appears. Then we compare the counts. If any difference exceeds 3, we return false.

This works because it directly follows the problem definition. Every character frequency is checked independently.

However, this approach is inefficient because it repeatedly scans the strings. If the string length is n, and we scan both strings for each of the 26 letters, the complexity becomes O(26 × n). While still acceptable for small constraints, it performs unnecessary repeated work.

Optimal Approach

The key observation is that we only need the frequency of each letter once.

Instead of repeatedly scanning the strings, we can build frequency counts in a single pass. Since there are only 26 lowercase English letters, we can use two arrays of size 26.

We iterate through both strings once:

  • Increment the count for each character in word1
  • Increment the count for each character in word2

After building the frequency arrays, we compare the counts for each letter. If any absolute difference exceeds 3, we return false. Otherwise, we return true.

This solution is efficient, simple, and directly matches the problem requirements.

Approach Time Complexity Space Complexity Notes
Brute Force O(26 × n) O(1) Repeatedly scans strings for each letter
Optimal O(n + 26) O(1) Uses frequency counting with fixed-size arrays

Algorithm Walkthrough

  1. Create two arrays of size 26, initialized with zeros.

These arrays store character frequencies for word1 and word2. Index 0 represents 'a', index 1 represents 'b', and so on. 2. Traverse both strings once.

For every character in word1, compute its array index using:

ord(char) - ord('a')

Increment the corresponding frequency count.

Do the same for word2. 3. Compare frequencies for all 26 letters.

For every index from 0 to 25, compute the absolute difference between the two frequency arrays. 4. If any difference exceeds 3, immediately return false.

This means the strings are not almost equivalent. 5. If all differences are at most 3, return true.

This confirms the strings satisfy the problem condition.

Why it works

The algorithm works because it explicitly computes the exact frequency of every lowercase letter in both strings. Since the definition of almost equivalent depends entirely on these frequency differences, checking all 26 letters guarantees correctness. If every difference is at most 3, the strings satisfy the condition. If even one letter violates the limit, the strings are not almost equivalent.

Python Solution

class Solution:
    def checkAlmostEquivalent(self, word1: str, word2: str) -> bool:
        frequency1 = [0] * 26
        frequency2 = [0] * 26

        for char in word1:
            index = ord(char) - ord('a')
            frequency1[index] += 1

        for char in word2:
            index = ord(char) - ord('a')
            frequency2[index] += 1

        for i in range(26):
            if abs(frequency1[i] - frequency2[i]) > 3:
                return False

        return True

The implementation starts by creating two frequency arrays of size 26. Since the input contains only lowercase English letters, each letter maps directly to an index.

The first loop counts character occurrences in word1. The second loop does the same for word2.

After both frequency arrays are built, the final loop compares frequencies letter by letter. The abs() function computes the absolute difference between counts.

If any difference exceeds 3, the function immediately returns False. Otherwise, after checking all letters, it returns True.

This implementation exactly follows the algorithm described earlier and efficiently solves the problem in linear time.

Go Solution

func checkAlmostEquivalent(word1 string, word2 string) bool {
    frequency1 := make([]int, 26)
    frequency2 := make([]int, 26)

    for _, char := range word1 {
        index := char - 'a'
        frequency1[index]++
    }

    for _, char := range word2 {
        index := char - 'a'
        frequency2[index]++
    }

    for i := 0; i < 26; i++ {
        difference := frequency1[i] - frequency2[i]

        if difference < 0 {
            difference = -difference
        }

        if difference > 3 {
            return false
        }
    }

    return true
}

The Go solution follows the same logic as the Python version.

Instead of Python lists, Go uses integer slices created with make([]int, 26).

Go does not provide a built in absolute value function for integers in the standard library without conversion, so the code manually converts negative differences into positive values.

Since the alphabet size is fixed, the memory usage remains constant.

Worked Examples

Example 1

Input:
word1 = "aaaa"
word2 = "bccb"

Step 1: Build Frequency Arrays

Letter word1 Count word2 Count
a 4 0
b 0 2
c 0 2

All other letters have count 0.

Step 2: Compare Differences

Letter Difference
a 4
b 2
c 2

The difference for 'a' is 4, which is greater than 3.

Result: false

Example 2

Input:
word1 = "abcdeef"
word2 = "abaaacc"

Step 1: Build Frequency Arrays

Letter word1 Count word2 Count
a 1 4
b 1 1
c 1 2
d 1 0
e 2 0
f 1 0

Step 2: Compare Differences

Letter Difference
a 3
b 0
c 1
d 1
e 2
f 1

Every difference is at most 3.

Result: true

Example 3

Input:
word1 = "cccddabba"
word2 = "babababab"

Step 1: Build Frequency Arrays

Letter word1 Count word2 Count
a 2 4
b 3 5
c 3 0
d 2 0

Step 2: Compare Differences

Letter Difference
a 2
b 2
c 3
d 2

All differences are within the allowed range.

Result: true

Complexity Analysis

Measure Complexity Explanation
Time O(n + 26) One pass to count frequencies and one pass over 26 letters
Space O(1) Frequency arrays have fixed size 26

The runtime is linear in the length of the strings because we process each character once. The additional 26 operations for comparing frequencies are constant time.

The space complexity is constant because the frequency arrays always contain exactly 26 elements, regardless of input size.

Test Cases

solution = Solution()

assert solution.checkAlmostEquivalent("aaaa", "bccb") == False
# Difference of 4 for 'a'

assert solution.checkAlmostEquivalent("abcdeef", "abaaacc") == True
# Differences are all at most 3

assert solution.checkAlmostEquivalent("cccddabba", "babababab") == True
# Multiple letters differ, but all within limit

assert solution.checkAlmostEquivalent("a", "a") == True
# Smallest valid identical strings

assert solution.checkAlmostEquivalent("a", "b") == True
# Difference is 1 for both letters

assert solution.checkAlmostEquivalent("aaaa", "aaaa") == True
# Completely identical strings

assert solution.checkAlmostEquivalent("zzzz", "z") == True
# Difference exactly 3

assert solution.checkAlmostEquivalent("zzzzz", "z") == False
# Difference exceeds 3

assert solution.checkAlmostEquivalent("abc", "xyz") == True
# Each letter differs by only 1

assert solution.checkAlmostEquivalent("bbbbbbbb", "bbbb") == False
# Difference is 4 for 'b'
Test Why
"aaaa" vs "bccb" Validates rejection when difference exceeds 3
"abcdeef" vs "abaaacc" Confirms normal valid case
"cccddabba" vs "babababab" Tests multiple character comparisons
"a" vs "a" Smallest identical input
"a" vs "b" Smallest differing input
"aaaa" vs "aaaa" Identical larger strings
"zzzz" vs "z" Difference exactly equal to 3
"zzzzz" vs "z" Difference greater than 3
"abc" vs "xyz" Completely different letters
"bbbbbbbb" vs "bbbb" Large frequency imbalance

Edge Cases

One important edge case is when the frequency difference is exactly 3. The problem statement says the difference must be "at most 3", so a difference of exactly 3 is valid. A common bug is accidentally using >= 3 instead of > 3. The implementation correctly checks only for differences greater than 3.

Another important edge case is when a character appears in only one string. For example, "aaaa" and "bbbb" contain completely different characters. Missing characters must still be treated as having frequency 0. The frequency array approach naturally handles this because every letter has an entry initialized to zero.

A third edge case is very small inputs such as strings of length 1. These cases verify that the algorithm does not rely on longer strings or multiple occurrences. Since the solution simply counts frequencies and compares them, it works correctly even for the smallest valid inputs.