LeetCode 1002 - Find Common Characters

The problem gives us an array of lowercase strings called words. We must return all characters that appear in every string in the array, including duplicate occurrences. The important detail is that duplicates matter.

LeetCode Problem 1002

Difficulty: 🟢 Easy
Topics: Array, Hash Table, String

Solution

Problem Understanding

The problem gives us an array of lowercase strings called words. We must return all characters that appear in every string in the array, including duplicate occurrences.

The important detail is that duplicates matter. If a character appears multiple times in every word, then it must appear multiple times in the answer as well.

For example, consider:

["bella","label","roller"]

The character 'l' appears:

  • 2 times in "bella"
  • 2 times in "label"
  • 2 times in "roller"

Since every word contains at least two 'l' characters, the answer must include "l" twice.

However, the character 'b' does not appear in "roller", so it cannot be included.

The input constraints are small:

  • Up to 100 words
  • Each word length up to 100
  • Only lowercase English letters

Because there are only 26 possible lowercase letters, frequency counting becomes a very natural and efficient approach.

A few important observations and edge cases emerge immediately:

  • A character may appear many times in one word but only once in another. We can only include it the minimum number of times across all words.
  • Some words may contain no shared characters at all.
  • The array may contain only one word, in which case every character from that word belongs in the result.
  • Since only lowercase English letters are used, a fixed-size frequency array of length 26 is sufficient.

Approaches

Brute Force Approach

A brute force solution could try every character from the first word and check whether it exists in every other word. To correctly handle duplicates, we would also need to track how many times each character has already been consumed in each string.

One possible implementation would repeatedly:

  1. Pick a character from the first word.
  2. Search for that character in every other word.
  3. Remove or mark one occurrence from each word if found.
  4. Add the character to the answer.

This works because a character can only belong in the result if every word can contribute one unused occurrence of it.

However, this approach becomes inefficient because repeated searching and removal operations are expensive. String operations can degrade toward quadratic behavior, especially when repeatedly scanning words.

Optimal Approach

The key insight is that the number of times a character can appear in the final answer is determined by its minimum frequency across all words.

For example:

words = ["bella", "label", "roller"]

Character counts:

Character bella label roller Minimum
e 1 1 1 1
l 2 2 2 2
a 1 1 0 0

Only characters with minimum frequency greater than zero belong in the result.

Since there are only 26 lowercase letters, we can:

  1. Count frequencies in the first word.
  2. For every remaining word:
  • Count frequencies again.
  • Update the global frequency using the minimum count for each letter.
  1. Build the final answer using the remaining frequencies.

This is efficient because each word is processed once, and each frequency comparison only involves 26 letters.

Approach Time Complexity Space Complexity Notes
Brute Force O(n × m²) O(m) Repeated searching and removal operations
Optimal O(n × m) O(1) Uses frequency counting over 26 letters

Here:

  • n = number of words
  • m = average word length

Algorithm Walkthrough

  1. Create a frequency array of size 26 using the first word.

This array represents the current minimum frequency of each character seen across all processed words. Initially, the first word defines those frequencies. 2. Process each remaining word one by one.

For every word, create another frequency array of size 26 to count how many times each character appears in that specific word. 3. Update the global minimum frequencies.

For each character from 'a' to 'z', take:

global_frequency[i] = min(global_frequency[i], current_frequency[i])

This ensures the array always stores the minimum occurrence count across all words processed so far. 4. Build the result array.

Iterate through the final frequency array. If a character has frequency k, append that character k times to the answer. 5. Return the result.

Why it works

The algorithm maintains an invariant:

After processing each word, the global frequency array stores the minimum number of occurrences of every character across all processed words.

A character can only appear in the final answer as many times as the smallest count among all words. By repeatedly taking minimum frequencies, we guarantee the final result contains exactly the characters common to every string, including duplicates.

Python Solution

from typing import List

class Solution:
    def commonChars(self, words: List[str]) -> List[str]:
        # Initialize frequency array using the first word
        min_frequency = [0] * 26

        for char in words[0]:
            min_frequency[ord(char) - ord('a')] += 1

        # Process remaining words
        for word in words[1:]:
            current_frequency = [0] * 26

            for char in word:
                current_frequency[ord(char) - ord('a')] += 1

            # Keep minimum frequencies
            for i in range(26):
                min_frequency[i] = min(
                    min_frequency[i],
                    current_frequency[i]
                )

        # Build result
        result = []

        for i in range(26):
            result.extend(
                [chr(i + ord('a'))] * min_frequency[i]
            )

        return result

The implementation begins by creating a frequency array for the first word. This array acts as the running intersection of character counts across all words.

For each additional word, another frequency array is built. The algorithm then updates the global frequencies by taking the minimum value for every character. This step is the core of the solution because it guarantees that only universally shared occurrences survive.

Finally, the result list is constructed by expanding each character according to its remaining frequency. If the frequency of 'l' is 2, then "l" is added twice.

The implementation uses fixed-size arrays instead of hash maps because the problem guarantees only lowercase English letters. This improves both simplicity and efficiency.

Go Solution

func commonChars(words []string) []string {
    minFrequency := make([]int, 26)

    // Initialize using first word
    for _, ch := range words[0] {
        minFrequency[ch-'a']++
    }

    // Process remaining words
    for i := 1; i < len(words); i++ {
        currentFrequency := make([]int, 26)

        for _, ch := range words[i] {
            currentFrequency[ch-'a']++
        }

        // Keep minimum frequencies
        for j := 0; j < 26; j++ {
            if currentFrequency[j] < minFrequency[j] {
                minFrequency[j] = currentFrequency[j]
            }
        }
    }

    // Build result
    result := []string{}

    for i := 0; i < 26; i++ {
        for j := 0; j < minFrequency[i]; j++ {
            result = append(result, string(rune(i+'a')))
        }
    }

    return result
}

The Go implementation follows the same logic as the Python solution. Fixed-size integer slices are used for frequency counting.

One Go-specific detail is converting characters back into strings:

string(rune(i + 'a'))

This converts the integer offset into the correct character before appending it to the result slice.

Since the input constraints are small, there are no concerns about integer overflow.

Worked Examples

Example 1

words = ["bella","label","roller"]

Initial frequency from "bella":

Character Count
a 1
b 1
e 1
l 2

Current minimum frequencies:

a:1 b:1 e:1 l:2

Process "label":

Character Count
a 1
b 1
e 1
l 2

Take minimums:

Character Previous Current New Minimum
a 1 1 1
b 1 1 1
e 1 1 1
l 2 2 2

Process "roller":

Character Count
e 1
l 2
o 1
r 2

Update minimums:

Character Previous Current New Minimum
a 1 0 0
b 1 0 0
e 1 1 1
l 2 2 2

Final frequencies:

e:1 l:2

Result:

["e","l","l"]

Example 2

words = ["cool","lock","cook"]

Initial frequencies from "cool":

Character Count
c 1
l 1
o 2

Process "lock":

Character Count
c 1
l 1
o 1
k 1

Updated minimums:

Character Previous Current New Minimum
c 1 1 1
l 1 1 1
o 2 1 1

Process "cook":

Character Count
c 1
o 2
k 1

Updated minimums:

Character Previous Current New Minimum
c 1 1 1
l 1 0 0
o 1 2 1

Final frequencies:

c:1 o:1

Result:

["c","o"]

Complexity Analysis

Measure Complexity Explanation
Time O(n × m) Each character of each word is processed once
Space O(1) Frequency arrays always have fixed size 26

The time complexity comes from scanning every character in every word exactly once. Frequency comparisons only involve 26 letters, which is constant time.

The space complexity is constant because the algorithm only stores a few fixed-size arrays of length 26, regardless of input size.

Test Cases

from typing import List

class Solution:
    def commonChars(self, words: List[str]) -> List[str]:
        min_frequency = [0] * 26

        for char in words[0]:
            min_frequency[ord(char) - ord('a')] += 1

        for word in words[1:]:
            current_frequency = [0] * 26

            for char in word:
                current_frequency[ord(char) - ord('a')] += 1

            for i in range(26):
                min_frequency[i] = min(
                    min_frequency[i],
                    current_frequency[i]
                )

        result = []

        for i in range(26):
            result.extend(
                [chr(i + ord('a'))] * min_frequency[i]
            )

        return result

solution = Solution()

assert sorted(solution.commonChars(
    ["bella", "label", "roller"]
)) == ["e", "l", "l"]  # Provided example

assert sorted(solution.commonChars(
    ["cool", "lock", "cook"]
)) == ["c", "o"]  # Provided example

assert sorted(solution.commonChars(
    ["abc"]
)) == ["a", "b", "c"]  # Single word

assert sorted(solution.commonChars(
    ["a", "a", "a"]
)) == ["a"]  # Same single character

assert sorted(solution.commonChars(
    ["aaa", "aa", "aaaa"]
)) == ["a", "a"]  # Duplicate preservation

assert sorted(solution.commonChars(
    ["abc", "def", "ghi"]
)) == []  # No common characters

assert sorted(solution.commonChars(
    ["zzzz", "zz", "zzz"]
)) == ["z", "z"]  # Minimum frequency across words

assert sorted(solution.commonChars(
    ["abca", "bcaa", "caba"]
)) == ["a", "b", "c"]  # Mixed frequencies

assert sorted(solution.commonChars(
    ["leetcode", "coolcode", "decode"]
)) == ["c", "d", "e", "o"]  # Larger overlap case
Test Why
["bella","label","roller"] Validates duplicate handling
["cool","lock","cook"] Validates minimum frequency logic
["abc"] Tests single-word input
["a","a","a"] Tests simplest repeated character case
["aaa","aa","aaaa"] Ensures duplicates are preserved correctly
["abc","def","ghi"] Tests no common characters
["zzzz","zz","zzz"] Validates minimum count selection
["abca","bcaa","caba"] Tests mixed frequency interactions
["leetcode","coolcode","decode"] Tests more realistic overlapping strings

Edge Cases

Single Word Input

If the input contains only one word, every character in that word should appear in the result. A buggy implementation might incorrectly assume multiple comparisons are necessary.

This solution handles the case naturally because the initial frequency array is built from the first word, and no further updates occur.

No Common Characters

Inputs like:

["abc", "def", "ghi"]

contain no shared letters. Some implementations may accidentally keep stale frequencies from earlier words.

This implementation avoids that issue by continuously taking minimum frequencies. Any character absent from a word immediately becomes zero and can never reappear.

Duplicate Character Counts Differ

Consider:

["aaa", "aa", "aaaa"]

The correct answer is:

["a", "a"]

A common mistake is checking only whether a character exists, instead of tracking how many times it exists.

This implementation correctly stores exact frequencies and always keeps the minimum count across all words, which preserves duplicate behavior accurately.