LeetCode 3692 - Majority Frequency Characters

The problem asks us to analyze the frequency distribution of characters in a string and identify a particular group of characters based on how often they appear. We are given a string s containing only lowercase English letters.

LeetCode Problem 3692

Difficulty: 🟢 Easy
Topics: Hash Table, String, Counting

Solution

Problem Understanding

The problem asks us to analyze the frequency distribution of characters in a string and identify a particular group of characters based on how often they appear.

We are given a string s containing only lowercase English letters. For every distinct character, we first count how many times it appears. Characters with the same count belong to the same frequency group.

For example, if:

s = "aaabbbccdddde"

then the character frequencies are:

a → 3
b → 3
c → 2
d → 4
e → 1

From this, we build frequency groups:

1 → {e}
2 → {c}
3 → {a, b}
4 → {d}

The majority frequency group is the frequency group containing the largest number of distinct characters. In this example, frequency 3 contains two distinct characters (a and b), which is larger than all other groups, so the answer is "ab" in any order.

There is one additional rule that makes the problem slightly more interesting. If multiple frequency groups contain the same number of distinct characters, we choose the group with the larger frequency value k.

For example:

s = "pfpfgi"

gives:

2 → {p, f}
1 → {g, i}

Both groups contain two characters, so we break the tie by choosing the larger frequency, which is 2. Therefore, the answer is "pf".

The constraints are small:

1 <= s.length <= 100

Since the string length is at most 100, efficiency is not a major concern. Even moderately inefficient solutions would pass comfortably. However, we should still design a clean and logically efficient solution.

Several edge cases are important to recognize upfront. If the string contains only one character, that character must be returned. If every character appears exactly once, the entire string's set of distinct characters belongs to frequency group 1. Another subtle case occurs when two groups have equal size, requiring us to correctly apply the tie-breaking rule and choose the larger frequency.

Approaches

Brute Force Approach

A brute-force strategy would first count the frequency of every character. After that, for every possible frequency value from 1 to len(s), we could repeatedly scan all characters and collect those whose frequency matches the current value.

For each frequency k, we would determine the size of its group and keep track of the best candidate seen so far. If two groups have the same size, we would choose the larger frequency.

This approach is correct because it explicitly checks every possible frequency group and compares them according to the problem rules. However, it performs unnecessary repeated scans over the character set. Since the maximum string length is small, this inefficiency is harmless here, but conceptually it is wasteful.

Optimal Approach

The key observation is that once we know each character's frequency, we can immediately regroup characters by frequency.

Instead of repeatedly scanning all characters for each possible frequency, we create a second hash map:

frequency → list of characters

For example:

a → 3
b → 3
c → 2
d → 4
e → 1

becomes:

3 → [a, b]
2 → [c]
4 → [d]
1 → [e]

Now the problem becomes straightforward. We iterate through these frequency groups and select the one with:

  1. The largest number of characters.
  2. The larger frequency in case of ties.

Because each character is processed only once during regrouping and once during selection, this solution is both simple and efficient.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Repeatedly scans characters for every possible frequency
Optimal O(n) O(n) Uses two hash maps to group by frequency directly

Algorithm Walkthrough

  1. Create a hash map to count character frequencies.

Traverse the string once and count how many times each character appears. A hash map is ideal because it provides constant-time updates and lookups.

Example:

s = "aaabbbccdddde"

{
    'a': 3,
    'b': 3,
    'c': 2,
    'd': 4,
    'e': 1
}
  1. Create a second hash map that groups characters by frequency.

Iterate through the character-frequency map. For every (character, frequency) pair, append the character to the list corresponding to that frequency.

Result:

{
    3: ['a', 'b'],
    2: ['c'],
    4: ['d'],
    1: ['e']
}
  1. Track the best frequency group.

Maintain two variables:

  • best_frequency
  • best_size

Iterate through every frequency group.

If the current group contains more characters than the best one seen so far, update the answer.

If the sizes tie, compare frequencies and keep the larger frequency. 4. Return the characters in the winning group as a string.

Since the problem allows any order, we can simply join the stored character list.

Why it works

The algorithm works because every character belongs to exactly one frequency group. By regrouping characters according to frequency, we transform the problem into selecting the best group under a deterministic ordering:

  1. Prefer larger group size.
  2. If sizes tie, prefer larger frequency.

Since every group is examined exactly once and compared against the same criteria, the selected group must be the correct majority frequency group.

Python Solution

from collections import defaultdict
from typing import Dict, List

class Solution:
    def majorityFrequencyGroup(self, s: str) -> str:
        char_frequency: Dict[str, int] = {}

        for character in s:
            char_frequency[character] = (
                char_frequency.get(character, 0) + 1
            )

        frequency_groups: Dict[int, List[str]] = defaultdict(list)

        for character, frequency in char_frequency.items():
            frequency_groups[frequency].append(character)

        best_frequency = -1
        best_size = -1
        result_group: List[str] = []

        for frequency, characters in frequency_groups.items():
            current_size = len(characters)

            if (
                current_size > best_size
                or (
                    current_size == best_size
                    and frequency > best_frequency
                )
            ):
                best_size = current_size
                best_frequency = frequency
                result_group = characters

        return "".join(result_group)

The implementation follows the algorithm exactly.

The first loop constructs char_frequency, which records how many times each character appears. We use a dictionary because it allows constant-time counting.

The second loop builds frequency_groups, where the key is a frequency and the value is a list of characters appearing exactly that many times. defaultdict(list) simplifies insertion because we do not need to manually initialize missing lists.

After grouping, we iterate through all frequency groups and keep track of the current best candidate. The comparison logic directly encodes the problem rules. A larger group size always wins, and if sizes are equal, the larger frequency wins.

Finally, we join the winning character list into a string and return it. Since the problem accepts any order, there is no need to sort the result.

Go Solution

func majorityFrequencyGroup(s string) string {
	charFrequency := make(map[rune]int)

	for _, ch := range s {
		charFrequency[ch]++
	}

	frequencyGroups := make(map[int][]rune)

	for ch, frequency := range charFrequency {
		frequencyGroups[frequency] = append(
			frequencyGroups[frequency],
			ch,
		)
	}

	bestFrequency := -1
	bestSize := -1
	var resultGroup []rune

	for frequency, characters := range frequencyGroups {
		currentSize := len(characters)

		if currentSize > bestSize ||
			(currentSize == bestSize &&
				frequency > bestFrequency) {

			bestSize = currentSize
			bestFrequency = frequency
			resultGroup = characters
		}
	}

	return string(resultGroup)
}

The Go implementation closely mirrors the Python version. A map[rune]int stores character frequencies, and map[int][]rune groups characters by frequency.

Go strings are UTF-8 encoded, so iterating with range yields rune values. Although the problem guarantees lowercase English letters, using rune is still idiomatic and safe.

Unlike Python, Go does not provide a built-in defaultdict, but appending to a nil slice works automatically, making group construction straightforward.

There are no concerns about integer overflow because the maximum string length is only 100.

Worked Examples

Example 1

Input:

s = "aaabbbccdddde"

Step 1: Count frequencies

Character Frequency
a 3
b 3
c 2
d 4
e 1

Step 2: Build frequency groups

Frequency Characters Group Size
3 [a, b] 2
2 [c] 1
4 [d] 1
1 [e] 1

Step 3: Select best group

Frequency Group Size Current Best
3 2 Update to [a, b]
2 1 Ignore
4 1 Ignore
1 1 Ignore

Output:

"ab"

Example 2

Input:

s = "abcd"

Step 1: Count frequencies

Character Frequency
a 1
b 1
c 1
d 1

Step 2: Build frequency groups

Frequency Characters Group Size
1 [a, b, c, d] 4

Step 3: Select best group

Only one group exists, so it becomes the answer.

Output:

"abcd"

Example 3

Input:

s = "pfpfgi"

Step 1: Count frequencies

Character Frequency
p 2
f 2
g 1
i 1

Step 2: Build frequency groups

Frequency Characters Group Size
2 [p, f] 2
1 [g, i] 2

Step 3: Tie-breaking

Frequency Group Size Decision
2 2 Temporary winner
1 2 Same size, lower frequency, ignore

Output:

"pf"

Complexity Analysis

Measure Complexity Explanation
Time O(n) We traverse the string once and process each distinct character once
Space O(n) Hash maps store frequencies and groups

The time complexity is linear because every character is counted once, and every distinct character is grouped once. Since there are at most 26 lowercase letters, the practical runtime is effectively constant, but asymptotically it is still O(n).

The space complexity is O(n) in the general sense because auxiliary maps are used. Under the lowercase English constraint, the space usage is bounded by 26, making it effectively constant in practice.

Test Cases

solution = Solution()

# Provided examples
assert set(solution.majorityFrequencyGroup("aaabbbccdddde")) == {"a", "b"}  # majority group size > others
assert set(solution.majorityFrequencyGroup("abcd")) == {"a", "b", "c", "d"}  # all same frequency
assert set(solution.majorityFrequencyGroup("pfpfgi")) == {"p", "f"}  # tie broken by larger frequency

# Boundary cases
assert solution.majorityFrequencyGroup("a") == "a"  # single character input
assert solution.majorityFrequencyGroup("zzzzz") == "z"  # one repeated character

# Tie-breaking cases
assert set(solution.majorityFrequencyGroup("aabbccddeeffg")) == {
    "a", "b", "c", "d", "e", "f"
}  # frequency 2 beats frequency 1

assert set(solution.majorityFrequencyGroup("aabbccd")) == {
    "a", "b", "c"
}  # larger group wins

# Mixed frequencies
assert set(solution.majorityFrequencyGroup("aaaabbcccdd")) == {
    "c", "d"
}  # frequency 2 group largest

# All unique
assert set(solution.majorityFrequencyGroup("xyz")) == {
    "x", "y", "z"
}  # every character frequency 1

# Multiple frequency groups
assert set(solution.majorityFrequencyGroup("aabbbcccc")) == {
    "a", "b", "c"
}  # all groups size 1, choose highest frequency
Test Why
"aaabbbccdddde" Validates standard grouping
"abcd" Ensures all unique characters work
"pfpfgi" Validates tie-breaking by larger frequency
"a" Smallest valid input
"zzzzz" Single repeated character
"aabbccddeeffg" Confirms frequency tie resolution
"aabbccd" Confirms largest group wins
"aaaabbcccdd" Tests uneven distribution
"xyz" All characters unique
"aabbbcccc" Verifies single-member frequency tie

Edge Cases

One important edge case occurs when the input contains only a single character, such as "a". A naive implementation might incorrectly assume multiple groups exist or mishandle initialization variables. In our implementation, the only frequency group becomes the winner immediately, and the function correctly returns "a".

Another subtle case occurs when multiple groups contain the same number of characters. For example:

"pfpfgi"

Both frequency groups contain two characters. A buggy implementation might accidentally return the first group encountered rather than applying the tie-break rule. Our solution explicitly checks whether the frequency is larger when sizes are equal, ensuring correctness.

A third important case occurs when every character has the same frequency, such as "abcd" or "aabbcc". Here, only one frequency group exists, containing all distinct characters. The implementation naturally handles this because the grouping map contains exactly one entry, which is selected as the answer.

Finally, inputs with many distinct frequencies but only one character per group can be deceptive. For example:

"aabbbcccc"

Each frequency group contains exactly one character. Since the group sizes tie, the algorithm must select the group with the largest frequency. Our comparison logic guarantees that frequency 4 wins over 3 and 2.