Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting

Solution

LeetCode 3527 - Find the Most Common Response

Problem Understanding

This problem gives us a two dimensional array of strings called responses.

Each responses[i] represents all survey responses collected on the ith day. A response is simply a string such as "good", "ok", or "bad".

The important detail is that duplicate responses within the same day should only count once. Before calculating frequencies, we must remove duplicates from every individual day's response list.

For example:

["good", "ok", "good", "ok"]

becomes:

["good", "ok"]

because repeated occurrences within the same day do not contribute additional frequency.

After deduplicating each day independently, we count how many days each response appears in. The goal is to return the response with the highest frequency.

If multiple responses have the same maximum frequency, we return the lexicographically smallest one.

For example:

bad -> 2
good -> 2
ok -> 2

All three responses have the same frequency, so we choose "bad" because:

"bad" < "good" < "ok"

in lexicographical order.

Understanding the Constraints

The constraints are:

1 <= responses.length <= 1000
1 <= responses[i].length <= 1000
1 <= responses[i][j].length <= 10

In the worst case there can be:

1000 * 1000 = 1,000,000

individual response entries.

This immediately suggests that we need an algorithm that processes each response roughly once. An approach with quadratic behavior over all responses would be too expensive.

The short string length constraint, at most 10 characters, means string comparisons and hashing are relatively cheap.

Important Edge Cases

A few cases are worth identifying before designing the solution.

If a day contains many duplicates of the same response, they should only contribute one count. A naive frequency counter that counts every occurrence would produce incorrect results.

If there is only one unique response across all days, that response must be returned regardless of how many duplicates appear within individual days.

If several responses share the same maximum frequency, we must carefully apply the lexicographical tie-break rule.

The problem guarantees at least one response exists overall, so there will always be a valid answer.

Approaches

Brute Force

A straightforward approach is to first deduplicate every day's responses, then gather all unique response strings.

For every distinct response, we could scan every day and check whether that response appears in that day's deduplicated set.

If there are:

U unique responses
D days

then for each response we examine every day.

This works because we explicitly compute the number of days containing each response, but it performs unnecessary repeated scans.

When the number of unique responses becomes large, repeatedly traversing all days becomes inefficient.

Key Insight

Instead of counting one response at a time, we can process all responses simultaneously.

For each day:

Convert that day's responses into a set.
Iterate through the unique responses in that set.
Increment a global frequency counter for each response.

This guarantees that:

Duplicates within a day are ignored.
Every response contributes exactly one count per day in which it appears.

After building the frequency map, we simply find:

The maximum frequency.
The lexicographically smallest response among those with that frequency.

A hash map is ideal because it allows frequency updates in average O(1) time.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(U × D + N)	O(U)	Repeatedly scans all days for every unique response
Optimal	O(N)	O(U)	Single pass counting with per-day deduplication

Where:

N = total number of response entries across all days.
U = number of unique response strings overall.
D = number of days.

Algorithm Walkthrough

Create a hash map called frequency that maps each response string to the number of days in which it appears.
Process each day's response list independently.
Convert the day's response list into a set. This removes duplicates within that day, ensuring each response contributes at most one count for that day.
Iterate through every response in the set and increment its count in the global frequency map.
After all days have been processed, iterate through the frequency map to determine the best answer.
Keep track of:

The highest frequency seen so far.
The corresponding response string.

When examining a response:

If its frequency is greater than the current maximum, make it the new answer.
If its frequency equals the current maximum, compare the strings lexicographically and keep the smaller one.

Return the final answer string.

Why it works

The set conversion guarantees that each response contributes at most one count per day, exactly matching the problem requirement. The frequency map therefore records the number of deduplicated daily lists containing each response. Selecting the response with maximum frequency produces the most common response, and the lexicographical comparison correctly resolves ties. Since every response is counted accurately and every candidate is examined, the algorithm always returns the correct answer.

Problem Understanding

We are given a two-dimensional array responses, where each responses[i] represents all survey responses collected on day i. Each element responses[i][j] is a lowercase string.

The task is to compute a global frequency of responses, but with a crucial preprocessing step: within each individual day responses[i], duplicate responses must be removed before counting. In other words, each day contributes at most one count per distinct string appearing that day, regardless of how many times it repeats within that day.

After aggregating these deduplicated daily contributions across all days, we must return the response string with maximum frequency. If multiple strings share the same maximum frequency, we return the lexicographically smallest one.

The constraints imply up to $1000$ days and $1000$ responses per day, yielding up to $10^6$ raw entries. Each string length is at most 10, so hashing and comparisons are inexpensive. This strongly suggests an $O(n)$-type solution over all entries is necessary.

The key edge cases arise from repeated values within a single day, ties in global frequency, and minimal input sizes where only one string exists overall.

Approaches

The central difficulty is the requirement that duplicates within each day do not contribute multiple times. A naive interpretation might incorrectly count all occurrences directly.

Brute-force approach

A direct method would iterate over every day and every string, and for each string check whether it has already been seen earlier in the same day using a linear scan of a temporary list. If not seen, we increment its global count. This leads to repeated membership checks that cost $O(k)$ per insertion for a day of size $k$, resulting in quadratic behavior per day in the worst case.

This is correct because it enforces per-day uniqueness manually, but it is unnecessarily expensive given that set membership checks can be done in expected $O(1)$.

Optimal insight

The key observation is that the problem decomposes cleanly into two independent stages:

First, enforce per-day uniqueness using a set, since sets model mathematical distinctness exactly.

Second, aggregate counts globally using a hash map (dictionary), since we require frequency accumulation across independent groups.

This yields a total complexity linear in the number of raw entries.

Complexity comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	$O(\sum_i n_i^2)$	$O(1)$ extra (besides output)	Repeated linear membership checks per day
Optimal	$O(N)$	$O(U)$	Use set per day + hash map over all unique responses

Here $N$ is total number of raw responses, and $U$ is number of distinct responses globally.

Algorithm Walkthrough

We construct the solution in two conceptual phases: normalization and aggregation.

For each day $i$, we construct a set $S_i$ containing all distinct strings in responses[i]. The justification is that set construction ensures idempotence: repeated occurrences collapse into a single representative element.
For each string $s \in S_i$, we increment a global hash map freq[s] by 1. This models the fact that each day contributes at most one vote per unique string.
After processing all days, we must determine the maximum frequency value $M = \max_s freq[s]$.
We then select the lexicographically smallest string among all $s$ such that $freq[s] = M$. This requires a final linear scan over the dictionary.
The result is returned as that chosen string.

The hash map is the correct structure because it supports amortized $O(1)$ insertion and lookup, and we require associative aggregation keyed by string identity. The per-day set is required to enforce the constraint that duplicates within a day are collapsed before aggregation.

Why it works

The algorithm maintains the invariant that after processing day $i$, freq[s] equals the number of distinct days among $[0, i]$ in which $s$ appears at least once. This invariant matches the problem specification exactly, since each day contributes at most one occurrence per string. Therefore, maximizing this count and resolving ties lexicographically yields the correct result.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def findCommonResponse(self, responses: List[List[str]]) -> str:
        frequency = defaultdict(int)

        for day_responses in responses:
            unique_responses = set(day_responses)

            for response in unique_responses:
                frequency[response] += 1

        best_response = ""
        best_frequency = -1

        for response, count in frequency.items():
            if count > best_frequency:
                best_frequency = count
                best_response = response
            elif count == best_frequency and response < best_response:
                best_response = response

        return best_response

The implementation begins by creating a hash map called frequency. A defaultdict(int) is convenient because missing keys automatically start at zero.

For every day, the code converts the response list into a set. This is the critical step that removes duplicates within that day.

The unique responses are then added to the global frequency map. Each response receives exactly one increment for that day.

After all counts have been collected, the code scans the frequency map to find the best answer. The variables best_frequency and best_response track the current optimum. A higher frequency always wins. If frequencies are equal, lexicographical ordering determines the winner.

Finally, the selected response is returned. freq = defaultdict(int)

    for day in responses:
        seen = set(day)
        for resp in seen:
            freq[resp] += 1

    max_freq = 0
    answer = ""

    for resp, count in freq.items():
        if count > max_freq or (count == max_freq and resp < answer):
            max_freq = count
            answer = resp

    return answer


The implementation follows the algorithm directly. The dictionary `freq` stores global counts of distinct-day occurrences. The `seen` set removes intra-day duplicates. The final loop performs a single pass reduction to compute both the maximum frequency and lexicographic tie-breaking simultaneously.

## Go Solution

```go
func findCommonResponse(responses [][]string) string {
	frequency := make(map[string]int)

	for _, dayResponses := range responses {
		uniqueResponses := make(map[string]struct{})

		for _, response := range dayResponses {
			uniqueResponses[response] = struct{}{}
		}

		for response := range uniqueResponses {
			frequency[response]++
		}
	}

	bestResponse := ""
	bestFrequency := -1

	for response, count := range frequency {
		if count > bestFrequency {
			bestFrequency = count
			bestResponse = response
		} else if count == bestFrequency && response < bestResponse {
			bestResponse = response
		}
	}

	return bestResponse
}

The Go solution follows the same logic as the Python version.

Since Go does not have a built in set type, a map from string to empty struct is used to represent a set. The empty struct consumes zero storage per value and is the standard Go idiom for set implementations.

Frequency counting uses a map[string]int. Lexicographical comparison between strings is directly supported with the < operator.

No overflow concerns exist because the maximum count is at most the number of days, which is only 1000. freq := make(map[string]int)

for _, day := range responses {
	seen := make(map[string]struct{})
	for _, resp := range day {
		seen[resp] = struct{}{}
	}
	for resp := range seen {
		freq[resp]++
	}
}

maxFreq := 0
answer := ""

for resp, count := range freq {
	if count > maxFreq || (count == maxFreq && (answer == "" || resp < answer)) {
		maxFreq = count
		answer = resp
	}
}

return answer

}


In Go, we use `map[string]struct{}` to represent a set, since Go lacks a built-in set type. The logic is identical to the Python version. The only subtlety is initializing `answer` as an empty string and carefully handling lexicographic comparison when `answer` is still unset.

## Worked Examples

### Example 1

Input:

[ ["good","ok","good","ok"], ["ok","bad","good","ok","ok"], ["good"], ["bad"] ]


#### Day 1

Unique responses:

{"good", "ok"}


Frequency map:

| Response | Count |
| --- | --- |
| good | 1 |
| ok | 1 |

#### Day 2

Unique responses:

{"ok", "bad", "good"}


Frequency map:

| Response | Count |
| --- | --- |
| good | 2 |
| ok | 2 |
| bad | 1 |

#### Day 3

Unique responses:

{"good"}


Frequency map:

| Response | Count |
| --- | --- |
| good | 3 |
| ok | 2 |
| bad | 1 |

#### Day 4

Unique responses:

{"bad"}


Frequency map:

| Response | Count |
| --- | --- |
| good | 3 |
| ok | 2 |
| bad | 2 |

Maximum frequency:

good -> 3


Answer:

"good"


### Example 2

Input:

[ ["good","ok","good"], ["ok","bad"], ["bad","notsure"], ["great","good"] ]


#### Day 1

| Response | Count |
| --- | --- |
| good | 1 |
| ok | 1 |

#### Day 2

| Response | Count |
| --- | --- |
| good | 1 |
| ok | 2 |
| bad | 1 |

#### Day 3

| Response | Count |
| --- | --- |
| good | 1 |
| ok | 2 |
| bad | 2 |
| notsure | 1 |

#### Day 4

| Response | Count |
`[["good","ok","good","ok"],["ok","bad","good","ok","ok"],["good"],["bad"]]`

After per-day deduplication:

Day 0 → {"good", "ok"}

Day 1 → {"ok", "bad", "good"}

Day 2 → {"good"}

Day 3 → {"bad"}

Now we build frequency:

| Response | Count accumulation |
| --- | --- |
| good | Day 0, Day 1, Day 2 → 3 |
| ok | Day 0, Day 1 → 2 |
| bad | Day 1, Day 3 → 2 |

Maximum frequency is 3, so answer is `"good"`.

### Example 2

After deduplication:

Day 0 → {"good", "ok"}

Day 1 → {"ok", "bad"}

Day 2 → {"bad", "notsure"}

Day 3 → {"great", "good"}

Counts:

| Response | Days |
| --- | --- |
| good | 2 |
| ok | 2 |
| bad | 2 |
| notsure | 1 |
| great | 1 |

The highest frequency is:


Responses with frequency 2:

bad good ok


Lexicographically:

bad < good < ok


Answer:

"bad"

Tie among `"bad"`, `"good"`, `"ok"` with frequency 2. Lexicographically smallest is `"bad"`.

## Complexity Analysis

| Measure | Complexity | Explanation |
| --- | --- | --- |
| Time | O(N) | Every response is processed once while building daily sets and counts |
| Space | O(U) | Stores the frequency map and temporary sets of unique responses |

Here, `N` is the total number of response entries across all days, and `U` is the total number of distinct response strings.

Although a set is created for each day, the total amount of work across all days remains proportional to the total number of input responses. Hash table operations are average `O(1)`, giving overall linear time complexity.

## Test Cases

```python
from typing import List

s = Solution()

assert s.findCommonResponse([
    ["good","ok","good","ok"],
    ["ok","bad","good","ok","ok"],
    ["good"],
    ["bad"]
]) == "good"  # Example 1

assert s.findCommonResponse([
    ["good","ok","good"],
    ["ok","bad"],
    ["bad","notsure"],
    ["great","good"]
]) == "bad"  # Example 2, lexicographical tie

assert s.findCommonResponse([
    ["good"]
]) == "good"  # Single day, single response

assert s.findCommonResponse([
    ["good","good","good","good"]
]) == "good"  # All duplicates in one day

assert s.findCommonResponse([
    ["a"],
    ["b"],
    ["c"]
]) == "a"  # All frequencies equal

assert s.findCommonResponse([
    ["x","x","x"],
    ["x"],
    ["x","x"]
]) == "x"  # Same response across all days

assert s.findCommonResponse([
    ["apple","banana"],
    ["banana","cherry"],
    ["apple","cherry"]
]) == "apple"  # Three-way tie

assert s.findCommonResponse([
    ["z"],
    ["a"],
    ["z"],
    ["a"]
]) == "a"  # Tie resolved lexicographically

assert s.findCommonResponse([
    ["cat","dog","cat"],
    ["dog","dog","dog"],
    ["cat"]
]) == "cat"  # Duplicate removal affects counting

Test Summary

Test	Why
Example 1	Validates standard counting behavior
Example 2	Validates lexicographical tie breaking
Single response	Smallest valid input
All duplicates in one day	Ensures per-day deduplication
All frequencies equal	Tests tie handling
Same response everywhere	Tests dominant response case
Three-way tie	Verifies correct lexicographical selection
Two-way tie	Additional tie-breaking validation
Duplicate-heavy input	Confirms duplicates do not inflate counts

Edge Cases

Multiple Duplicates Within a Single Day

A common mistake is counting every occurrence of a response. Consider:

["good", "good", "good"]

The problem requires this to contribute only one occurrence for "good" on that day. The implementation handles this by converting each day's responses into a set before updating frequencies.

All Responses Have the Same Frequency

Consider:

[
    ["a"],
    ["b"],
    ["c"]
]

Every response appears exactly once. The correct answer is "a" because it is lexicographically smallest. The implementation explicitly compares strings whenever frequencies are tied.

Large Numbers of Repeated Responses

Consider:

[
    ["x"] * 1000,
    ["x"] * 1000,
    ["x"] * 1000
]

A buggy solution might count 3000 occurrences. The correct answer is that "x" appears on three days. Using a set per day ensures each day contributes exactly one count regardless of how many duplicates exist.

Only One Distinct Response Overall

Consider:

[
    ["good", "good"],
    ["good"],
    ["good", "good", "good"]
]

There is only one unique response in the entire dataset. The frequency map will contain a single key, and the algorithm correctly returns "good" without requiring any special handling. | Time | $O(N)$ | Each response is inserted into a set once per day and counted once globally | | Space | $O(U)$ | Hash map stores one entry per distinct response; per-day set is transient |

The linearity follows because each string is processed a constant number of times: once for insertion into a set and once for aggregation into the global frequency table.

Test Cases

assert Solution().findCommonResponse([["a","a","a"]]) == "a"  # single day with duplicates
assert Solution().findCommonResponse([["a"],["b"]]) == "a"  # tie broken lexicographically
assert Solution().findCommonResponse([["c","b"],["a"]]) == "a"  # all equal frequency
assert Solution().findCommonResponse([["x","y","x"],["y","y","z"]]) == "y"  # mixed duplicates
assert Solution().findCommonResponse([["good"],["good"],["bad"]]) == "good"  # frequency dominance
assert Solution().findCommonResponse([["aa","ab"],["ab","aa"]]) == "aa"  # lexicographic tie

Test	Why
single repeated day	validates intra-day deduplication
two singletons	tests lexicographic tie-break
equal frequencies	ensures correct tie resolution
mixed duplicates	stresses both set and map logic
dominance case	verifies correct max tracking
close lexicographic strings	ensures string comparison correctness

Edge Cases

One important edge case is when all entries in a day are identical. Without deduplication, this would incorrectly inflate frequency counts. The use of a set guarantees that such repetition collapses into a single contribution.

Another edge case occurs when multiple responses share the same maximum frequency. This tests whether the algorithm correctly applies lexicographic ordering after frequency maximization. The final selection loop explicitly checks both count and string order, ensuring correctness.

A third edge case is the minimal input scenario where there is only one response string across all days. In this case, the algorithm correctly initializes max_freq to zero and updates it upon first encounter, returning that single string without requiring special-case logic.