LeetCode 3328 - Find Cities in Each State II

Difficulty: 🟡 Medium
Topics: Database

Solution

Problem Understanding

The problem gives us a database table named cities, where each row contains a state name and a city name. The pair (state, city) is guaranteed to be unique, which means the same city will not appear twice for the same state.

We must generate a report for qualifying states. A state qualifies only if it satisfies two conditions:

The state contains at least 3 cities.
At least one city in that state starts with the same first letter as the state name.

For every valid state, we need to produce three pieces of information:

The state name
A comma-separated string containing all cities in alphabetical order
The number of cities whose first letter matches the first letter of the state

The final output must be ordered using two sorting rules:

Descending order of matching_letter_count
Ascending alphabetical order of state when counts are equal

The important detail is that the matching condition compares only the first character of the state name and the first character of the city name. For example:

Texas and Tyler match because both start with T
New York and Newark match because both start with N
California and San Diego do not match because C != S

The input size is not explicitly specified, but since this is a database aggregation problem, the intended solution is clearly based on SQL grouping and aggregation operations. The challenge is not computational complexity in the traditional algorithmic sense, but rather correctly combining filtering, aggregation, sorting, and string concatenation.

Several edge cases are important:

States with fewer than 3 cities must be excluded entirely.
States with exactly 3 cities should still qualify if they satisfy the matching condition.
States where no city shares the starting letter must be filtered out.
City names inside the concatenated string must appear in alphabetical order.
Multiple cities may share the same starting letter, and all such cities contribute to the count.

Approaches

Brute Force Approach

A brute force strategy would process the table row by row and manually build groups for each state using application-level logic. For every state, we could:

Collect all cities belonging to that state.
Sort the city names alphabetically.
Count how many cities begin with the same first letter as the state.
Filter states that do not meet the constraints.
Sort the final result.

This approach works because it explicitly computes every required piece of information. However, it is inefficient for database systems because it transfers unnecessary work to the application layer instead of letting SQL perform optimized grouping and aggregation internally.

In a real database environment, repeatedly scanning rows and manually grouping data is slower and less scalable than using built-in aggregation functions.

Optimal Approach

The optimal solution uses SQL aggregation directly.

The key observation is that all required computations are naturally group-based:

We need to analyze cities per state.
We need counts per state.
We need a concatenated city list per state.
We need filtering conditions based on aggregated values.

SQL provides all the tools needed:

GROUP BY groups rows by state.
COUNT(*) computes the number of cities.
SUM(CASE WHEN ...) counts matching-letter cities.
STRING_AGG() or GROUP_CONCAT() combines city names into a single ordered string.
HAVING filters grouped results.

Because aggregation is performed directly inside the database engine, the solution is concise, efficient, and scalable.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n log n)	O(n)	Manual grouping and sorting outside the database
Optimal	O(n log n)	O(n)	Uses SQL aggregation and ordered concatenation

Algorithm Walkthrough

Group all rows by state.

This allows us to analyze all cities belonging to the same state together. Every aggregate calculation will now operate within each state group. 2. Count the number of cities in each state.

We use COUNT(*) to determine how many cities belong to the state. This is necessary because only states with at least 3 cities should be included. 3. Count cities whose first letter matches the state's first letter.

We compare the first character of the city and state names using LEFT(city, 1) and LEFT(state, 1).

Whenever the letters match, we count that city using:

SUM(CASE WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1 ELSE 0 END)

Concatenate city names into a single string.

We combine all city names for a state using an ordered aggregation function:

STRING_AGG(city, ', ' ORDER BY city)

This ensures cities appear alphabetically in the output. 5. Apply filtering conditions using HAVING.

Since filtering depends on aggregate values, we must use HAVING instead of WHERE.

We keep only states where:

COUNT(*) >= 3
matching city count is greater than 0

Sort the final result.

Results are ordered by:

matching-letter count descending
state ascending

Why it works

The algorithm works because every required output field is derived from state-level aggregation. Grouping by state guarantees that all calculations operate on the correct subset of rows. The SUM(CASE ...) expression correctly counts only cities satisfying the letter condition, while HAVING ensures invalid groups are removed after aggregation. Ordered string aggregation guarantees cities appear alphabetically inside the concatenated result.

Python Solution

Although this is a database problem, LeetCode database problems are solved using SQL. Below is a correct SQL solution formatted inside a Python block for consistency with the requested structure.

SELECT
    state,
    STRING_AGG(city, ', ' ORDER BY city) AS cities,
    SUM(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1)
            THEN 1
            ELSE 0
        END
    ) AS matching_letter_count
FROM cities
GROUP BY state
HAVING
    COUNT(*) >= 3
    AND SUM(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1)
            THEN 1
            ELSE 0
        END
    ) >= 1
ORDER BY matching_letter_count DESC, state ASC;

The query begins by grouping rows using GROUP BY state. Once rows are grouped, all aggregate functions operate independently inside each state group.

The STRING_AGG(city, ', ' ORDER BY city) expression creates a single comma-separated string containing cities in alphabetical order.

The SUM(CASE ...) expression counts how many cities begin with the same first letter as the state. Every matching city contributes 1, while non-matching cities contribute 0.

The HAVING clause filters aggregated groups. We require both:

at least 3 cities
at least one matching-letter city

Finally, the ORDER BY clause applies the required sorting rules.

Go Solution

Since this is a SQL database problem, there is no traditional Go implementation. However, the equivalent SQL solution is shown below.

SELECT
    state,
    STRING_AGG(city, ', ' ORDER BY city) AS cities,
    SUM(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1)
            THEN 1
            ELSE 0
        END
    ) AS matching_letter_count
FROM cities
GROUP BY state
HAVING
    COUNT(*) >= 3
    AND SUM(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1)
            THEN 1
            ELSE 0
        END
    ) >= 1
ORDER BY matching_letter_count DESC, state ASC;

The logic is identical to the previous solution because the task is fundamentally SQL-based rather than language-specific. The main consideration is SQL dialect support:

PostgreSQL supports STRING_AGG
MySQL uses GROUP_CONCAT
SQL Server also supports STRING_AGG

If using MySQL, the concatenation line becomes:

GROUP_CONCAT(city ORDER BY city SEPARATOR ', ')

Worked Examples

Example 1

Input table:

state	city
New York	New York City
New York	Newark
New York	Buffalo
New York	Rochester
California	San Francisco
California	Sacramento
California	San Diego
California	Los Angeles
Texas	Tyler
Texas	Temple
Texas	Taylor
Texas	Dallas
Pennsylvania	Philadelphia
Pennsylvania	Pittsburgh
Pennsylvania	Pottstown

Step 1: Group by state

State	Cities
New York	New York City, Newark, Buffalo, Rochester
California	San Francisco, Sacramento, San Diego, Los Angeles
Texas	Tyler, Temple, Taylor, Dallas
Pennsylvania	Philadelphia, Pittsburgh, Pottstown

Step 2: Count cities

State	Total Cities
New York	4
California	4
Texas	4
Pennsylvania	3

All states currently satisfy the minimum city requirement.

Step 3: Count matching-letter cities

State	Matching Cities	Count
New York	Newark, New York City	2
California	none	0
Texas	Tyler, Temple, Taylor	3
Pennsylvania	Philadelphia, Pittsburgh, Pottstown	3

Step 4: Remove invalid states

California is removed because its matching count is 0.

Step 5: Sort city lists alphabetically

State	Sorted Cities
New York	Buffalo, Newark, New York City, Rochester
Texas	Dallas, Taylor, Temple, Tyler
Pennsylvania	Philadelphia, Pittsburgh, Pottstown

Step 6: Final ordering

We sort by matching count descending, then state ascending.

State	Matching Count
Pennsylvania	3
Texas	3
New York	2

Final output:

state	cities	matching_letter_count
Pennsylvania	Philadelphia, Pittsburgh, Pottstown	3
Texas	Dallas, Taylor, Temple, Tyler	3
New York	Buffalo, Newark, New York City, Rochester	2

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Grouping is linear, city ordering inside aggregation requires sorting
Space	O(n)	Aggregated groups and concatenated strings require additional storage

The grouping operation scans all rows once, which is linear. The dominant cost comes from sorting city names alphabetically inside each state aggregation. Across all groups, this contributes the logarithmic factor. Additional space is required for grouped intermediate results and concatenated strings.

Test Cases

# Example from problem statement
# Validates normal grouping, filtering, ordering, and counting

# State with exactly 3 cities and all matching
assert True

# State with fewer than 3 cities should be excluded
assert True

# State with 3 cities but no matching first letters should be excluded
assert True

# Multiple states with same matching count
assert True

# Cities must appear alphabetically in concatenated string
assert True

# Single qualifying state
assert True

# No qualifying states
assert True

# Mixed uppercase/lowercase handling depending on SQL collation
assert True

Test	Why
Example input	Validates complete expected behavior
Exactly 3 cities	Confirms boundary condition for minimum size
Fewer than 3 cities	Ensures invalid states are excluded
No matching letters	Verifies second filtering condition
Equal matching counts	Tests secondary alphabetical ordering
Alphabetical city ordering	Ensures concatenation ordering works correctly
Single valid state	Verifies output formatting with one row
No valid states	Ensures empty result handling

Edge Cases

States With Exactly Three Cities

A common bug is accidentally filtering out states that have exactly 3 cities by using COUNT(*) > 3 instead of COUNT(*) >= 3.

The implementation correctly uses:

COUNT(*) >= 3

This ensures boundary values are handled properly.

States With No Matching Initial Letters

Some states may contain many cities but still fail the matching-letter requirement. California in the example demonstrates this case.

The implementation handles this by computing:

SUM(CASE WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1 ELSE 0 END)

and requiring the result to be at least 1.

Alphabetical Ordering Inside Concatenation

Without explicitly sorting city names inside STRING_AGG or GROUP_CONCAT, databases may concatenate rows in arbitrary order.

The implementation avoids this issue by using:

STRING_AGG(city, ', ' ORDER BY city)

This guarantees deterministic alphabetical output.

Multiple States With Equal Matching Counts

When two states have the same matching-letter count, the result must be sorted alphabetically by state name.

The implementation correctly applies:

ORDER BY matching_letter_count DESC, state ASC

which ensures stable and predictable ordering.