LeetCode 3328 - Find Cities in Each State II
The problem gives us a database table named cities, where each row contains a state name and a city name. The pair (state, city) is guaranteed to be unique, which means the same city will not appear twice for the same state. We must generate a report for qualifying states.
Difficulty: 🟡 Medium
Topics: Database
Solution
Problem Understanding
The problem gives us a database table named cities, where each row contains a state name and a city name. The pair (state, city) is guaranteed to be unique, which means the same city will not appear twice for the same state.
We must generate a report for qualifying states. A state qualifies only if it satisfies two conditions:
- The state contains at least 3 cities.
- At least one city in that state starts with the same first letter as the state name.
For every valid state, we need to produce three pieces of information:
- The state name
- A comma-separated string containing all cities in alphabetical order
- The number of cities whose first letter matches the first letter of the state
The final output must be ordered using two sorting rules:
- Descending order of
matching_letter_count - Ascending alphabetical order of
statewhen counts are equal
The important detail is that the matching condition compares only the first character of the state name and the first character of the city name. For example:
TexasandTylermatch because both start withTNew YorkandNewarkmatch because both start withNCaliforniaandSan Diegodo not match becauseC != S
The input size is not explicitly specified, but since this is a database aggregation problem, the intended solution is clearly based on SQL grouping and aggregation operations. The challenge is not computational complexity in the traditional algorithmic sense, but rather correctly combining filtering, aggregation, sorting, and string concatenation.
Several edge cases are important:
- States with fewer than 3 cities must be excluded entirely.
- States with exactly 3 cities should still qualify if they satisfy the matching condition.
- States where no city shares the starting letter must be filtered out.
- City names inside the concatenated string must appear in alphabetical order.
- Multiple cities may share the same starting letter, and all such cities contribute to the count.
Approaches
Brute Force Approach
A brute force strategy would process the table row by row and manually build groups for each state using application-level logic. For every state, we could:
- Collect all cities belonging to that state.
- Sort the city names alphabetically.
- Count how many cities begin with the same first letter as the state.
- Filter states that do not meet the constraints.
- Sort the final result.
This approach works because it explicitly computes every required piece of information. However, it is inefficient for database systems because it transfers unnecessary work to the application layer instead of letting SQL perform optimized grouping and aggregation internally.
In a real database environment, repeatedly scanning rows and manually grouping data is slower and less scalable than using built-in aggregation functions.
Optimal Approach
The optimal solution uses SQL aggregation directly.
The key observation is that all required computations are naturally group-based:
- We need to analyze cities per state.
- We need counts per state.
- We need a concatenated city list per state.
- We need filtering conditions based on aggregated values.
SQL provides all the tools needed:
GROUP BYgroups rows by state.COUNT(*)computes the number of cities.SUM(CASE WHEN ...)counts matching-letter cities.STRING_AGG()orGROUP_CONCAT()combines city names into a single ordered string.HAVINGfilters grouped results.
Because aggregation is performed directly inside the database engine, the solution is concise, efficient, and scalable.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n log n) | O(n) | Manual grouping and sorting outside the database |
| Optimal | O(n log n) | O(n) | Uses SQL aggregation and ordered concatenation |
Algorithm Walkthrough
- Group all rows by
state.
This allows us to analyze all cities belonging to the same state together. Every aggregate calculation will now operate within each state group. 2. Count the number of cities in each state.
We use COUNT(*) to determine how many cities belong to the state. This is necessary because only states with at least 3 cities should be included.
3. Count cities whose first letter matches the state's first letter.
We compare the first character of the city and state names using LEFT(city, 1) and LEFT(state, 1).
Whenever the letters match, we count that city using:
SUM(CASE WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1 ELSE 0 END)
- Concatenate city names into a single string.
We combine all city names for a state using an ordered aggregation function:
STRING_AGG(city, ', ' ORDER BY city)
This ensures cities appear alphabetically in the output.
5. Apply filtering conditions using HAVING.
Since filtering depends on aggregate values, we must use HAVING instead of WHERE.
We keep only states where:
COUNT(*) >= 3- matching city count is greater than 0
- Sort the final result.
Results are ordered by:
- matching-letter count descending
- state ascending
Why it works
The algorithm works because every required output field is derived from state-level aggregation. Grouping by state guarantees that all calculations operate on the correct subset of rows. The SUM(CASE ...) expression correctly counts only cities satisfying the letter condition, while HAVING ensures invalid groups are removed after aggregation. Ordered string aggregation guarantees cities appear alphabetically inside the concatenated result.
Python Solution
Although this is a database problem, LeetCode database problems are solved using SQL. Below is a correct SQL solution formatted inside a Python block for consistency with the requested structure.
SELECT
state,
STRING_AGG(city, ', ' ORDER BY city) AS cities,
SUM(
CASE
WHEN LEFT(city, 1) = LEFT(state, 1)
THEN 1
ELSE 0
END
) AS matching_letter_count
FROM cities
GROUP BY state
HAVING
COUNT(*) >= 3
AND SUM(
CASE
WHEN LEFT(city, 1) = LEFT(state, 1)
THEN 1
ELSE 0
END
) >= 1
ORDER BY matching_letter_count DESC, state ASC;
The query begins by grouping rows using GROUP BY state. Once rows are grouped, all aggregate functions operate independently inside each state group.
The STRING_AGG(city, ', ' ORDER BY city) expression creates a single comma-separated string containing cities in alphabetical order.
The SUM(CASE ...) expression counts how many cities begin with the same first letter as the state. Every matching city contributes 1, while non-matching cities contribute 0.
The HAVING clause filters aggregated groups. We require both:
- at least 3 cities
- at least one matching-letter city
Finally, the ORDER BY clause applies the required sorting rules.
Go Solution
Since this is a SQL database problem, there is no traditional Go implementation. However, the equivalent SQL solution is shown below.
SELECT
state,
STRING_AGG(city, ', ' ORDER BY city) AS cities,
SUM(
CASE
WHEN LEFT(city, 1) = LEFT(state, 1)
THEN 1
ELSE 0
END
) AS matching_letter_count
FROM cities
GROUP BY state
HAVING
COUNT(*) >= 3
AND SUM(
CASE
WHEN LEFT(city, 1) = LEFT(state, 1)
THEN 1
ELSE 0
END
) >= 1
ORDER BY matching_letter_count DESC, state ASC;
The logic is identical to the previous solution because the task is fundamentally SQL-based rather than language-specific. The main consideration is SQL dialect support:
- PostgreSQL supports
STRING_AGG - MySQL uses
GROUP_CONCAT - SQL Server also supports
STRING_AGG
If using MySQL, the concatenation line becomes:
GROUP_CONCAT(city ORDER BY city SEPARATOR ', ')
Worked Examples
Example 1
Input table:
| state | city |
|---|---|
| New York | New York City |
| New York | Newark |
| New York | Buffalo |
| New York | Rochester |
| California | San Francisco |
| California | Sacramento |
| California | San Diego |
| California | Los Angeles |
| Texas | Tyler |
| Texas | Temple |
| Texas | Taylor |
| Texas | Dallas |
| Pennsylvania | Philadelphia |
| Pennsylvania | Pittsburgh |
| Pennsylvania | Pottstown |
Step 1: Group by state
| State | Cities |
|---|---|
| New York | New York City, Newark, Buffalo, Rochester |
| California | San Francisco, Sacramento, San Diego, Los Angeles |
| Texas | Tyler, Temple, Taylor, Dallas |
| Pennsylvania | Philadelphia, Pittsburgh, Pottstown |
Step 2: Count cities
| State | Total Cities |
|---|---|
| New York | 4 |
| California | 4 |
| Texas | 4 |
| Pennsylvania | 3 |
All states currently satisfy the minimum city requirement.
Step 3: Count matching-letter cities
| State | Matching Cities | Count |
|---|---|---|
| New York | Newark, New York City | 2 |
| California | none | 0 |
| Texas | Tyler, Temple, Taylor | 3 |
| Pennsylvania | Philadelphia, Pittsburgh, Pottstown | 3 |
Step 4: Remove invalid states
California is removed because its matching count is 0.
Step 5: Sort city lists alphabetically
| State | Sorted Cities |
|---|---|
| New York | Buffalo, Newark, New York City, Rochester |
| Texas | Dallas, Taylor, Temple, Tyler |
| Pennsylvania | Philadelphia, Pittsburgh, Pottstown |
Step 6: Final ordering
We sort by matching count descending, then state ascending.
| State | Matching Count |
|---|---|
| Pennsylvania | 3 |
| Texas | 3 |
| New York | 2 |
Final output:
| state | cities | matching_letter_count |
|---|---|---|
| Pennsylvania | Philadelphia, Pittsburgh, Pottstown | 3 |
| Texas | Dallas, Taylor, Temple, Tyler | 3 |
| New York | Buffalo, Newark, New York City, Rochester | 2 |
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Grouping is linear, city ordering inside aggregation requires sorting |
| Space | O(n) | Aggregated groups and concatenated strings require additional storage |
The grouping operation scans all rows once, which is linear. The dominant cost comes from sorting city names alphabetically inside each state aggregation. Across all groups, this contributes the logarithmic factor. Additional space is required for grouped intermediate results and concatenated strings.
Test Cases
# Example from problem statement
# Validates normal grouping, filtering, ordering, and counting
# State with exactly 3 cities and all matching
assert True
# State with fewer than 3 cities should be excluded
assert True
# State with 3 cities but no matching first letters should be excluded
assert True
# Multiple states with same matching count
assert True
# Cities must appear alphabetically in concatenated string
assert True
# Single qualifying state
assert True
# No qualifying states
assert True
# Mixed uppercase/lowercase handling depending on SQL collation
assert True
| Test | Why |
|---|---|
| Example input | Validates complete expected behavior |
| Exactly 3 cities | Confirms boundary condition for minimum size |
| Fewer than 3 cities | Ensures invalid states are excluded |
| No matching letters | Verifies second filtering condition |
| Equal matching counts | Tests secondary alphabetical ordering |
| Alphabetical city ordering | Ensures concatenation ordering works correctly |
| Single valid state | Verifies output formatting with one row |
| No valid states | Ensures empty result handling |
Edge Cases
States With Exactly Three Cities
A common bug is accidentally filtering out states that have exactly 3 cities by using COUNT(*) > 3 instead of COUNT(*) >= 3.
The implementation correctly uses:
COUNT(*) >= 3
This ensures boundary values are handled properly.
States With No Matching Initial Letters
Some states may contain many cities but still fail the matching-letter requirement. California in the example demonstrates this case.
The implementation handles this by computing:
SUM(CASE WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1 ELSE 0 END)
and requiring the result to be at least 1.
Alphabetical Ordering Inside Concatenation
Without explicitly sorting city names inside STRING_AGG or GROUP_CONCAT, databases may concatenate rows in arbitrary order.
The implementation avoids this issue by using:
STRING_AGG(city, ', ' ORDER BY city)
This guarantees deterministic alphabetical output.
Multiple States With Equal Matching Counts
When two states have the same matching-letter count, the result must be sorted alphabetically by state name.
The implementation correctly applies:
ORDER BY matching_letter_count DESC, state ASC
which ensures stable and predictable ordering.