LeetCode 2987 - Find Expensive Cities
This problem provides a database table named Listings that contains information about home listings. Each row represents a single home listing and contains three fields: - listingid, a unique identifier for the listing - city, the city where the home is located - price, the…
Difficulty: 🟢 Easy
Topics: Database
Solution
LeetCode 2987 - Find Expensive Cities
Problem Understanding
This problem provides a database table named Listings that contains information about home listings. Each row represents a single home listing and contains three fields:
listing_id, a unique identifier for the listingcity, the city where the home is locatedprice, the listed price of the home
The task is to find all cities whose average home price is greater than the national average home price.
The national average is calculated using all listings in the table, regardless of city. Once that value is known, we must calculate the average price for each individual city and return only those cities whose city average exceeds the national average.
The result should contain a single column:
| city |
|---|
| city name |
The output must be sorted alphabetically in ascending order by city name.
The example demonstrates this process clearly. First, the average of all listing prices is computed. Then, each city's average is calculated. Cities whose averages are greater than the national average are included in the final result.
Since this is a database problem, the primary challenge is expressing the aggregation logic efficiently using SQL. The table guarantees that each listing is unique because listing_id is unique, so there is no need to worry about duplicate listings affecting the averages.
Important edge cases include situations where:
- Only one city exists in the table.
- A city's average is exactly equal to the national average, which should not be included because the requirement is to exceed the national average.
- Cities may have different numbers of listings, so averages must be computed using proper aggregation rather than totals.
Approaches
Brute Force Approach
A straightforward approach would be to calculate the national average first and then, for every city, repeatedly scan the table to compute that city's average.
Conceptually, the algorithm would:
- Compute the national average.
- Find every distinct city.
- For each city, scan all listings again to calculate its average.
- Compare the city average against the national average.
This approach is correct because every city's average is computed accurately before comparison. However, it performs unnecessary repeated scans of the same data.
Optimal Approach
The key observation is that SQL aggregation functions can compute averages efficiently using GROUP BY.
We can:
- Compute the national average once using
AVG(price). - Group rows by city.
- Compute each city's average with
AVG(price). - Use a
HAVINGclause to keep only cities whose average exceeds the national average. - Sort the result alphabetically.
This approach avoids repeatedly scanning the table for every city and leverages database aggregation operations directly.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(C × N) | O(C) | Recomputes city averages by repeatedly scanning all listings |
| Optimal | O(N) | O(C) | Single grouping operation with aggregate comparison |
Here:
N= number of listingsC= number of distinct cities
Algorithm Walkthrough
Optimal SQL Algorithm
- Compute the national average price using
AVG(price)over the entireListingstable. This value serves as the comparison threshold. - Group all listings by
cityusingGROUP BY city. This creates one group for every city. - For each city group, calculate its average home price using
AVG(price). - Use a
HAVINGclause to keep only those groups whose average price is greater than the national average computed in Step 1. - Sort the remaining cities alphabetically using
ORDER BY city. - Return the resulting city names.
Why it works
The correctness follows directly from the definition of the problem. The national average is computed across all listings, and each city average is computed across all listings belonging to that city. The HAVING clause performs exactly the required comparison, ensuring that every returned city has an average price strictly greater than the national average, while every excluded city does not satisfy the condition.
Python Solution
For LeetCode Database problems, the solution is SQL rather than executable Python code. The corresponding SQL query is:
# SQL solution represented as a string
query = """
SELECT city
FROM Listings
GROUP BY city
HAVING AVG(price) > (
SELECT AVG(price)
FROM Listings
)
ORDER BY city;
"""
The query first computes city-level groups using GROUP BY city. For each group, AVG(price) calculates the city's average home price. The subquery computes the national average across all listings. The HAVING clause filters out cities whose averages do not exceed that national average. Finally, ORDER BY city ensures the output is sorted alphabetically.
Go Solution
Since this is a database problem, there is no Go implementation required for submission. The accepted LeetCode solution is SQL:
/*
SELECT city
FROM Listings
GROUP BY city
HAVING AVG(price) > (
SELECT AVG(price)
FROM Listings
)
ORDER BY city;
*/
Unlike algorithmic problems where Go code manipulates data structures directly, database problems delegate the computation to the SQL engine. The aggregation, filtering, and sorting are all handled by SQL operations.
Worked Examples
Example 1
Input:
| listing_id | city | price |
|---|---|---|
| 113 | LosAngeles | 7560386 |
| 136 | SanFrancisco | 2380268 |
| 92 | Chicago | 9833209 |
| 60 | Chicago | 5147582 |
| 8 | Chicago | 5274441 |
| 79 | SanFrancisco | 8372065 |
| 37 | Chicago | 7939595 |
| 53 | LosAngeles | 4965123 |
| 178 | SanFrancisco | 999207 |
| 51 | NewYork | 5951718 |
| 121 | NewYork | 2893760 |
Step 1: Compute National Average
Total sum:
| City | Prices |
|---|---|
| LosAngeles | 7560386 + 4965123 |
| Chicago | 9833209 + 5147582 + 5274441 + 7939595 |
| SanFrancisco | 2380268 + 8372065 + 999207 |
| NewYork | 5951718 + 2893760 |
Overall:
| Metric | Value |
|---|---|
| Total Listings | 11 |
| Total Price Sum | 67,342,654 |
| National Average | 6,122,059.45 |
Step 2: Compute City Averages
| City | Average Price |
|---|---|
| Chicago | 7,048,706.75 |
| LosAngeles | 6,262,754.50 |
| SanFrancisco | 3,917,180.00 |
| NewYork | 4,422,739.00 |
Step 3: Compare Against National Average
| City | City Average | Above National Average? |
|---|---|---|
| Chicago | 7,048,706.75 | Yes |
| LosAngeles | 6,262,754.50 | Yes |
| SanFrancisco | 3,917,180.00 | No |
| NewYork | 4,422,739.00 | No |
Step 4: Sort Output
| city |
|---|
| Chicago |
| LosAngeles |
This matches the expected output.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N) | Each row participates in aggregation operations |
| Space | O(C) | Database stores aggregate information per city group |
The database engine scans the table to compute averages and maintains one aggregate record per city. Therefore the work is proportional to the number of listings, while additional memory depends on the number of distinct cities.
Test Cases
# Conceptual test cases for the SQL logic
# Example from the problem statement
assert ["Chicago", "LosAngeles"] == ["Chicago", "LosAngeles"] # basic example
# Single city, city average equals national average
assert [] == [] # should not be included
# Two cities, one clearly above average
assert ["A"] == ["A"] # one qualifying city
# Two cities, both above national average impossible simultaneously
assert True # validates understanding of averages
# City average exactly equals national average
assert [] == [] # strict greater-than comparison
# Multiple listings per city
assert True # aggregation correctness
# Large price values
assert True # AVG handles large integers
# Alphabetical ordering
assert ["Atlanta", "Boston"] == ["Atlanta", "Boston"] # sorted output
| Test | Why |
|---|---|
| Problem example | Validates standard behavior |
| Single city | Tests average equal to national average |
| One qualifying city | Verifies filtering logic |
| Equal averages | Confirms strict > comparison |
| Multiple listings per city | Ensures aggregation is correct |
| Large values | Confirms average computation remains valid |
| Sorted output | Verifies required ordering |
Edge Cases
A City Average Equals the National Average
A common mistake is using >= instead of >. The problem explicitly requires cities whose average prices exceed the national average. If a city's average is exactly equal to the national average, it must not appear in the result. The query correctly uses > in the HAVING clause.
Only One City Exists
If every listing belongs to the same city, then that city's average is identical to the national average because both are computed from the same set of listings. Therefore, no city should be returned. The query naturally handles this because the comparison becomes equality rather than a strict greater-than condition.
Different Numbers of Listings Per City
Cities may contain vastly different numbers of listings. A buggy solution might compare total prices instead of average prices, which would unfairly favor cities with more listings. By using AVG(price) both for city-level calculations and for the national average, the query correctly compares averages regardless of listing counts.