LeetCode 1421 - NPV Queries
The problem presents two database tables: NPV and Queries. The NPV table contains historical net present value (NPV) dat
Difficulty: 🟢 Easy
Topics: Database
Solution
Problem Understanding
The problem presents two database tables: NPV and Queries. The NPV table contains historical net present value (NPV) data for different inventory items identified by an id and a year. The Queries table contains requests for the NPV of specific inventory items at specific years. Each row in Queries represents a single query.
The task is to return a result table where each query from Queries is mapped to its corresponding npv value from the NPV table. If a queried (id, year) pair does not exist in the NPV table, the npv should default to 0. There is no restriction on the order of the output rows.
Key points include the primary key constraints: (id, year) is unique in both tables. This ensures that for any (id, year) pair, there is at most one matching NPV record. Edge cases involve queries for (id, year) pairs missing in the NPV table, which should return zero.
Approaches
A naive or brute-force approach would involve iterating over every query in the Queries table and performing a search through the entire NPV table to find a matching (id, year) pair. While this guarantees correctness, it is inefficient because the time complexity is O(Q * N), where Q is the number of queries and N is the number of rows in the NPV table. For large tables, this becomes computationally expensive.
The optimal approach leverages a hash-based join. By creating a dictionary (hash map) from the NPV table where the key is a tuple (id, year) and the value is the npv, we can answer each query in constant time O(1). This reduces the overall time complexity to O(N + Q), which is far more efficient.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(Q * N) | O(1) | Iterate over every query and search NPV for a match |
| Optimal | O(N + Q) | O(N) | Use a hash map to store NPV values and lookup queries efficiently |
Algorithm Walkthrough
- Initialize an empty dictionary
npv_mapthat will map(id, year)tuples to their correspondingnpvvalues from theNPVtable. This allows constant-time lookups. - Iterate over each row in the
NPVtable. For each row, insert an entry intonpv_mapwith the key(row.id, row.year)and the valuerow.npv. - Initialize an empty list
resultto store the final output. - Iterate over each row in the
Queriestable. For each query(id, year), attempt to fetch the correspondingnpvfromnpv_map. If the key does not exist, default to0. - Append a dictionary representing the query and its NPV to the
resultlist. - Return the
resultlist.
Why it works: The dictionary guarantees that every (id, year) pair from NPV is mapped correctly. Lookup for each query is O(1). If a query is missing in NPV, using a default of 0 ensures correctness. Since every query is handled exactly once, the result is correct for all queries.
Python Solution
import pandas as pd
def npv_queries(NPV: pd.DataFrame, Queries: pd.DataFrame) -> pd.DataFrame:
# Step 1: Build the mapping from (id, year) -> npv
npv_map = {(row.id, row.year): row.npv for row in NPV.itertuples(index=False)}
# Step 2: Prepare the result list
result = []
for row in Queries.itertuples(index=False):
npv_value = npv_map.get((row.id, row.year), 0)
result.append({'id': row.id, 'year': row.year, 'npv': npv_value})
# Step 3: Convert to DataFrame
return pd.DataFrame(result)
Explanation: The dictionary comprehension constructs the hash map for fast lookups. Iterating through Queries allows us to fetch each NPV efficiently, defaulting to zero for missing entries. Finally, the result list is converted back to a DataFrame for submission.
Go Solution
package main
type NPVRecord struct {
Id int
Year int
Npv int
}
type Query struct {
Id int
Year int
}
type Result struct {
Id int
Year int
Npv int
}
func NPVQueries(npvTable []NPVRecord, queries []Query) []Result {
npvMap := make(map[[2]int]int)
// Build the map
for _, row := range npvTable {
key := [2]int{row.Id, row.Year}
npvMap[key] = row.Npv
}
// Prepare the result
results := make([]Result, 0, len(queries))
for _, q := range queries {
key := [2]int{q.Id, q.Year}
npvValue, exists := npvMap[key]
if !exists {
npvValue = 0
}
results = append(results, Result{Id: q.Id, Year: q.Year, Npv: npvValue})
}
return results
}
Explanation: Go uses a map with [2]int as the key to store (id, year) combinations. Lookup for each query uses exists to handle missing keys, defaulting to zero. The result is appended to a slice of Result.
Worked Examples
For the example in the problem statement:
- Build
npv_mapfromNPV:
| Key | Value |
|---|---|
| (1,2018) | 100 |
| (7,2020) | 30 |
| (13,2019) | 40 |
| (1,2019) | 113 |
| (2,2008) | 121 |
| (3,2009) | 12 |
| (11,2020) | 99 |
| (7,2019) | 0 |
- Iterate through
Queries:
| Query | npv lookup | Result |
|---|---|---|
| (1,2019) | 113 | 113 |
| (2,2008) | 121 | 121 |
| (3,2009) | 12 | 12 |
| (7,2018) | missing | 0 |
| (7,2019) | 0 | 0 |
| (7,2020) | 30 | 30 |
| (13,2019) | 40 | 40 |
Final output matches the problem example.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N + Q) | Building the map is O(N), and querying all Q entries is O(Q) |
| Space | O(N) | The map stores all NPV rows |
The complexity is efficient because lookups are constant-time and we iterate through each table only once.
Test Cases
import pandas as pd
# Example test case
NPV = pd.DataFrame({
'id': [1,7,13,1,2,3,11,7],
'year': [2018,2020,2019,2019,2008,2009,2020,2019],
'npv': [100,30,40,113,121,12,99,0]
})
Queries = pd.DataFrame({
'id': [1,2,3,7,7,7,13],
'year': [2019,2008,2009,2018,2019,2020,2019]
})
result = npv_queries(NPV, Queries)
assert result[result['id']==1]['npv'].iloc[0] == 113
assert result[result['id']==7][result['year']==2018]['npv'].iloc[0] == 0 # missing npv
# Edge case: empty NPV table
NPV_empty = pd.DataFrame(columns=['id','year','npv'])
Queries_small = pd.DataFrame({'id':[1],'year':[2000]})
assert npv_queries(NPV_empty, Queries_small)['npv'].iloc[0] == 0
# Edge case: empty Queries table
NPV_full = pd.DataFrame({'id':[1],'year':[2000],'npv':[50]})
Queries_empty = pd.DataFrame(columns=['id','year'])
assert npv_queries(NPV_full, Queries_empty).empty
| Test | Why |
|---|---|
| Provided example | Validates normal mapping and default zero handling |
| Empty NPV table | Ensures default zero for all queries |
| Empty Queries table | Confirms handling of no queries gracefully |
Edge Cases
1. Query not in NPV: When a query references an (id, year) not present in the NPV table, a naive solution may fail or return null. By using dict.get(key, 0), the implementation correctly returns zero.
2. Empty tables: If either `NP