LeetCode 1693 - Daily Leads and Partners

The problem provides a database table named DailySales, where each row represents a sale event involving a product manuf

LeetCode Problem 1693

Difficulty: 🟢 Easy
Topics: Database

Solution

Problem Understanding

The problem provides a database table named DailySales, where each row represents a sale event involving a product manufacturer (make_name), a lead (lead_id), and a partner (partner_id) on a specific date (date_id).

The goal is to compute, for every unique combination of date_id and make_name, two values:

  1. The number of distinct lead IDs associated with that date and manufacturer.
  2. The number of distinct partner IDs associated with that date and manufacturer.

The important detail is that the table does not have a primary key, meaning duplicate rows may exist. This means we cannot simply count rows or assume uniqueness. Instead, we must count unique values only, using distinct semantics.

In other words, we are grouping the data by (date_id, make_name) and then calculating:

  • unique_leads = number of distinct lead_id values in that group
  • unique_partners = number of distinct partner_id values in that group

The output may be returned in any order, so sorting is not required unless explicitly requested.

For example, suppose we have:

date_id make_name lead_id partner_id
2020-12-8 toyota 0 1
2020-12-8 toyota 1 0
2020-12-8 toyota 1 2

For (2020-12-8, toyota):

  • Distinct lead_id values are {0, 1}, count = 2
  • Distinct partner_id values are {0, 1, 2}, count = 3

So the result row becomes:

date_id make_name unique_leads unique_partners
2020-12-8 toyota 2 3

Since the problem belongs to the Database category on LeetCode, the expected solution is written in SQL, not Python or Go.

The constraints are relatively small for an SQL aggregation problem, and the structure of the task strongly suggests the use of GROUP BY combined with COUNT(DISTINCT ...).

An important edge case is the presence of duplicate rows. If the same (lead_id, partner_id) pair appears multiple times for the same manufacturer and date, we must still count each ID only once. Another edge case is when multiple rows contain the same lead but different partners, or vice versa. We must count uniqueness independently for leads and partners.

Approaches

Brute Force Approach

A brute-force solution would conceptually process every unique (date_id, make_name) pair separately. For each group, we could scan the entire table and manually maintain collections of seen lead_id and partner_id values.

This works because every row belonging to the same (date_id, make_name) combination contributes possible lead and partner IDs. Using a set structure guarantees uniqueness.

However, this approach is inefficient because for every unique group we repeatedly scan the whole table. If there are g groups and n rows, the worst-case complexity becomes O(g × n), which is unnecessary.

Optimal Approach

The key observation is that SQL databases are designed for grouped aggregation. Since the problem explicitly asks for counts per (date_id, make_name) pair, GROUP BY naturally fits.

Within each group, SQL provides COUNT(DISTINCT column) to count unique values efficiently. This avoids duplicate counting automatically and eliminates the need for manual tracking.

By grouping once and computing both distinct counts in the same query, we solve the problem efficiently and cleanly.

Approach Time Complexity Space Complexity Notes
Brute Force O(g × n) O(n) Repeatedly scans the table for each group
Optimal O(n log n) O(n) Uses GROUP BY with COUNT(DISTINCT)

Algorithm Walkthrough

  1. Group the rows by date_id and make_name.

This step ensures that all sales belonging to the same manufacturer on the same day are processed together. Since the result requires one row per (date_id, make_name) pair, this grouping is the foundation of the solution. 2. Count distinct lead_id values inside each group.

We use COUNT(DISTINCT lead_id) so repeated occurrences of the same lead are counted only once. This correctly handles duplicate rows and repeated leads. 3. Count distinct partner_id values inside each group.

Similarly, COUNT(DISTINCT partner_id) ensures each partner contributes at most one count per group. 4. Return the grouped results.

Each group becomes one output row containing:

  • date_id
  • make_name
  • unique_leads
  • unique_partners

Why it works

The correctness comes from the fact that grouping isolates rows belonging to the same (date_id, make_name) pair. Within each group, COUNT(DISTINCT ...) guarantees that repeated IDs are counted exactly once. Since the problem definition explicitly asks for distinct counts, this matches the required behavior precisely.

SQL Solution

SELECT
    date_id,
    make_name,
    COUNT(DISTINCT lead_id) AS unique_leads,
    COUNT(DISTINCT partner_id) AS unique_partners
FROM DailySales
GROUP BY date_id, make_name;

The query first groups rows by date_id and make_name. After grouping, COUNT(DISTINCT lead_id) calculates the number of unique leads, while COUNT(DISTINCT partner_id) calculates the number of unique partners. Because duplicates are ignored by DISTINCT, the solution remains correct even when the table contains repeated rows.

Worked Examples

Example 1

Input:

date_id make_name lead_id partner_id
2020-12-8 toyota 0 1
2020-12-8 toyota 1 0
2020-12-8 toyota 1 2
2020-12-7 toyota 0 2
2020-12-7 toyota 0 1
2020-12-8 honda 1 2
2020-12-8 honda 2 1
2020-12-7 honda 0 1
2020-12-7 honda 1 2
2020-12-7 honda 2 1

After grouping:

Group Lead Set Partner Set
(2020-12-8, toyota) {0, 1} {0, 1, 2}
(2020-12-7, toyota) {0} {1, 2}
(2020-12-8, honda) {1, 2} {1, 2}
(2020-12-7, honda) {0, 1, 2} {1, 2}

Final output:

date_id make_name unique_leads unique_partners
2020-12-8 toyota 2 3
2020-12-7 toyota 1 2
2020-12-8 honda 2 2
2020-12-7 honda 3 2

Complexity Analysis

Measure Complexity Explanation
Time O(n log n) Grouping and distinct aggregation require processing all rows
Space O(n) Database engine stores grouping and distinct information

The query processes every row in the table once while grouping records by (date_id, make_name). The exact implementation details depend on the database engine, but in practice this is efficient and scales well for the problem size.

Test Cases

-- Basic example from the problem statement
INSERT INTO DailySales VALUES
('2020-12-8', 'toyota', 0, 1),
('2020-12-8', 'toyota', 1, 0),
('2020-12-8', 'toyota', 1, 2);

-- Duplicate rows should not affect counts
INSERT INTO DailySales VALUES
('2020-12-8', 'toyota', 1, 2),
('2020-12-8', 'toyota', 1, 2);

-- Single row case
INSERT INTO DailySales VALUES
('2020-12-9', 'honda', 5, 10);

-- Same lead, different partners
INSERT INTO DailySales VALUES
('2020-12-10', 'ford', 1, 2),
('2020-12-10', 'ford', 1, 3);

-- Same partner, different leads
INSERT INTO DailySales VALUES
('2020-12-11', 'bmw', 1, 5),
('2020-12-11', 'bmw', 2, 5);
Test Why
Problem example Validates standard functionality
Duplicate rows Ensures DISTINCT prevents overcounting
Single row Confirms minimal input works
Same lead, different partners Verifies independent partner counting
Same partner, different leads Verifies independent lead counting

Edge Cases

One important edge case is duplicate rows. Since the table has no primary key, identical rows may appear multiple times. A naive COUNT(*) approach would overcount results. Using COUNT(DISTINCT ...) ensures duplicates do not affect the final counts.

Another important edge case is when the same lead appears with multiple partners. For example, one lead may interact with several partners on the same day. We should count the lead only once, while still counting all distinct partners. The solution handles this because lead and partner uniqueness are computed independently.

A third edge case occurs when the same partner appears with multiple leads. In this situation, partner counts must remain unique while lead counts increase appropriately. Again, separate COUNT(DISTINCT ...) operations ensure both metrics remain accurate.

Finally, a group may contain only one row. In that case, both unique counts should be 1, assuming the IDs are valid. Since aggregation still works for single-row groups, the implementation handles this case naturally.