LeetCode 811 - Subdomain Visit Count
In this problem, we are given a list of strings where each string represents a visit count paired with a domain name. Each input entry has the form: For example: means the domain discuss.leetcode.com was visited 9001 times.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting
Solution
Problem Understanding
In this problem, we are given a list of strings where each string represents a visit count paired with a domain name. Each input entry has the form:
"count domain"
For example:
"9001 discuss.leetcode.com"
means the domain discuss.leetcode.com was visited 9001 times.
The important detail is that visiting a domain also counts as visiting all of its parent subdomains. For the domain:
discuss.leetcode.com
the following domains are considered visited:
discuss.leetcode.com
leetcode.com
com
Therefore, the visit count must be added to every subdomain generated from the original domain.
The task is to compute the total number of visits for every possible subdomain and return them in the same "count domain" format. The output order does not matter.
The input size is relatively small:
- At most 100 domain entries
- Each string length is at most 100 characters
These constraints tell us that performance requirements are modest. Even moderately inefficient string operations will still pass comfortably. However, we still want a clean and scalable solution.
A key observation is that each domain contains at most three parts because the constraints specify formats like:
d1.d2
d1.d2.d3
This means each input generates only a small number of subdomains.
There are several important edge cases to consider:
- Multiple domains may contribute to the same subdomain, such as
google.mail.comandintel.mail.comboth contributing tomail.com - A domain may already be a top level domain like
com - Counts can accumulate across many entries
- Output order is arbitrary, so we should not rely on sorting
The problem guarantees valid formatting, so we do not need to handle malformed input.
Approaches
Brute Force Approach
A brute force solution would explicitly generate every possible subdomain for every input domain and then repeatedly scan the entire result set to merge counts.
For example, given:
"900 google.mail.com"
we generate:
google.mail.com
mail.com
com
Then, for each generated subdomain, we search through previously stored results to determine whether it already exists. If it does, we update its count. Otherwise, we insert a new entry.
This approach is correct because every valid subdomain is generated and every count is accumulated properly. However, repeated linear searches make it inefficient.
If there are N input domains and K total generated subdomains, repeatedly searching through existing results can lead to quadratic behavior.
Optimal Approach
The key insight is that this problem is fundamentally a counting problem.
Whenever we need to accumulate frequencies by key, a hash map is the natural data structure. Instead of repeatedly scanning existing results, we can directly update counts in constant average time.
For each input string:
- Split it into the count and domain.
- Generate every suffix subdomain.
- Add the count to that subdomain in a hash map.
For example:
google.mail.com
produces:
google.mail.com
mail.com
com
Each of these becomes a key in the hash map.
After processing all entries, we convert the hash map back into the required output format.
Because each domain has at most three components, the amount of work per input is very small.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(N²) | O(K) | Repeated searches through accumulated results |
| Optimal | O(N × L) | O(K) | Uses a hash map for direct frequency accumulation |
Here:
Nis the number of input domainsLis the number of domain segmentsKis the number of unique subdomains
Since L is very small in this problem, the optimal solution is effectively linear.
Algorithm Walkthrough
- Create a hash map to store visit counts for each subdomain.
The key will be the subdomain string, and the value will be the accumulated visit count.
2. Iterate through every entry in cpdomains.
Each string contains a count and a domain separated by a space. 3. Split the string into two parts.
For example:
"900 google.mail.com"
becomes:
count = 900
domain = "google.mail.com"
- Split the domain by
".".
Example:
["google", "mail", "com"]
- Generate every valid subdomain.
Starting from each position, join the remaining components:
"google.mail.com""mail.com""com"
- Add the visit count to each generated subdomain in the hash map.
This automatically merges counts from different domains that share the same parent domain. 7. After processing all inputs, iterate through the hash map.
Convert each key-value pair into the required format:
"count domain"
- Return the resulting list.
Why it works
The algorithm works because every visit to a domain necessarily implies a visit to all suffix subdomains. By explicitly generating all suffixes and adding the visit count to each one, we ensure every implied visit is counted exactly once.
The hash map guarantees that counts from multiple domains accumulate correctly for shared subdomains.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def subdomainVisits(self, cpdomains: List[str]) -> List[str]:
visit_counts = defaultdict(int)
for entry in cpdomains:
count_str, domain = entry.split()
count = int(count_str)
parts = domain.split(".")
for i in range(len(parts)):
subdomain = ".".join(parts[i:])
visit_counts[subdomain] += count
result = []
for domain, count in visit_counts.items():
result.append(f"{count} {domain}")
return result
The implementation begins by creating a defaultdict(int) so that missing keys automatically start at zero. This simplifies the counting logic because we can directly increment counts without checking whether the key already exists.
For each input string, we split it into the numeric count and the domain name. The count is converted into an integer for arithmetic operations.
Next, we split the domain into its components using ".". By iterating over every starting index, we generate all suffix subdomains using ".".join(parts[i:]).
Each generated subdomain receives the current visit count in the hash map.
Finally, we iterate through the hash map and construct the required output strings in "count domain" format.
Go Solution
package main
import (
"fmt"
"strconv"
"strings"
)
func subdomainVisits(cpdomains []string) []string {
visitCounts := make(map[string]int)
for _, entry := range cpdomains {
parts := strings.Split(entry, " ")
count, _ := strconv.Atoi(parts[0])
domain := parts[1]
domainParts := strings.Split(domain, ".")
for i := 0; i < len(domainParts); i++ {
subdomain := strings.Join(domainParts[i:], ".")
visitCounts[subdomain] += count
}
}
result := []string{}
for domain, count := range visitCounts {
result = append(result, fmt.Sprintf("%d %s", count, domain))
}
return result
}
The Go implementation follows the same logic as the Python version.
A map[string]int is used instead of a Python dictionary. String splitting is handled with strings.Split, and integer conversion uses strconv.Atoi.
Unlike Python's defaultdict, Go maps return the zero value for missing keys, so incrementing nonexistent entries works naturally.
The output slice is dynamically built using append.
Worked Examples
Example 1
Input:
["9001 discuss.leetcode.com"]
Step 1
Process:
"9001 discuss.leetcode.com"
Extract:
| Variable | Value |
|---|---|
| count | 9001 |
| domain | discuss.leetcode.com |
Step 2
Split domain:
| Index | Part |
|---|---|
| 0 | discuss |
| 1 | leetcode |
| 2 | com |
Step 3
Generate subdomains and update counts.
| Generated Subdomain | Added Count | Hash Map State |
|---|---|---|
| discuss.leetcode.com | 9001 | {"discuss.leetcode.com": 9001} |
| leetcode.com | 9001 | {"discuss.leetcode.com": 9001, "leetcode.com": 9001} |
| com | 9001 | {"discuss.leetcode.com": 9001, "leetcode.com": 9001, "com": 9001} |
Final Output
[
"9001 discuss.leetcode.com",
"9001 leetcode.com",
"9001 com"
]
Example 2
Input:
["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
Processing "900 google.mail.com"
| Subdomain | Updated Count |
|---|---|
| google.mail.com | 900 |
| mail.com | 900 |
| com | 900 |
Processing "50 yahoo.com"
| Subdomain | Updated Count |
|---|---|
| yahoo.com | 50 |
| com | 950 |
Processing "1 intel.mail.com"
| Subdomain | Updated Count |
|---|---|
| intel.mail.com | 1 |
| mail.com | 901 |
| com | 951 |
Processing "5 wiki.org"
| Subdomain | Updated Count |
|---|---|
| wiki.org | 5 |
| org | 5 |
Final Hash Map
| Domain | Count |
|---|---|
| google.mail.com | 900 |
| mail.com | 901 |
| com | 951 |
| yahoo.com | 50 |
| intel.mail.com | 1 |
| wiki.org | 5 |
| org | 5 |
Final Output
[
"901 mail.com",
"50 yahoo.com",
"900 google.mail.com",
"5 wiki.org",
"5 org",
"1 intel.mail.com",
"951 com"
]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(N × L) | Each domain generates at most L subdomains |
| Space | O(K) | Hash map stores all unique subdomains |
Here:
Nis the number of input domainsLis the number of domain segmentsKis the number of unique subdomains
Because each domain contains at most three segments, L is extremely small. This makes the solution effectively linear in practice.
Test Cases
from collections import Counter
solution = Solution()
# Example 1
result = solution.subdomainVisits(["9001 discuss.leetcode.com"])
assert Counter(result) == Counter([
"9001 discuss.leetcode.com",
"9001 leetcode.com",
"9001 com"
])
# Example 2
result = solution.subdomainVisits([
"900 google.mail.com",
"50 yahoo.com",
"1 intel.mail.com",
"5 wiki.org"
])
assert Counter(result) == Counter([
"901 mail.com",
"50 yahoo.com",
"900 google.mail.com",
"5 wiki.org",
"5 org",
"1 intel.mail.com",
"951 com"
])
# Single top-level domain
result = solution.subdomainVisits(["10 com"])
assert Counter(result) == Counter([
"10 com"
]) # Tests already minimal domain
# Multiple entries contributing to same subdomain
result = solution.subdomainVisits([
"5 a.b.com",
"10 b.com"
])
assert Counter(result) == Counter([
"5 a.b.com",
"15 b.com",
"15 com"
]) # Tests count aggregation
# Duplicate exact domains
result = solution.subdomainVisits([
"3 x.y.com",
"7 x.y.com"
])
assert Counter(result) == Counter([
"10 x.y.com",
"10 y.com",
"10 com"
]) # Tests repeated identical domains
# Different top-level domains
result = solution.subdomainVisits([
"2 test.org",
"3 test.com"
])
assert Counter(result) == Counter([
"2 test.org",
"2 org",
"3 test.com",
"3 com"
]) # Tests separation of independent domains
# Minimum count value
result = solution.subdomainVisits(["1 a.com"])
assert Counter(result) == Counter([
"1 a.com",
"1 com"
]) # Tests smallest valid count
| Test | Why |
|---|---|
| Single example domain | Verifies basic subdomain generation |
| Multiple mixed domains | Verifies accumulation across shared parents |
| Top-level domain only | Ensures minimal domains work correctly |
| Shared parent domains | Confirms counts merge properly |
| Duplicate exact domains | Validates repeated inputs accumulate |
| Different TLDs | Ensures independent domains stay separate |
| Minimum count | Tests lower constraint boundary |
Edge Cases
Multiple domains sharing the same parent
One common source of bugs is failing to correctly accumulate counts from different domains that share a parent subdomain.
For example:
"900 google.mail.com"
"1 intel.mail.com"
Both contribute to mail.com and com.
A naive implementation might overwrite counts instead of adding them. Using a hash map with incremental updates ensures counts are accumulated correctly.
Domains with only one segment
A domain like:
"10 com"
already represents the top level domain.
Some implementations incorrectly assume every domain contains dots and may attempt invalid slicing operations. This implementation handles the case naturally because splitting "com" produces a single-element list, and the loop generates exactly one subdomain.
Duplicate exact domain entries
The same domain may appear multiple times in the input:
"3 x.y.com"
"7 x.y.com"
The algorithm must combine both occurrences into a single total.
Because every generated subdomain is stored in a hash map and updated with +=, repeated domains accumulate correctly without any additional handling.