LeetCode 3839 - Number of Prefix Connected Groups
The problem gives us an array of strings words and an integer k. Two words are considered prefix-connected if they share exactly the same first k characters and they come from different indices in the array.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting
Solution
LeetCode 3839: Number of Prefix Connected Groups
Problem Understanding
The problem gives us an array of strings words and an integer k. Two words are considered prefix-connected if they share exactly the same first k characters and they come from different indices in the array.
A connected group is defined as a set of words where every pair of words is prefix-connected. Since prefix-connectedness is determined solely by the first k characters, this means that every word in a group must have the same length-k prefix.
Our task is to count how many such groups contain at least two words.
An important detail is that words whose length is less than k cannot participate in any group because they do not even have a prefix of length k. These words must be ignored completely.
Duplicate strings are treated as separate words because the problem considers array positions, not unique string values. For example, two occurrences of "dog" at different indices count as two distinct words and can form a valid group.
The constraints are:
1 <= words.length <= 50001 <= words[i].length <= 1001 <= k <= 100
These constraints indicate that we can comfortably process every word once. The number of words is relatively small, and extracting prefixes of length at most 100 is inexpensive.
Several edge cases are worth noting:
- All words may be shorter than
k, resulting in no groups. - A prefix may appear only once, which does not count as a valid group.
- Multiple identical words must still be counted separately.
- Every word may belong to the same prefix group.
kmay equal the full length of some words.
Approaches
Brute Force
A straightforward approach is to compare every pair of words.
For each pair (i, j), we can check whether both words have length at least k and whether their first k characters match. This creates a graph where words are vertices and matching prefixes create edges.
After building the graph, we could find connected components and count those whose size is at least two.
This approach is correct because it explicitly checks the definition of prefix-connectedness. However, it is inefficient because there are O(n²) pairs of words. With up to 5000 words, this could require roughly 25 million comparisons.
Key Insight
The definition of a group is much simpler than it initially appears.
If two words belong to the same group, they must share the same first k characters. Therefore, every valid group corresponds exactly to a unique length-k prefix.
Instead of building a graph, we can simply:
- Ignore words shorter than
k. - Extract the first
kcharacters from every remaining word. - Count how many times each prefix appears.
- Count how many prefixes occur at least twice.
Each prefix with frequency at least two represents exactly one connected group.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n² · k) | O(n²) | Compare every pair and build connectivity information |
| Optimal | O(n · k) | O(n) | Count occurrences of each length-k prefix |
Algorithm Walkthrough
- Create a hash map called
prefix_count.
The key will be a length-k prefix and the value will be the number of words having that prefix.
2. Iterate through every word in words.
For each word, first check whether its length is at least k. If not, skip it because it cannot belong to any group.
3. Extract the prefix.
Compute word[:k] and use it as the key in the hash map.
4. Update the frequency.
Increment the count associated with that prefix. 5. Count valid groups.
After processing all words, iterate through the frequencies stored in the hash map.
6. For every prefix whose count is at least 2, add one to the answer.
Such a prefix represents one connected group containing at least two words. 7. Return the final count.
Why it works
The crucial observation is that prefix-connectedness is completely determined by the first k characters. All words sharing the same length-k prefix form exactly one maximal connected group, while words with different prefixes can never belong to the same group. Therefore, counting prefixes that appear at least twice is equivalent to counting connected groups of size at least two.
Python Solution
from typing import List
from collections import defaultdict
class Solution:
def prefixConnected(self, words: List[str], k: int) -> int:
prefix_count = defaultdict(int)
for word in words:
if len(word) < k:
continue
prefix = word[:k]
prefix_count[prefix] += 1
groups = 0
for count in prefix_count.values():
if count >= 2:
groups += 1
return groups
The implementation follows the algorithm directly. A hash map stores the frequency of each valid length-k prefix. Words shorter than k are skipped immediately because they cannot contribute to any group. After all frequencies have been collected, we count how many prefixes appear at least twice. Each such prefix corresponds to one connected group, so the resulting count is returned.
Go Solution
func prefixConnected(words []string, k int) int {
prefixCount := make(map[string]int)
for _, word := range words {
if len(word) < k {
continue
}
prefix := word[:k]
prefixCount[prefix]++
}
groups := 0
for _, count := range prefixCount {
if count >= 2 {
groups++
}
}
return groups
}
The Go implementation mirrors the Python solution. A map[string]int stores prefix frequencies. String slicing with word[:k] extracts the first k characters. After all counts are collected, the map is scanned to count prefixes whose frequency is at least two.
No special handling for integer overflow is required because the maximum count is at most 5000, which easily fits in Go's int type.
Worked Examples
Example 1
Input
words = ["apple","apply","banana","bandit"]
k = 2
Building the Frequency Map
| Word | Length >= k? | Prefix | Map State |
|---|---|---|---|
| apple | Yes | ap | {"ap": 1} |
| apply | Yes | ap | {"ap": 2} |
| banana | Yes | ba | {"ap": 2, "ba": 1} |
| bandit | Yes | ba | {"ap": 2, "ba": 2} |
Counting Groups
| Prefix | Count | Valid Group? |
|---|---|---|
| ap | 2 | Yes |
| ba | 2 | Yes |
Answer = 2.
Example 2
Input
words = ["car","cat","cartoon"]
k = 3
Building the Frequency Map
| Word | Prefix | Map State |
|---|---|---|
| car | car | {"car": 1} |
| cat | cat | {"car": 1, "cat": 1} |
| cartoon | car | {"car": 2, "cat": 1} |
Counting Groups
| Prefix | Count | Valid Group? |
|---|---|---|
| car | 2 | Yes |
| cat | 1 | No |
Answer = 1.
Example 3
Input
words = ["bat","dog","dog","doggy","bat"]
k = 3
Building the Frequency Map
| Word | Prefix | Map State |
|---|---|---|
| bat | bat | {"bat": 1} |
| dog | dog | {"bat": 1, "dog": 1} |
| dog | dog | {"bat": 1, "dog": 2} |
| doggy | dog | {"bat": 1, "dog": 3} |
| bat | bat | {"bat": 2, "dog": 3} |
Counting Groups
| Prefix | Count | Valid Group? |
|---|---|---|
| bat | 2 | Yes |
| dog | 3 | Yes |
Answer = 2.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n · k) | Each word contributes one prefix extraction of length at most k |
| Space | O(n) | In the worst case every valid word has a distinct prefix |
The algorithm processes each word exactly once. Extracting a prefix of length k requires O(k) work, so the total running time is O(n · k). The hash map may contain up to one entry per valid word, leading to O(n) auxiliary space.
Test Cases
from typing import List
s = Solution()
assert s.prefixConnected(
["apple", "apply", "banana", "bandit"], 2
) == 2 # two separate prefix groups
assert s.prefixConnected(
["car", "cat", "cartoon"], 3
) == 1 # only "car" prefix appears twice
assert s.prefixConnected(
["bat", "dog", "dog", "doggy", "bat"], 3
) == 2 # duplicate words count separately
assert s.prefixConnected(
["a", "b", "c"], 2
) == 0 # all words shorter than k
assert s.prefixConnected(
["same", "same", "same"], 4
) == 1 # one large group
assert s.prefixConnected(
["abc", "abd", "abe"], 2
) == 1 # all share prefix "ab"
assert s.prefixConnected(
["abc", "def", "ghi"], 1
) == 0 # every prefix unique
assert s.prefixConnected(
["aa", "aa", "ab", "ab", "ac"], 2
) == 2 # two valid groups
assert s.prefixConnected(
["x"], 1
) == 0 # single word cannot form a group
assert s.prefixConnected(
["abcd", "abce", "abcf", "xyz"], 3
) == 1 # one large group and one singleton
Test Summary
| Test | Why |
|---|---|
["apple","apply","banana","bandit"], k=2 |
Verifies the first example |
["car","cat","cartoon"], k=3 |
Verifies the second example |
["bat","dog","dog","doggy","bat"], k=3 |
Verifies duplicate handling |
["a","b","c"], k=2 |
All words are too short |
["same","same","same"], k=4 |
Single large group |
["abc","abd","abe"], k=2 |
Multiple words sharing one prefix |
["abc","def","ghi"], k=1 |
No valid groups |
["aa","aa","ab","ab","ac"], k=2 |
Multiple independent groups |
["x"], k=1 |
Minimum size input |
["abcd","abce","abcf","xyz"], k=3 |
One group plus singleton |
Edge Cases
Words Shorter Than k
A common mistake is attempting to extract a length-k prefix from every word without checking its length first. Such words are explicitly excluded by the problem statement. The implementation handles this by skipping any word whose length is less than k.
Duplicate Strings
Some solutions may accidentally deduplicate strings before processing them. That would be incorrect because the problem treats different indices as different words. If "dog" appears three times, all three occurrences contribute to the same group. The frequency map naturally handles this by counting every occurrence.
Prefix Appears Only Once
A prefix that occurs exactly once does not form a connected group because a valid group must contain at least two words. The implementation checks count >= 2 when computing the final answer, ensuring singleton prefixes are ignored.
All Words Share the Same Prefix
If every valid word begins with the same length-k prefix, there is still only one connected group regardless of how many words exist. The algorithm correctly records one prefix entry with a large frequency and counts it as exactly one group.
k Equals the Entire Word Length
When k matches a word's length, the whole word becomes its prefix. Two words belong to the same group only if they are identical. The prefix extraction logic works naturally in this situation because word[:k] returns the entire string.