LeetCode 3839 - Number of Prefix Connected Groups

The problem gives us an array of strings words and an integer k. Two words are considered prefix-connected if they share exactly the same first k characters and they come from different indices in the array.

LeetCode Problem 3839

Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Counting

Solution

LeetCode 3839: Number of Prefix Connected Groups

Problem Understanding

The problem gives us an array of strings words and an integer k. Two words are considered prefix-connected if they share exactly the same first k characters and they come from different indices in the array.

A connected group is defined as a set of words where every pair of words is prefix-connected. Since prefix-connectedness is determined solely by the first k characters, this means that every word in a group must have the same length-k prefix.

Our task is to count how many such groups contain at least two words.

An important detail is that words whose length is less than k cannot participate in any group because they do not even have a prefix of length k. These words must be ignored completely.

Duplicate strings are treated as separate words because the problem considers array positions, not unique string values. For example, two occurrences of "dog" at different indices count as two distinct words and can form a valid group.

The constraints are:

  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 100
  • 1 <= k <= 100

These constraints indicate that we can comfortably process every word once. The number of words is relatively small, and extracting prefixes of length at most 100 is inexpensive.

Several edge cases are worth noting:

  • All words may be shorter than k, resulting in no groups.
  • A prefix may appear only once, which does not count as a valid group.
  • Multiple identical words must still be counted separately.
  • Every word may belong to the same prefix group.
  • k may equal the full length of some words.

Approaches

Brute Force

A straightforward approach is to compare every pair of words.

For each pair (i, j), we can check whether both words have length at least k and whether their first k characters match. This creates a graph where words are vertices and matching prefixes create edges.

After building the graph, we could find connected components and count those whose size is at least two.

This approach is correct because it explicitly checks the definition of prefix-connectedness. However, it is inefficient because there are O(n²) pairs of words. With up to 5000 words, this could require roughly 25 million comparisons.

Key Insight

The definition of a group is much simpler than it initially appears.

If two words belong to the same group, they must share the same first k characters. Therefore, every valid group corresponds exactly to a unique length-k prefix.

Instead of building a graph, we can simply:

  1. Ignore words shorter than k.
  2. Extract the first k characters from every remaining word.
  3. Count how many times each prefix appears.
  4. Count how many prefixes occur at least twice.

Each prefix with frequency at least two represents exactly one connected group.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n² · k) O(n²) Compare every pair and build connectivity information
Optimal O(n · k) O(n) Count occurrences of each length-k prefix

Algorithm Walkthrough

  1. Create a hash map called prefix_count.

The key will be a length-k prefix and the value will be the number of words having that prefix. 2. Iterate through every word in words.

For each word, first check whether its length is at least k. If not, skip it because it cannot belong to any group. 3. Extract the prefix.

Compute word[:k] and use it as the key in the hash map. 4. Update the frequency.

Increment the count associated with that prefix. 5. Count valid groups.

After processing all words, iterate through the frequencies stored in the hash map. 6. For every prefix whose count is at least 2, add one to the answer.

Such a prefix represents one connected group containing at least two words. 7. Return the final count.

Why it works

The crucial observation is that prefix-connectedness is completely determined by the first k characters. All words sharing the same length-k prefix form exactly one maximal connected group, while words with different prefixes can never belong to the same group. Therefore, counting prefixes that appear at least twice is equivalent to counting connected groups of size at least two.

Python Solution

from typing import List
from collections import defaultdict

class Solution:
    def prefixConnected(self, words: List[str], k: int) -> int:
        prefix_count = defaultdict(int)

        for word in words:
            if len(word) < k:
                continue

            prefix = word[:k]
            prefix_count[prefix] += 1

        groups = 0

        for count in prefix_count.values():
            if count >= 2:
                groups += 1

        return groups

The implementation follows the algorithm directly. A hash map stores the frequency of each valid length-k prefix. Words shorter than k are skipped immediately because they cannot contribute to any group. After all frequencies have been collected, we count how many prefixes appear at least twice. Each such prefix corresponds to one connected group, so the resulting count is returned.

Go Solution

func prefixConnected(words []string, k int) int {
	prefixCount := make(map[string]int)

	for _, word := range words {
		if len(word) < k {
			continue
		}

		prefix := word[:k]
		prefixCount[prefix]++
	}

	groups := 0

	for _, count := range prefixCount {
		if count >= 2 {
			groups++
		}
	}

	return groups
}

The Go implementation mirrors the Python solution. A map[string]int stores prefix frequencies. String slicing with word[:k] extracts the first k characters. After all counts are collected, the map is scanned to count prefixes whose frequency is at least two.

No special handling for integer overflow is required because the maximum count is at most 5000, which easily fits in Go's int type.

Worked Examples

Example 1

Input

words = ["apple","apply","banana","bandit"]
k = 2

Building the Frequency Map

Word Length >= k? Prefix Map State
apple Yes ap {"ap": 1}
apply Yes ap {"ap": 2}
banana Yes ba {"ap": 2, "ba": 1}
bandit Yes ba {"ap": 2, "ba": 2}

Counting Groups

Prefix Count Valid Group?
ap 2 Yes
ba 2 Yes

Answer = 2.

Example 2

Input

words = ["car","cat","cartoon"]
k = 3

Building the Frequency Map

Word Prefix Map State
car car {"car": 1}
cat cat {"car": 1, "cat": 1}
cartoon car {"car": 2, "cat": 1}

Counting Groups

Prefix Count Valid Group?
car 2 Yes
cat 1 No

Answer = 1.

Example 3

Input

words = ["bat","dog","dog","doggy","bat"]
k = 3

Building the Frequency Map

Word Prefix Map State
bat bat {"bat": 1}
dog dog {"bat": 1, "dog": 1}
dog dog {"bat": 1, "dog": 2}
doggy dog {"bat": 1, "dog": 3}
bat bat {"bat": 2, "dog": 3}

Counting Groups

Prefix Count Valid Group?
bat 2 Yes
dog 3 Yes

Answer = 2.

Complexity Analysis

Measure Complexity Explanation
Time O(n · k) Each word contributes one prefix extraction of length at most k
Space O(n) In the worst case every valid word has a distinct prefix

The algorithm processes each word exactly once. Extracting a prefix of length k requires O(k) work, so the total running time is O(n · k). The hash map may contain up to one entry per valid word, leading to O(n) auxiliary space.

Test Cases

from typing import List

s = Solution()

assert s.prefixConnected(
    ["apple", "apply", "banana", "bandit"], 2
) == 2  # two separate prefix groups

assert s.prefixConnected(
    ["car", "cat", "cartoon"], 3
) == 1  # only "car" prefix appears twice

assert s.prefixConnected(
    ["bat", "dog", "dog", "doggy", "bat"], 3
) == 2  # duplicate words count separately

assert s.prefixConnected(
    ["a", "b", "c"], 2
) == 0  # all words shorter than k

assert s.prefixConnected(
    ["same", "same", "same"], 4
) == 1  # one large group

assert s.prefixConnected(
    ["abc", "abd", "abe"], 2
) == 1  # all share prefix "ab"

assert s.prefixConnected(
    ["abc", "def", "ghi"], 1
) == 0  # every prefix unique

assert s.prefixConnected(
    ["aa", "aa", "ab", "ab", "ac"], 2
) == 2  # two valid groups

assert s.prefixConnected(
    ["x"], 1
) == 0  # single word cannot form a group

assert s.prefixConnected(
    ["abcd", "abce", "abcf", "xyz"], 3
) == 1  # one large group and one singleton

Test Summary

Test Why
["apple","apply","banana","bandit"], k=2 Verifies the first example
["car","cat","cartoon"], k=3 Verifies the second example
["bat","dog","dog","doggy","bat"], k=3 Verifies duplicate handling
["a","b","c"], k=2 All words are too short
["same","same","same"], k=4 Single large group
["abc","abd","abe"], k=2 Multiple words sharing one prefix
["abc","def","ghi"], k=1 No valid groups
["aa","aa","ab","ab","ac"], k=2 Multiple independent groups
["x"], k=1 Minimum size input
["abcd","abce","abcf","xyz"], k=3 One group plus singleton

Edge Cases

Words Shorter Than k

A common mistake is attempting to extract a length-k prefix from every word without checking its length first. Such words are explicitly excluded by the problem statement. The implementation handles this by skipping any word whose length is less than k.

Duplicate Strings

Some solutions may accidentally deduplicate strings before processing them. That would be incorrect because the problem treats different indices as different words. If "dog" appears three times, all three occurrences contribute to the same group. The frequency map naturally handles this by counting every occurrence.

Prefix Appears Only Once

A prefix that occurs exactly once does not form a connected group because a valid group must contain at least two words. The implementation checks count >= 2 when computing the final answer, ensuring singleton prefixes are ignored.

All Words Share the Same Prefix

If every valid word begins with the same length-k prefix, there is still only one connected group regardless of how many words exist. The algorithm correctly records one prefix entry with a large frequency and counts it as exactly one group.

k Equals the Entire Word Length

When k matches a word's length, the whole word becomes its prefix. Two words belong to the same group only if they are identical. The prefix extraction logic works naturally in this situation because word[:k] returns the entire string.