Difficulty: 🟡 Medium
Topics: Array, Hash Table, Two Pointers, String, Design

Solution

LeetCode 244 - Shortest Word Distance II

Problem Understanding

This problem asks us to design a reusable data structure that can efficiently answer repeated shortest-distance queries between words in a fixed list of strings.

We are given an array of words, wordsDict, during initialization. After that, we must repeatedly answer queries asking:

What is the minimum index distance between two different words in the original array?

The distance between two words is measured using the absolute difference between their positions in the array.

For example, consider:

["practice", "makes", "perfect", "coding", "makes"]

If we query:

shortest("coding", "practice")

The positions are:

"coding" → index 3
"practice" → index 0

Distance:

|3 - 0| = 3

If we query:

shortest("makes", "coding")

The word "makes" appears multiple times:

index 1, index 4

The word "coding" appears at:

index 3

Possible distances:

|1 - 3| = 2
|4 - 3| = 1

The shortest distance is 1.

The important observation is that the dictionary never changes after initialization. We are allowed to preprocess the input once and answer many future queries. This strongly suggests a preprocessing-based solution.

The constraints provide additional hints:

wordsDict.length <= 3 * 10^4
At most 5000 calls to shortest
word1 and word2 are guaranteed to exist
word1 != word2

A naive approach that scans the entire array for every query may still work within limits, but it is inefficient and ignores the reusable nature of the data structure. Since we may receive thousands of queries, preprocessing becomes worthwhile.

There are also several important guarantees that simplify implementation. We never need to handle missing words because the problem guarantees both query words exist in wordsDict. We also never need to handle identical words because word1 != word2.

However, repeated occurrences of words are a key challenge. A naive implementation may accidentally compare only the first occurrence and miss the actual shortest distance.

Approaches

Brute Force Approach

The most straightforward solution is to scan the entire wordsDict array every time shortest() is called.

For each query, we maintain the most recent positions of word1 and word2 while iterating through the array. Whenever we encounter either word, we update its index and compute the current distance if both words have already been seen.

This approach is correct because it examines every occurrence of both words and continuously updates the minimum distance.

However, this becomes inefficient when multiple queries are made. Each query requires traversing the entire array of length n.

If we receive q queries:

Total work = O(n × q)

With up to 5000 queries and 30000 words, this repeated scanning becomes wasteful.

Optimal Approach

The key insight is that the input array never changes.

Instead of rescanning the array for every query, we can preprocess the positions of every word once.

We build a hash map:

word -> list of indices

Example:

["practice", "makes", "perfect", "coding", "makes"]

Becomes:

{
    "practice": [0],
    "makes": [1, 4],
    "perfect": [2],
    "coding": [3]
}

Now, for a query like:

shortest("makes", "coding")

We only compare:

[1, 4]
and
[3]

Since both lists are naturally sorted by index, we can use the two-pointer technique to efficiently find the minimum absolute difference.

This works similarly to merging two sorted arrays.

At every step:

Compare the current positions
Update the minimum distance
Move the pointer with the smaller index

This avoids unnecessary comparisons and guarantees we eventually examine the best possible pair.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n) per query	O(1)	Scan the entire array for every query
Optimal	O(n) preprocessing + O(k + m) per query	O(n)	Store indices and use two pointers

Where:

n = total number of words
k = occurrences of word1
m = occurrences of word2

Algorithm Walkthrough

Optimal Algorithm

During initialization, create a hash map where each word maps to a list of all indices where it appears.

For example:

"makes" -> [1, 4]

We use a hash map because it provides constant-time lookup of the positions for any word. 2. Traverse the input array once.

For every index i and word wordsDict[i], append i to the corresponding list in the hash map. 3. When shortest(word1, word2) is called, retrieve the two sorted index lists from the hash map.

Example:

word1 = "makes"
word2 = "coding"

[1, 4]
[3]

Initialize two pointers:

i = 0
j = 0

These pointers track our current positions in the two lists. 5. Compare the indices at both pointers.

Compute:

abs(list1[i] - list2[j])

Update the minimum distance. 6. Move the pointer pointing to the smaller index.

If:

list1[i] < list2[j]

increment i.

Otherwise, increment j.

This step is important because moving the larger value cannot reduce the difference. To potentially find a smaller distance, we must advance the smaller index. 7. Continue until one pointer reaches the end of its list. 8. Return the smallest distance found.

Why it works

The correctness relies on the fact that both index lists are sorted.

At every step, the smaller index is the limiting factor. If we keep the smaller index fixed and move the larger one forward, the difference can only stay the same or increase. Therefore, the only chance of finding a smaller distance is to move the smaller pointer forward.

This greedy property guarantees that no optimal pair is skipped, and the algorithm always finds the true minimum distance.

Python Solution

from collections import defaultdict
from typing import List

class WordDistance:

    def __init__(self, wordsDict: List[str]):
        self.word_positions = defaultdict(list)

        for index, word in enumerate(wordsDict):
            self.word_positions[word].append(index)

    def shortest(self, word1: str, word2: str) -> int:
        positions1 = self.word_positions[word1]
        positions2 = self.word_positions[word2]

        pointer1 = 0
        pointer2 = 0
        minimum_distance = float("inf")

        while pointer1 < len(positions1) and pointer2 < len(positions2):
            index1 = positions1[pointer1]
            index2 = positions2[pointer2]

            minimum_distance = min(
                minimum_distance,
                abs(index1 - index2)
            )

            if index1 < index2:
                pointer1 += 1
            else:
                pointer2 += 1

        return minimum_distance

# Your WordDistance object will be instantiated and called as such:
# obj = WordDistance(wordsDict)
# param_1 = obj.shortest(word1, word2)

The implementation begins by preprocessing the dictionary during initialization. A defaultdict(list) stores every occurrence index for each word. Because we scan the array from left to right, indices are automatically stored in sorted order.

Inside shortest(), we retrieve the two index lists corresponding to the queried words.

We then use two pointers to traverse both sorted lists simultaneously. At each step, we calculate the current distance and update the minimum.

The pointer movement rule is critical. We advance whichever pointer references the smaller index because doing so is the only way to potentially reduce the difference.

Once either list is exhausted, every relevant comparison has been considered, and we return the minimum distance.

Go Solution

type WordDistance struct {
	wordPositions map[string][]int
}

func Constructor(wordsDict []string) WordDistance {
	wordPositions := make(map[string][]int)

	for index, word := range wordsDict {
		wordPositions[word] = append(
			wordPositions[word],
			index,
		)
	}

	return WordDistance{
		wordPositions: wordPositions,
	}
}

func (this *WordDistance) Shortest(word1 string, word2 string) int {
	positions1 := this.wordPositions[word1]
	positions2 := this.wordPositions[word2]

	pointer1 := 0
	pointer2 := 0
	minimumDistance := int(^uint(0) >> 1)

	for pointer1 < len(positions1) &&
		pointer2 < len(positions2) {

		index1 := positions1[pointer1]
		index2 := positions2[pointer2]

		distance := index1 - index2
		if distance < 0 {
			distance = -distance
		}

		if distance < minimumDistance {
			minimumDistance = distance
		}

		if index1 < index2 {
			pointer1++
		} else {
			pointer2++
		}
	}

	return minimumDistance
}

/**
 * Your WordDistance object will be instantiated and called as such:
 * obj := Constructor(wordsDict);
 * param_1 := obj.Shortest(word1, word2);
 */

The Go implementation follows the same logic as the Python version. Instead of a defaultdict, we use a map[string][]int.

Because Go does not have a built-in infinity value for integers, we initialize minimumDistance using the maximum possible integer:

int(^uint(0) >> 1)

We also manually compute absolute value since Go lacks a built-in integer abs() function.

Slices in Go naturally grow using append(), making them ideal for storing word positions.

Worked Examples

Example 1

Input:

["practice", "makes", "perfect", "coding", "makes"]

Preprocessed map:

Word	Indices
practice	[0]
makes	[1, 4]
perfect	[2]
coding	[3]

Query 1

shortest("coding", "practice")

Retrieved lists:

coding -> [3]
practice -> [0]

Step	Pointer1	Pointer2	Compared	Distance	Minimum
1	0	0	(3, 0)	3	3

Pointer for smaller value (0) moves.

Loop ends.

Result:

Query 2

shortest("makes", "coding")

Retrieved lists:

makes -> [1, 4]
coding -> [3]

Step	Pointer1	Pointer2	Compared	Distance	Minimum
1	0	0	(1, 3)	2	2
2	1	0	(4, 3)	1	1

The smaller pointer moves each time.

Final result:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n) preprocessing + O(k + m) per query	Build map once, then traverse both occurrence lists
Space	O(n)	Store every word index

The preprocessing step scans the input array exactly once, requiring O(n) time.

For each query, the two-pointer algorithm traverses each occurrence list at most once. If word1 appears k times and word2 appears m times, the runtime is:

O(k + m)

The space complexity is O(n) because every index from the original array is stored exactly once in the hash map.

Test Cases

# Example case
obj = WordDistance(
    ["practice", "makes", "perfect", "coding", "makes"]
)
assert obj.shortest("coding", "practice") == 3  # Provided example
assert obj.shortest("makes", "coding") == 1  # Multiple occurrences

# Minimum input size
obj = WordDistance(["a", "b"])
assert obj.shortest("a", "b") == 1  # Smallest valid array

# Words appearing multiple times
obj = WordDistance(
    ["a", "x", "a", "x", "a"]
)
assert obj.shortest("a", "x") == 1  # Closest adjacent pair

# Words far apart
obj = WordDistance(
    ["a", "b", "c", "d", "e"]
)
assert obj.shortest("a", "e") == 4  # Maximum distance

# One word appears once
obj = WordDistance(
    ["apple", "banana", "apple", "orange"]
)
assert obj.shortest("banana", "orange") == 2  # Single occurrences

# Many repeated words
obj = WordDistance(
    ["a", "b", "a", "b", "a", "b"]
)
assert obj.shortest("a", "b") == 1  # Repeated alternating words

# Adjacent words
obj = WordDistance(
    ["cat", "dog", "mouse"]
)
assert obj.shortest("cat", "dog") == 1  # Immediate neighbors

# Different occurrence distributions
obj = WordDistance(
    ["x", "x", "x", "y", "z", "y"]
)
assert obj.shortest("x", "y") == 1  # Closest occurrence near boundary

Test	Why
Provided example	Verifies expected output
`["a", "b"]`	Tests smallest valid input
Multiple occurrences	Ensures closest pair is chosen
Far apart words	Verifies large distance handling
Single occurrence words	Tests simplest comparison
Alternating repeats	Stresses repeated occurrences
Adjacent words	Tests minimum possible distance
Uneven distributions	Ensures pointer logic works correctly

Edge Cases

Words Appearing Multiple Times

A major source of bugs is assuming each word appears only once.

Example:

["a", "x", "a", "x", "a"]

If we only compare the first occurrences:

|0 - 1| = 1

we may accidentally miss shorter combinations in other inputs. The implementation avoids this problem by storing all positions and using a two-pointer scan over every occurrence.

Very Frequent Queries

Since the problem allows up to 5000 calls to shortest(), repeatedly scanning the entire array would be inefficient.

A naive implementation would repeatedly perform O(n) work. Our preprocessing approach pays the setup cost once during initialization, then answers queries efficiently using precomputed index lists.

One Word Appearing Only Once

Some words may occur exactly once.

Example:

["apple", "banana", "orange"]

When querying:

shortest("banana", "orange")

each list contains only one element.

The two-pointer logic still works naturally because it handles single-element lists without requiring any special cases.

Closest Match at Different Positions

The shortest distance is not always formed by the first occurrence.

Example:

["a", "b", "c", "a", "b"]

For:

shortest("a", "b")

The best answer is:

|3 - 4| = 1

not:

|0 - 1| = 1

In more complicated examples, later occurrences may be strictly better. The two-pointer traversal guarantees every promising candidate pair is considered while avoiding unnecessary comparisons.