LeetCode 734 - Sentence Similarity
The problem gives us two sentences, where each sentence is represented as an array of words. We are also given a list of word pairs that define which words are considered similar. Our task is to determine whether the two sentences are similar according to the following rules: 1.
Difficulty: 🟢 Easy
Topics: Array, Hash Table, String
Solution
Problem Understanding
The problem gives us two sentences, where each sentence is represented as an array of words. We are also given a list of word pairs that define which words are considered similar.
Our task is to determine whether the two sentences are similar according to the following rules:
- The sentences must contain the same number of words.
- At every index
i, the wordsentence1[i]must either:
- be exactly equal to
sentence2[i], or - appear as a similar pair in
similarPairs.
An important detail is that similarity is not transitive. If "great" is similar to "fine" and "fine" is similar to "excellent", that does not mean "great" is similar to "excellent".
Another important detail is that similarity is symmetric. If "great" is similar to "fine", then "fine" should also be considered similar to "great".
The constraints are relatively small:
- Up to 1000 words in each sentence
- Up to 1000 similarity pairs
These limits are small enough that even moderately inefficient solutions may pass, but we should still aim for an efficient implementation.
Several edge cases are important:
- The two sentences may have different lengths, in which case the answer is immediately
false. - A word is always similar to itself, even if it does not appear in
similarPairs. - The similarity list may be empty.
- Similarity is not transitive, so we must only check direct pair relationships.
- Similar pairs are distinct, so we do not need to handle duplicate entries.
Approaches
Brute Force Approach
A straightforward approach is to compare the two sentences word by word. For every pair of words, we scan through the entire similarPairs array to see whether the pair exists.
For each position i:
-
If the two words are identical, continue.
-
Otherwise, iterate through all pairs in
similarPairs. -
Check whether either:
-
(word1, word2)exists, or -
(word2, word1)exists. -
If no matching pair is found, return
false.
This approach is correct because it explicitly checks every allowed similarity relation. However, it is inefficient because for every word position we may scan the entire similarity list.
If:
nis the sentence lengthmis the number of similar pairs
then the time complexity becomes O(n * m).
With constraints up to 1000, this still passes comfortably, but we can do better.
Optimal Approach
The key observation is that repeated linear searches through similarPairs are unnecessary. We can preprocess all similarity relationships into a hash set for constant time lookup.
We store each pair in a set using tuples such as:
("great", "fine")
Since similarity is symmetric, we insert both directions:
("great", "fine")
("fine", "great")
Now, when comparing two words:
- If the words are equal, they are automatically similar.
- Otherwise, we simply check whether the tuple exists in the hash set.
Hash set lookup is O(1) on average, which reduces the total complexity to O(n + m).
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n * m) | O(1) | Scans every similarity pair for each word comparison |
| Optimal | O(n + m) | O(m) | Uses a hash set for constant time similarity lookup |
Algorithm Walkthrough
- First, check whether the two sentences have the same length.
If the lengths differ, the sentences cannot be similar according to the problem definition, so immediately return false.
2. Create a hash set to store similarity relationships.
For every pair [a, b] in similarPairs, insert:
(a, b)(b, a)
We insert both directions because similarity is symmetric. 3. Iterate through both sentences simultaneously.
For each index i:
- Let
word1 = sentence1[i] - Let
word2 = sentence2[i]
- Check whether the words are identical.
If word1 == word2, continue to the next position because a word is always similar to itself.
5. Otherwise, check whether (word1, word2) exists in the hash set.
If it does not exist, the words are not similar, so return false.
6. If all word pairs pass the similarity check, return true.
Why it works
The algorithm works because every valid similarity relationship is explicitly stored in the hash set. For every corresponding word pair in the two sentences, we verify one of the only two allowed conditions:
- the words are identical, or
- the pair exists in the similarity relation.
Since the problem explicitly states that similarity is not transitive, checking only direct relationships is exactly what the problem requires.
Python Solution
from typing import List
class Solution:
def areSentencesSimilar(
self,
sentence1: List[str],
sentence2: List[str],
similarPairs: List[List[str]]
) -> bool:
if len(sentence1) != len(sentence2):
return False
similar_set = set()
for word1, word2 in similarPairs:
similar_set.add((word1, word2))
similar_set.add((word2, word1))
for word1, word2 in zip(sentence1, sentence2):
if word1 == word2:
continue
if (word1, word2) not in similar_set:
return False
return True
The implementation begins by checking sentence lengths because differing lengths automatically invalidate similarity.
Next, we build a hash set containing all valid similarity relationships. We insert both directions of each pair to make lookups symmetric.
The main loop uses zip() to iterate through both sentences simultaneously. For each word pair:
- If the words are identical, we continue immediately.
- Otherwise, we check whether the pair exists in the hash set.
If any pair fails the check, the function immediately returns False. If the loop completes successfully, all word pairs are similar, so we return True.
Go Solution
func areSentencesSimilar(sentence1 []string, sentence2 []string, similarPairs [][]string) bool {
if len(sentence1) != len(sentence2) {
return false
}
similarSet := make(map[string]bool)
for _, pair := range similarPairs {
word1 := pair[0]
word2 := pair[1]
similarSet[word1+"#"+word2] = true
similarSet[word2+"#"+word1] = true
}
for i := 0; i < len(sentence1); i++ {
word1 := sentence1[i]
word2 := sentence2[i]
if word1 == word2 {
continue
}
key := word1 + "#" + word2
if !similarSet[key] {
return false
}
}
return true
}
In Go, tuples do not exist as conveniently as in Python, so we encode each pair into a single string key using a separator such as "#".
The Go version otherwise follows exactly the same logic:
- Check sentence lengths
- Store both similarity directions
- Compare corresponding words
- Return
falseimmediately on failure
Go maps naturally support constant time average lookup, making the performance equivalent to the Python solution.
Worked Examples
Example 1
sentence1 = ["great","acting","skills"]
sentence2 = ["fine","drama","talent"]
similarPairs = [
["great","fine"],
["drama","acting"],
["skills","talent"]
]
Step 1: Build similarity set
| Inserted Pair |
|---|
| ("great", "fine") |
| ("fine", "great") |
| ("drama", "acting") |
| ("acting", "drama") |
| ("skills", "talent") |
| ("talent", "skills") |
Step 2: Compare words
| Index | sentence1 | sentence2 | Similar? |
|---|---|---|---|
| 0 | great | fine | Yes |
| 1 | acting | drama | Yes |
| 2 | skills | talent | Yes |
All pairs are similar, so the answer is:
true
Example 2
sentence1 = ["great"]
sentence2 = ["great"]
similarPairs = []
Step 1: Length check
Both lengths are 1, so continue.
Step 2: Compare words
| Index | sentence1 | sentence2 | Equal? |
|---|---|---|---|
| 0 | great | great | Yes |
A word is always similar to itself.
Result:
true
Example 3
sentence1 = ["great"]
sentence2 = ["doubleplus", "good"]
Step 1: Length check
| sentence1 length | sentence2 length |
|---|---|
| 1 | 2 |
Lengths differ, so we immediately return:
false
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | Building the hash set takes O(m), comparing sentence words takes O(n) |
| Space | O(m) | The hash set stores all similarity pairs |
The dominant operations are building the similarity lookup structure and iterating through the sentences once. Hash set operations are constant time on average, which allows the algorithm to scale linearly with input size.
Test Cases
from typing import List
class Solution:
def areSentencesSimilar(
self,
sentence1: List[str],
sentence2: List[str],
similarPairs: List[List[str]]
) -> bool:
if len(sentence1) != len(sentence2):
return False
similar_set = set()
for word1, word2 in similarPairs:
similar_set.add((word1, word2))
similar_set.add((word2, word1))
for word1, word2 in zip(sentence1, sentence2):
if word1 == word2:
continue
if (word1, word2) not in similar_set:
return False
return True
solution = Solution()
# Provided example 1
assert solution.areSentencesSimilar(
["great", "acting", "skills"],
["fine", "drama", "talent"],
[["great", "fine"], ["drama", "acting"], ["skills", "talent"]]
) is True
# Provided example 2
assert solution.areSentencesSimilar(
["great"],
["great"],
[]
) is True
# Provided example 3
assert solution.areSentencesSimilar(
["great"],
["doubleplus", "good"],
[["great", "doubleplus"]]
) is False
# Different lengths
assert solution.areSentencesSimilar(
["a", "b"],
["a"],
[]
) is False
# Direct similarity exists
assert solution.areSentencesSimilar(
["hello"],
["hi"],
[["hello", "hi"]]
) is True
# Reverse direction should work
assert solution.areSentencesSimilar(
["hi"],
["hello"],
[["hello", "hi"]]
) is True
# Non-transitive similarity
assert solution.areSentencesSimilar(
["a"],
["c"],
[["a", "b"], ["b", "c"]]
) is False
# Empty similarity list with unequal words
assert solution.areSentencesSimilar(
["one"],
["two"],
[]
) is False
# Multiple identical words
assert solution.areSentencesSimilar(
["same", "same"],
["same", "same"],
[]
) is True
# Mixed identical and similar words
assert solution.areSentencesSimilar(
["great", "movie"],
["fine", "movie"],
[["great", "fine"]]
) is True
| Test | Why |
|---|---|
| Example 1 | Validates normal similarity matching |
| Example 2 | Validates self similarity |
| Example 3 | Validates length mismatch handling |
| Different lengths | Confirms immediate rejection |
| Direct similarity exists | Tests standard pair lookup |
| Reverse direction should work | Verifies symmetry handling |
| Non-transitive similarity | Ensures transitivity is not assumed |
| Empty similarity list with unequal words | Confirms unrelated words fail |
| Multiple identical words | Tests repeated self similarity |
| Mixed identical and similar words | Tests combined scenarios |
Edge Cases
Different Sentence Lengths
If the two sentences contain different numbers of words, they cannot possibly be similar because similarity is defined position by position.
A common mistake is attempting to compare words until one sentence ends. That would incorrectly ignore unmatched trailing words. The implementation prevents this by checking lengths before any other processing.
Words Not Present in Similar Pairs
Some words may never appear in similarPairs. That does not automatically make them invalid because a word is always similar to itself.
For example:
sentence1 = ["leetcode"]
sentence2 = ["leetcode"]
Even with an empty similarity list, the answer should still be true.
The implementation handles this correctly using the equality check:
if word1 == word2:
continue
Non-Transitive Relationships
This is the most important logical edge case.
Suppose:
"a" ~ "b"
"b" ~ "c"
The problem explicitly states that "a" is not necessarily similar to "c".
A common mistake is using Union Find or graph connectivity, which would incorrectly treat similarity as transitive.
The implementation avoids this problem entirely by checking only direct pair membership in the hash set.