LeetCode 217 - Contains Duplicate
The problem gives an integer array nums and asks whether any number appears more than once. If at least one value is repeated, we return true. If every value appears exactly once, we return false. The input is a list of integers.
Difficulty: 🟢 Easy
Topics: Array, Hash Table, Sorting
Solution
Problem Understanding
The problem gives an integer array nums and asks whether any number appears more than once. If at least one value is repeated, we return true. If every value appears exactly once, we return false.
The input is a list of integers. The output is a boolean value that answers a simple question: does the array contain duplicates?
For example, in [1,2,3,1], the value 1 appears twice, so the answer is true. In [1,2,3,4], every number is unique, so the answer is false.
The constraints are important:
- The array can contain up to
10^5elements. - Values can range from
-10^9to10^9.
An array size of 100,000 means we should avoid algorithms that compare every pair of elements, because quadratic time complexity would become too slow. We need something close to linear time.
The problem guarantees that the array contains at least one element, so we never need to handle an empty input. Still, arrays with only one element are an important edge case because they can never contain duplicates.
Several cases can trip up naive implementations:
- Duplicates appearing far apart in the array, such as
[1,2,3,4,1] - Negative numbers, since values are not limited to positive integers
- Arrays where every element is the same, such as
[7,7,7,7] - Very large arrays, where inefficient solutions will time out
The key challenge is detecting repeated values efficiently.
Approaches
Brute Force Approach
The most direct solution is to compare every element with every other element.
For each index i, we check all later indices j > i. If nums[i] == nums[j], we immediately return true. If we finish all comparisons without finding a match, we return false.
This works because every possible pair is checked. If any duplicate exists, eventually we will compare the two equal values.
The problem is efficiency. With n elements, this approach performs roughly:
$$\frac{n(n-1)}{2}$$
comparisons in the worst case.
For n = 100,000, this becomes far too slow.
Optimal Approach, Hash Set
The key observation is that we do not need to repeatedly scan the array to check whether a value has appeared before. Instead, we can store previously seen values in a hash set.
A hash set provides average O(1) lookup time. As we iterate through the array:
- If the current number is already in the set, we found a duplicate.
- Otherwise, we add the number to the set and continue.
This reduces the total runtime to linear time because each element is processed once.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(1) | Compares every pair of elements |
| Optimal | O(n) | O(n) | Uses a hash set for constant-time lookups |
Algorithm Walkthrough
Optimal Algorithm Using a Hash Set
- Create an empty hash set called
seen.
The set will store all numbers we have already encountered while scanning the array.
2. Iterate through each number in nums.
We process elements one by one from left to right.
3. For the current number, check whether it already exists in seen.
If it does, this means we previously encountered the same value, so the array contains a duplicate. Return true.
4. If the number is not in the set, add it to seen.
This records that the number has now been encountered. 5. Continue until all elements have been processed.
If the loop finishes without finding any repeated value, every element is distinct.
6. Return false.
Why it works
The algorithm maintains the invariant that seen contains exactly the values encountered earlier in the traversal.
When processing a new number:
- If it is already in
seen, then the same value appeared before, so a duplicate exists. - If it is not in
seen, then this is the first occurrence of the value.
Because every element is checked exactly once against all previously seen values, the algorithm correctly detects duplicates whenever they exist.
Python Solution
from typing import List
class Solution:
def containsDuplicate(self, nums: List[int]) -> bool:
seen = set()
for number in nums:
if number in seen:
return True
seen.add(number)
return False
The implementation starts by creating an empty set named seen. Python sets are implemented using hash tables, which provide average constant-time insertion and lookup.
The loop processes each number in the input array. Before adding the current number to the set, the code checks whether it already exists in seen.
If the number is already present, the method immediately returns True because a duplicate has been found.
If the number is not present, it is added to the set so future iterations can detect duplicates involving that value.
If the loop completes without returning early, no duplicate exists, so the method returns False.
Go Solution
func containsDuplicate(nums []int) bool {
seen := make(map[int]bool)
for _, number := range nums {
if seen[number] {
return true
}
seen[number] = true
}
return false
}
Go does not have a built-in set type, so the implementation uses a map with boolean values to simulate a set.
The key is the integer from the array, and the value indicates whether the number has been seen before.
Unlike Python, Go arrays and slices are distinct types. The function accepts a slice []int, which is the standard representation used in LeetCode Go problems.
There are no integer overflow concerns because the algorithm only performs comparisons and hash lookups, not arithmetic operations.
Worked Examples
Example 1
Input:
nums = [1,2,3,1]
Processing steps:
| Step | Current Number | Seen Set Before | Duplicate Found? | Seen Set After |
|---|---|---|---|---|
| 1 | 1 | {} | No | {1} |
| 2 | 2 | {1} | No | {1,2} |
| 3 | 3 | {1,2} | No | {1,2,3} |
| 4 | 1 | {1,2,3} | Yes | Return true |
The second occurrence of 1 is detected immediately.
Example 2
Input:
nums = [1,2,3,4]
Processing steps:
| Step | Current Number | Seen Set Before | Duplicate Found? | Seen Set After |
|---|---|---|---|---|
| 1 | 1 | {} | No | {1} |
| 2 | 2 | {1} | No | {1,2} |
| 3 | 3 | {1,2} | No | {1,2,3} |
| 4 | 4 | {1,2,3} | No | {1,2,3,4} |
The loop finishes without finding duplicates, so the answer is false.
Example 3
Input:
nums = [1,1,1,3,3,4,3,2,4,2]
Processing steps:
| Step | Current Number | Seen Set Before | Duplicate Found? |
|---|---|---|---|
| 1 | 1 | {} | No |
| 2 | 1 | {1} | Yes |
The algorithm stops immediately after finding the second 1.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Each element is processed once |
| Space | O(n) | The hash set may store every unique element |
The runtime is linear because each array element is checked and inserted into the hash set at most once. Hash table operations are average O(1).
The space complexity is linear in the worst case because if all values are unique, the set stores all n elements.
Test Cases
solution = Solution()
assert solution.containsDuplicate([1, 2, 3, 1]) is True # duplicate exists
assert solution.containsDuplicate([1, 2, 3, 4]) is False # all unique
assert solution.containsDuplicate([1, 1, 1, 1]) is True # all identical
assert solution.containsDuplicate([42]) is False # single element
assert solution.containsDuplicate([-1, -2, -3, -1]) is True # negative duplicate
assert solution.containsDuplicate([0, 1, 2, 3, 0]) is True # duplicate at ends
assert solution.containsDuplicate([5, 4, 3, 2, 1]) is False # descending unique
assert solution.containsDuplicate([100000, -100000]) is False # extreme values
assert solution.containsDuplicate([2, 14, 18, 22, 22]) is True # duplicate near end
assert solution.containsDuplicate(list(range(10000))) is False # large unique array
Test Summary
| Test | Why |
|---|---|
[1,2,3,1] |
Basic duplicate detection |
[1,2,3,4] |
Confirms unique arrays return false |
[1,1,1,1] |
Tests repeated identical values |
[42] |
Smallest valid input |
[-1,-2,-3,-1] |
Ensures negative values work correctly |
[0,1,2,3,0] |
Duplicate appears far apart |
[5,4,3,2,1] |
Unique elements in reverse order |
[100000,-100000] |
Large positive and negative values |
[2,14,18,22,22] |
Duplicate near the end |
range(10000) |
Stress test for performance |
Edge Cases
Single Element Array
An array with only one element can never contain duplicates because there is no second occurrence possible.
A buggy implementation might incorrectly assume at least two elements exist and attempt invalid comparisons. The hash set approach handles this naturally. The loop runs once, adds the element to the set, and returns False.
Duplicate Appears Late
Consider an input like [1,2,3,4,5,1].
A naive implementation that incorrectly stops early or fails to track all previous elements might miss the duplicate because the repeated value is far from its first occurrence.
The hash set solution correctly stores every previously seen value, so the final 1 is detected immediately.
All Elements Identical
For input [7,7,7,7], duplicates appear immediately.
Some inefficient implementations may continue scanning even after finding a duplicate, wasting time unnecessarily.
This implementation returns True as soon as the second 7 is encountered, providing an early exit and optimal behavior.