LeetCode 217 - Contains Duplicate

The problem gives an integer array nums and asks whether any number appears more than once. If at least one value is repeated, we return true. If every value appears exactly once, we return false. The input is a list of integers.

LeetCode Problem 217

Difficulty: 🟢 Easy
Topics: Array, Hash Table, Sorting

Solution

Problem Understanding

The problem gives an integer array nums and asks whether any number appears more than once. If at least one value is repeated, we return true. If every value appears exactly once, we return false.

The input is a list of integers. The output is a boolean value that answers a simple question: does the array contain duplicates?

For example, in [1,2,3,1], the value 1 appears twice, so the answer is true. In [1,2,3,4], every number is unique, so the answer is false.

The constraints are important:

  • The array can contain up to 10^5 elements.
  • Values can range from -10^9 to 10^9.

An array size of 100,000 means we should avoid algorithms that compare every pair of elements, because quadratic time complexity would become too slow. We need something close to linear time.

The problem guarantees that the array contains at least one element, so we never need to handle an empty input. Still, arrays with only one element are an important edge case because they can never contain duplicates.

Several cases can trip up naive implementations:

  • Duplicates appearing far apart in the array, such as [1,2,3,4,1]
  • Negative numbers, since values are not limited to positive integers
  • Arrays where every element is the same, such as [7,7,7,7]
  • Very large arrays, where inefficient solutions will time out

The key challenge is detecting repeated values efficiently.

Approaches

Brute Force Approach

The most direct solution is to compare every element with every other element.

For each index i, we check all later indices j > i. If nums[i] == nums[j], we immediately return true. If we finish all comparisons without finding a match, we return false.

This works because every possible pair is checked. If any duplicate exists, eventually we will compare the two equal values.

The problem is efficiency. With n elements, this approach performs roughly:

$$\frac{n(n-1)}{2}$$

comparisons in the worst case.

For n = 100,000, this becomes far too slow.

Optimal Approach, Hash Set

The key observation is that we do not need to repeatedly scan the array to check whether a value has appeared before. Instead, we can store previously seen values in a hash set.

A hash set provides average O(1) lookup time. As we iterate through the array:

  • If the current number is already in the set, we found a duplicate.
  • Otherwise, we add the number to the set and continue.

This reduces the total runtime to linear time because each element is processed once.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(1) Compares every pair of elements
Optimal O(n) O(n) Uses a hash set for constant-time lookups

Algorithm Walkthrough

Optimal Algorithm Using a Hash Set

  1. Create an empty hash set called seen.

The set will store all numbers we have already encountered while scanning the array. 2. Iterate through each number in nums.

We process elements one by one from left to right. 3. For the current number, check whether it already exists in seen.

If it does, this means we previously encountered the same value, so the array contains a duplicate. Return true. 4. If the number is not in the set, add it to seen.

This records that the number has now been encountered. 5. Continue until all elements have been processed.

If the loop finishes without finding any repeated value, every element is distinct. 6. Return false.

Why it works

The algorithm maintains the invariant that seen contains exactly the values encountered earlier in the traversal.

When processing a new number:

  • If it is already in seen, then the same value appeared before, so a duplicate exists.
  • If it is not in seen, then this is the first occurrence of the value.

Because every element is checked exactly once against all previously seen values, the algorithm correctly detects duplicates whenever they exist.

Python Solution

from typing import List

class Solution:
    def containsDuplicate(self, nums: List[int]) -> bool:
        seen = set()

        for number in nums:
            if number in seen:
                return True

            seen.add(number)

        return False

The implementation starts by creating an empty set named seen. Python sets are implemented using hash tables, which provide average constant-time insertion and lookup.

The loop processes each number in the input array. Before adding the current number to the set, the code checks whether it already exists in seen.

If the number is already present, the method immediately returns True because a duplicate has been found.

If the number is not present, it is added to the set so future iterations can detect duplicates involving that value.

If the loop completes without returning early, no duplicate exists, so the method returns False.

Go Solution

func containsDuplicate(nums []int) bool {
    seen := make(map[int]bool)

    for _, number := range nums {
        if seen[number] {
            return true
        }

        seen[number] = true
    }

    return false
}

Go does not have a built-in set type, so the implementation uses a map with boolean values to simulate a set.

The key is the integer from the array, and the value indicates whether the number has been seen before.

Unlike Python, Go arrays and slices are distinct types. The function accepts a slice []int, which is the standard representation used in LeetCode Go problems.

There are no integer overflow concerns because the algorithm only performs comparisons and hash lookups, not arithmetic operations.

Worked Examples

Example 1

Input:

nums = [1,2,3,1]

Processing steps:

Step Current Number Seen Set Before Duplicate Found? Seen Set After
1 1 {} No {1}
2 2 {1} No {1,2}
3 3 {1,2} No {1,2,3}
4 1 {1,2,3} Yes Return true

The second occurrence of 1 is detected immediately.

Example 2

Input:

nums = [1,2,3,4]

Processing steps:

Step Current Number Seen Set Before Duplicate Found? Seen Set After
1 1 {} No {1}
2 2 {1} No {1,2}
3 3 {1,2} No {1,2,3}
4 4 {1,2,3} No {1,2,3,4}

The loop finishes without finding duplicates, so the answer is false.

Example 3

Input:

nums = [1,1,1,3,3,4,3,2,4,2]

Processing steps:

Step Current Number Seen Set Before Duplicate Found?
1 1 {} No
2 1 {1} Yes

The algorithm stops immediately after finding the second 1.

Complexity Analysis

Measure Complexity Explanation
Time O(n) Each element is processed once
Space O(n) The hash set may store every unique element

The runtime is linear because each array element is checked and inserted into the hash set at most once. Hash table operations are average O(1).

The space complexity is linear in the worst case because if all values are unique, the set stores all n elements.

Test Cases

solution = Solution()

assert solution.containsDuplicate([1, 2, 3, 1]) is True       # duplicate exists
assert solution.containsDuplicate([1, 2, 3, 4]) is False      # all unique
assert solution.containsDuplicate([1, 1, 1, 1]) is True       # all identical
assert solution.containsDuplicate([42]) is False               # single element
assert solution.containsDuplicate([-1, -2, -3, -1]) is True   # negative duplicate
assert solution.containsDuplicate([0, 1, 2, 3, 0]) is True    # duplicate at ends
assert solution.containsDuplicate([5, 4, 3, 2, 1]) is False   # descending unique
assert solution.containsDuplicate([100000, -100000]) is False  # extreme values
assert solution.containsDuplicate([2, 14, 18, 22, 22]) is True # duplicate near end
assert solution.containsDuplicate(list(range(10000))) is False # large unique array

Test Summary

Test Why
[1,2,3,1] Basic duplicate detection
[1,2,3,4] Confirms unique arrays return false
[1,1,1,1] Tests repeated identical values
[42] Smallest valid input
[-1,-2,-3,-1] Ensures negative values work correctly
[0,1,2,3,0] Duplicate appears far apart
[5,4,3,2,1] Unique elements in reverse order
[100000,-100000] Large positive and negative values
[2,14,18,22,22] Duplicate near the end
range(10000) Stress test for performance

Edge Cases

Single Element Array

An array with only one element can never contain duplicates because there is no second occurrence possible.

A buggy implementation might incorrectly assume at least two elements exist and attempt invalid comparisons. The hash set approach handles this naturally. The loop runs once, adds the element to the set, and returns False.

Duplicate Appears Late

Consider an input like [1,2,3,4,5,1].

A naive implementation that incorrectly stops early or fails to track all previous elements might miss the duplicate because the repeated value is far from its first occurrence.

The hash set solution correctly stores every previously seen value, so the final 1 is detected immediately.

All Elements Identical

For input [7,7,7,7], duplicates appear immediately.

Some inefficient implementations may continue scanning even after finding a duplicate, wasting time unnecessarily.

This implementation returns True as soon as the second 7 is encountered, providing an early exit and optimal behavior.