LeetCode 442 - Find All Duplicates in an Array

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Sorting

Solution

Problem Understanding

The problem gives us an integer array nums of length n, where every value is guaranteed to be in the range [1, n]. Each number appears either once or twice. Our task is to return all numbers that appear exactly twice.

The important requirement is not just correctness, but efficiency. The solution must run in O(n) time and use only constant auxiliary space, excluding the output array. This restriction immediately rules out many straightforward solutions such as using a hash set or frequency map, because those approaches require additional memory proportional to the size of the input.

The constraints reveal a very important property of the input. Since every number is between 1 and n, every value can be mapped directly to a valid index in the array. Specifically, the value x corresponds naturally to index x - 1. This relationship is the key insight that enables the optimal in place solution.

The input array can contain duplicates, but each number appears at most twice. That guarantee simplifies the logic because we never need to track counts larger than two. If a number is encountered a second time, we know immediately that it belongs in the result.

Several edge cases are important to recognize early. Arrays with no duplicates should return an empty list. Arrays where every duplicated value appears consecutively still need to work correctly. Arrays of size 1 must also be handled safely. Another subtle point is that the optimal solution modifies the array temporarily, so we must ensure the logic still works even after values become negative.

Approaches

Brute Force Approach

The most straightforward solution is to check every element against every other element. For each number, we scan the rest of the array to see whether the same value appears again.

This approach works because it exhaustively compares all pairs of elements. Whenever we find another occurrence of the same number, we add it to the result.

However, this method is inefficient. For an array of size n, each element may require scanning almost the entire remaining array. That leads to O(n^2) time complexity, which becomes too slow when n can be as large as 10^5.

Another simple alternative is using a hash set or dictionary to count frequencies. That improves the runtime to O(n), but it requires O(n) extra space, which violates the constant space requirement.

Optimal Approach

The key observation is that all values are in the range [1, n]. That means every number can be mapped to an index in the array.

We use the input array itself as a bookkeeping structure. When we encounter a number x, we look at index x - 1.

If the value at that index is positive, it means we are seeing x for the first time, so we negate the value at that index to mark it as visited.

If the value at that index is already negative, it means we have visited x before. Therefore, x is a duplicate, and we add it to the result.

This technique works in place and requires only constant extra memory because we reuse the original array for tracking visits.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	`O(n^2)`	`O(1)`	Compare every pair of elements
Optimal	`O(n)`	`O(1)`	Uses index marking with negative signs

Algorithm Walkthrough

Initialize an empty result list that will store all duplicate numbers.
Iterate through each number in the array.
Since numbers may already have been negated during earlier steps, take the absolute value of the current number. This ensures we recover the original value.
Convert the number into an index using index = abs(num) - 1. Because values are guaranteed to be between 1 and n, this index will always be valid.
Check the value stored at nums[index].
If nums[index] is positive, it means this is the first time we have encountered this number. Negate the value at that index to mark it as visited.
If nums[index] is already negative, it means we have seen this number before. Therefore, the current number is a duplicate, so append it to the result list.
Continue until the entire array has been processed.
Return the result list.

Why it works

The algorithm relies on the invariant that the sign of the value at index x - 1 indicates whether the number x has already appeared.

A positive value means the number has not been seen yet. A negative value means it has already been encountered once. Since each number appears at most twice, the second encounter uniquely identifies a duplicate.

Because every value maps to a unique index, there is no ambiguity or collision in the marking process.

Python Solution

from typing import List

class Solution:
    def findDuplicates(self, nums: List[int]) -> List[int]:
        duplicates = []

        for num in nums:
            index = abs(num) - 1

            if nums[index] < 0:
                duplicates.append(abs(num))
            else:
                nums[index] *= -1

        return duplicates

The implementation begins by creating an empty list called duplicates to store numbers that appear twice.

The loop iterates through every value in nums. Because earlier operations may have negated some elements, we use abs(num) to recover the original number.

The expression abs(num) - 1 converts the number into its corresponding array index. We then inspect the value at that location.

If the value is already negative, the number has been seen before, so we append it to the result list.

Otherwise, we negate the value at that index to mark the number as visited.

At the end of the traversal, the duplicates list contains all numbers that appeared twice.

Go Solution

func findDuplicates(nums []int) []int {
    duplicates := []int{}

    for _, num := range nums {
        if num < 0 {
            num = -num
        }

        index := num - 1

        if nums[index] < 0 {
            duplicates = append(duplicates, num)
        } else {
            nums[index] *= -1
        }
    }

    return duplicates
}

The Go implementation follows the same logic as the Python version. Since Go does not provide a built in abs function for integers in the standard library without importing additional packages, the code manually converts negative values back to positive.

Slices in Go behave similarly to dynamic arrays, so appending duplicates works naturally. Returning an empty slice is valid and equivalent to returning no duplicates.

Integer overflow is not a concern because the constraints limit values to at most 10^5.

Worked Examples

Example 1

Input:

nums = [4,3,2,7,8,2,3,1]

Step	Current Number	Target Index	Array State	Result
Start	-	-	`[4,3,2,7,8,2,3,1]`	`[]`
1	4	3	`[4,3,2,-7,8,2,3,1]`	`[]`
2	3	2	`[4,3,-2,-7,8,2,3,1]`	`[]`
3	2	1	`[4,-3,-2,-7,8,2,3,1]`	`[]`
4	7	6	`[4,-3,-2,-7,8,2,-3,1]`	`[]`
5	8	7	`[4,-3,-2,-7,8,2,-3,-1]`	`[]`
6	2	1	`[4,-3,-2,-7,8,2,-3,-1]`	`[2]`
7	3	2	`[4,-3,-2,-7,8,2,-3,-1]`	`[2,3]`
8	1	0	`[-4,-3,-2,-7,8,2,-3,-1]`	`[2,3]`

Final output:

[2,3]

Example 2

Input:

nums = [1,1,2]

Step	Current Number	Target Index	Array State	Result
Start	-	-	`[1,1,2]`	`[]`
1	1	0	`[-1,1,2]`	`[]`
2	1	0	`[-1,1,2]`	`[1]`
3	2	1	`[-1,-1,2]`	`[1]`

Final output:

[1]

Example 3

Input:

nums = [1]

Step	Current Number	Target Index	Array State	Result
Start	-	-	`[1]`	`[]`
1	1	0	`[-1]`	`[]`

Final output:

[]

Complexity Analysis

Measure	Complexity	Explanation
Time	`O(n)`	Each element is processed exactly once
Space	`O(1)`	Uses the input array itself for bookkeeping

The runtime is linear because the algorithm performs a constant amount of work for each element in the array.

The auxiliary space usage is constant because no extra data structures proportional to the input size are allocated. The result list does not count toward auxiliary space since the problem explicitly excludes output storage from the space calculation.

Test Cases

from typing import List

class Solution:
    def findDuplicates(self, nums: List[int]) -> List[int]:
        duplicates = []

        for num in nums:
            index = abs(num) - 1

            if nums[index] < 0:
                duplicates.append(abs(num))
            else:
                nums[index] *= -1

        return duplicates

solution = Solution()

assert solution.findDuplicates([4,3,2,7,8,2,3,1]) == [2,3]  # standard example with two duplicates
assert solution.findDuplicates([1,1,2]) == [1]  # single duplicate
assert solution.findDuplicates([1]) == []  # single element, no duplicates
assert solution.findDuplicates([1,2,3,4,5]) == []  # no duplicates at all
assert solution.findDuplicates([2,2]) == [2]  # smallest valid duplicate case
assert solution.findDuplicates([1,2,2,3,3,4]) == [2,3]  # multiple duplicates
assert solution.findDuplicates([5,4,3,2,1,1]) == [1]  # duplicate at the end
assert solution.findDuplicates([1,1,2,2,3,3]) == [1,2,3]  # every number duplicated
assert solution.findDuplicates([3,3,3]) == [3,3]  # stress behavior outside official constraints

Test	Why
`[4,3,2,7,8,2,3,1]`	Standard mixed example
`[1,1,2]`	Small array with one duplicate
`[1]`	Minimum size input
`[1,2,3,4,5]`	No duplicates present
`[2,2]`	Smallest valid duplicate scenario
`[1,2,2,3,3,4]`	Multiple distinct duplicates
`[5,4,3,2,1,1]`	Duplicate appears at end
`[1,1,2,2,3,3]`	Every value duplicated once
`[3,3,3]`	Invalid input stress test

Edge Cases

One important edge case is an array with no duplicates at all, such as [1,2,3,4]. A buggy implementation might accidentally append values because of incorrect sign handling. In this implementation, every number flips its corresponding index exactly once, so no duplicate is reported unless the index is encountered again.

Another critical edge case is the smallest possible array, [1]. Since the algorithm converts values into indices using value - 1, it is important that index calculations remain valid even at the lower bound. The implementation safely maps 1 to index 0.

A more subtle edge case involves already modified values during iteration. For example, after processing some numbers, parts of the array become negative. If we used the raw value directly, we could produce invalid indices. The implementation avoids this issue by always using abs(num) before computing the index.

Another edge case is when duplicates appear near each other, such as [2,2]. Some incorrect solutions may double count or fail because they modify values too early. Here, the first 2 marks index 1 as negative, and the second 2 immediately detects that the index is already negative, correctly identifying the duplicate exactly once.