LeetCode 2722 - Join Two Arrays by ID

This problem asks us to merge two arrays of JSON objects based on a shared id field. Every object in both arrays contains a unique integer id, and our goal is to produce a single merged array where objects with the same id are combined together.

LeetCode Problem 2722

Difficulty: 🟡 Medium
Topics:

Solution

Problem Understanding

This problem asks us to merge two arrays of JSON objects based on a shared id field. Every object in both arrays contains a unique integer id, and our goal is to produce a single merged array where objects with the same id are combined together.

More specifically, we are given two arrays, arr1 and arr2. Each object inside these arrays represents a collection of key-value pairs, with id acting as the unique identifier. If an id appears in only one array, we simply include that object unchanged in the final result. However, if the same id exists in both arrays, we must merge the two objects into one.

The merging rule is important. When both objects contain the same key, the value from arr2 takes precedence and overwrites the value from arr1. If a key exists in only one object, that key-value pair is preserved. This merge is shallow, meaning nested objects are replaced entirely rather than recursively merged.

The output must contain exactly one object for every unique id, and the final array must be sorted in ascending order by id.

The constraints tell us several important things about the scale of the problem. The JSON string length can be as large as 10^6, which implies the input may contain many objects or large object structures. Because of this, repeatedly scanning arrays to find matching id values would be inefficient. We should aim for a solution close to linear time.

There are also guarantees that simplify the implementation. Within each individual array, id values are unique, meaning we never have duplicate id entries inside arr1 or inside arr2. This removes ambiguity during merging because we only ever need to handle at most one object per id in each array.

Several edge cases are worth identifying upfront. One important case is when the two arrays have completely disjoint id values, in which case the result is effectively a concatenation followed by sorting. Another case is when every id overlaps, meaning every object must be merged and values from arr2 override matching keys. A subtle edge case involves nested objects or arrays, such as in Example 3. Since the merge is shallow, nested values are replaced entirely rather than merged field by field. Finally, one array may be empty, in which case the result is simply the sorted contents of the other array.

Approaches

Brute Force Approach

A straightforward approach is to iterate through every object in arr1 and, for each object, search through arr2 to find a matching id.

If a matching object is found, we merge the two objects and add the result to the output. If no matching object exists, we add the original object unchanged. After processing arr1, we would then iterate through arr2 again and add any objects whose id did not already appear.

This approach produces the correct result because every object is considered and every possible id match is checked. However, the main issue is efficiency. For every object in arr1, we may scan all of arr2, leading to quadratic time complexity.

Given the large input constraints, an O(n × m) solution is too slow.

Optimal Approach, Hash Map by id

The key observation is that object merging is fundamentally an id lookup problem. Instead of repeatedly searching for matching id values, we can store objects in a hash map where the key is the id.

We first insert every object from arr1 into a hash map. Then, as we process arr2, we check whether the id already exists.

If the id does not exist, we simply add the object. If it already exists, we merge the existing object with the current object using dictionary/object merging semantics, ensuring arr2 values override arr1.

After processing both arrays, the hash map contains exactly one merged object for every unique id. We then extract all values and sort them by id.

This works efficiently because hash map lookups and insertions are approximately O(1) on average.

Approach Time Complexity Space Complexity Notes
Brute Force O(n × m) O(n + m) Repeatedly scans arrays to find matching id values
Optimal O(n + m + k log k) O(k) Uses hash map for constant-time lookups, sorts final result

Here, n is the length of arr1, m is the length of arr2, and k is the number of unique id values.

Algorithm Walkthrough

  1. Create a hash map where the key is an id and the value is the corresponding object.

We choose a hash map because it provides near constant-time insertion and lookup. Since the problem revolves around matching id values, this data structure naturally fits the requirement. 2. Iterate through arr1.

Insert each object into the hash map using its id as the key. At this stage, the map contains the initial version of all objects. 3. Iterate through arr2.

For each object:

  • If the id is not already in the map, insert it directly.
  • If the id already exists, merge the existing object with the current object.

The merge should preserve keys from both objects, while keys from arr2 override duplicate keys from arr1. 4. Extract all objects from the hash map.

At this point, every unique id has exactly one merged object stored in the map. 5. Sort the objects by id in ascending order.

Since hash maps do not preserve ordering, sorting ensures we satisfy the problem requirement. 6. Return the sorted array.

Why it works

The algorithm maintains an important invariant: after processing both arrays, the hash map contains exactly one correctly merged object for every unique id.

Initially, inserting arr1 ensures every object exists in the map. Processing arr2 either inserts new objects or updates existing ones according to the overwrite rule. Since every id is unique within each input array, there is never ambiguity about which object should be merged. Finally, sorting guarantees the required ascending order by id.

Python Solution

from typing import List, Dict, Any

class Solution:
    def join(self, arr1: List[Dict[str, Any]], arr2: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        merged_by_id = {}

        # Add all objects from arr1
        for obj in arr1:
            merged_by_id[obj["id"]] = obj.copy()

        # Merge or insert objects from arr2
        for obj in arr2:
            obj_id = obj["id"]

            if obj_id in merged_by_id:
                merged_by_id[obj_id].update(obj)
            else:
                merged_by_id[obj_id] = obj.copy()

        # Return sorted result
        return sorted(
            merged_by_id.values(),
            key=lambda item: item["id"]
        )

The implementation begins by creating a dictionary called merged_by_id, where each key is an id and each value is the corresponding object.

We first iterate through arr1 and insert copies of each object into the dictionary. Using .copy() prevents accidental mutation of the original input data.

Next, we process arr2. For each object, we extract its id. If that id already exists in the dictionary, we merge the new object into the existing one using .update(). This method naturally implements the overwrite rule because duplicate keys from arr2 replace earlier values from arr1.

If the id does not already exist, we insert a copy of the object directly.

Finally, we extract all dictionary values and sort them using the id key to satisfy the required ascending order.

Go Solution

package main

import "sort"

func join(arr1 []map[string]interface{}, arr2 []map[string]interface{}) []map[string]interface{} {
	mergedByID := make(map[int]map[string]interface{})

	// Add all objects from arr1
	for _, obj := range arr1 {
		id := obj["id"].(int)

		copied := make(map[string]interface{})
		for key, value := range obj {
			copied[key] = value
		}

		mergedByID[id] = copied
	}

	// Merge or insert objects from arr2
	for _, obj := range arr2 {
		id := obj["id"].(int)

		if existing, exists := mergedByID[id]; exists {
			for key, value := range obj {
				existing[key] = value
			}
		} else {
			copied := make(map[string]interface{})
			for key, value := range obj {
				copied[key] = value
			}

			mergedByID[id] = copied
		}
	}

	// Convert map to slice
	result := make([]map[string]interface{}, 0, len(mergedByID))
	for _, obj := range mergedByID {
		result = append(result, obj)
	}

	// Sort by id
	sort.Slice(result, func(i, j int) bool {
		return result[i]["id"].(int) < result[j]["id"].(int)
	})

	return result
}

The Go implementation follows the same high-level algorithm as the Python version, but there are some language-specific differences.

Go maps are reference types, so copying objects explicitly is important to avoid unintended mutations of the original input data. Unlike Python's .update() method, Go requires manually iterating over keys and assigning values during merging.

The result is stored in a slice and sorted using sort.Slice, where a custom comparison function orders objects by their id.

There are no integer overflow concerns here because the problem guarantees valid integer id values within reasonable bounds. Also, Go treats nil and empty slices differently, but since we always return a constructed slice, behavior remains consistent.

Worked Examples

Example 1

Input

arr1 = [
    {"id": 1, "x": 1},
    {"id": 2, "x": 9}
]

arr2 = [
    {"id": 3, "x": 5}
]

Execution Trace

Step Action Hash Map State
1 Add {"id":1,"x":1} {1: {"id":1,"x":1}}
2 Add {"id":2,"x":9} {1: {...}, 2: {"id":2,"x":9}}
3 Process {"id":3,"x":5} {1: {...}, 2: {...}, 3: {"id":3,"x":5}}
4 Sort by id [1, 2, 3]

Result

[
    {"id": 1, "x": 1},
    {"id": 2, "x": 9},
    {"id": 3, "x": 5}
]

No id values overlap, so all objects are included unchanged.

Example 2

Input

arr1 = [
    {"id": 1, "x": 2, "y": 3},
    {"id": 2, "x": 3, "y": 6}
]

arr2 = [
    {"id": 2, "x": 10, "y": 20},
    {"id": 3, "x": 0, "y": 0}
]

Execution Trace

Step Action Hash Map State
1 Insert id=1 {1: {"id":1,"x":2,"y":3}}
2 Insert id=2 {1: {...}, 2: {"id":2,"x":3,"y":6}}
3 Merge id=2 from arr2 {"id":2,"x":10,"y":20}
4 Insert id=3 {1: {...}, 2: {...}, 3: {"id":3,"x":0,"y":0}}
5 Sort by id [1, 2, 3]

Result

[
    {"id": 1, "x": 2, "y": 3},
    {"id": 2, "x": 10, "y": 20},
    {"id": 3, "x": 0, "y": 0}
]

The object with id=2 is merged, and arr2 overrides the values of x and y.

Example 3

Input

arr1 = [
    {"id": 1, "b": {"b": 94}, "v": [4, 3], "y": 48}
]

arr2 = [
    {"id": 1, "b": {"c": 84}, "v": [1, 3]}
]

Execution Trace

Step Action Hash Map State
1 Insert object from arr1 {"id":1,"b":{"b":94},"v":[4,3],"y":48}
2 Merge object from arr2 {"id":1,"b":{"c":84},"v":[1,3],"y":48}
3 Sort unchanged

Result

[
    {"id": 1, "b": {"c": 84}, "v": [1, 3], "y": 48}
]

Notice that nested objects are replaced entirely, not recursively merged.

Complexity Analysis

Measure Complexity Explanation
Time O(n + m + k log k) We process both arrays once and sort k unique objects
Space O(k) The hash map stores one object per unique id

The insertion and merging operations each take constant time on average because hash map access is O(1). We process every object exactly once, resulting in O(n + m) work. The only additional cost comes from sorting the final result, which takes O(k log k) where k is the number of unique id values.

Test Cases

solution = Solution()

# Example 1, no overlapping ids
assert solution.join(
    [{"id": 1, "x": 1}, {"id": 2, "x": 9}],
    [{"id": 3, "x": 5}]
) == [
    {"id": 1, "x": 1},
    {"id": 2, "x": 9},
    {"id": 3, "x": 5}
]

# Example 2, overlapping id with overwrite
assert solution.join(
    [{"id": 1, "x": 2, "y": 3}, {"id": 2, "x": 3, "y": 6}],
    [{"id": 2, "x": 10, "y": 20}, {"id": 3, "x": 0, "y": 0}]
) == [
    {"id": 1, "x": 2, "y": 3},
    {"id": 2, "x": 10, "y": 20},
    {"id": 3, "x": 0, "y": 0}
]

# Example 3, nested objects replaced entirely
assert solution.join(
    [{"id": 1, "b": {"b": 94}, "v": [4, 3], "y": 48}],
    [{"id": 1, "b": {"c": 84}, "v": [1, 3]}]
) == [
    {"id": 1, "b": {"c": 84}, "v": [1, 3], "y": 48}
]

# Empty arr2
assert solution.join(
    [{"id": 1, "x": 10}],
    []
) == [
    {"id": 1, "x": 10}
]

# Empty arr1
assert solution.join(
    [],
    [{"id": 2, "x": 20}]
) == [
    {"id": 2, "x": 20}
]

# Complete overlap with multiple keys
assert solution.join(
    [{"id": 1, "a": 1, "b": 2}],
    [{"id": 1, "b": 10, "c": 20}]
) == [
    {"id": 1, "a": 1, "b": 10, "c": 20}
]

# Sorting requirement
assert solution.join(
    [{"id": 5, "x": 1}],
    [{"id": 2, "x": 2}]
) == [
    {"id": 2, "x": 2},
    {"id": 5, "x": 1}
]

# Large id values
assert solution.join(
    [{"id": 1000000, "x": 1}],
    [{"id": 999999, "x": 2}]
) == [
    {"id": 999999, "x": 2},
    {"id": 1000000, "x": 1}
]
Test Why
Example 1 Validates no overlap behavior
Example 2 Validates overwriting from arr2
Example 3 Confirms shallow merge behavior
Empty arr2 Ensures one empty array is handled
Empty arr1 Ensures symmetric empty handling
Complete overlap Tests merging of multiple keys
Sorting requirement Verifies ascending id ordering
Large id values Ensures large integers are handled correctly

Edge Cases

Completely Disjoint id Values

One important edge case occurs when the two arrays have no overlapping id values at all. A naive implementation might incorrectly attempt merging or fail to include objects from both arrays. Our implementation handles this naturally because new id values are simply inserted into the hash map without modification.

Full Overlap Between Arrays

Another important case is when every object in arr1 has a matching id in arr2. This can expose bugs where overwriting does not happen correctly or where original values are accidentally preserved. Since we explicitly use .update() in Python and key reassignment in Go, every overlapping field is correctly overridden by arr2.

Nested Objects and Arrays

The problem includes nested JSON structures, which could mislead developers into implementing recursive merging. However, the requirement is a shallow merge only. For example, if both objects contain a b key with nested objects, the entire value from arr2 replaces the one from arr1. Using dictionary updates naturally enforces this behavior.

Sorting Requirement

Hash maps do not maintain sorted ordering. A common mistake is to return objects directly from the hash map, producing unpredictable order. Our implementation explicitly sorts by id, guaranteeing compliance with the problem specification.