LeetCode 2722 - Join Two Arrays by ID
This problem asks us to merge two arrays of JSON objects based on a shared id field. Every object in both arrays contains a unique integer id, and our goal is to produce a single merged array where objects with the same id are combined together.
Difficulty: 🟡 Medium
Topics: —
Solution
Problem Understanding
This problem asks us to merge two arrays of JSON objects based on a shared id field. Every object in both arrays contains a unique integer id, and our goal is to produce a single merged array where objects with the same id are combined together.
More specifically, we are given two arrays, arr1 and arr2. Each object inside these arrays represents a collection of key-value pairs, with id acting as the unique identifier. If an id appears in only one array, we simply include that object unchanged in the final result. However, if the same id exists in both arrays, we must merge the two objects into one.
The merging rule is important. When both objects contain the same key, the value from arr2 takes precedence and overwrites the value from arr1. If a key exists in only one object, that key-value pair is preserved. This merge is shallow, meaning nested objects are replaced entirely rather than recursively merged.
The output must contain exactly one object for every unique id, and the final array must be sorted in ascending order by id.
The constraints tell us several important things about the scale of the problem. The JSON string length can be as large as 10^6, which implies the input may contain many objects or large object structures. Because of this, repeatedly scanning arrays to find matching id values would be inefficient. We should aim for a solution close to linear time.
There are also guarantees that simplify the implementation. Within each individual array, id values are unique, meaning we never have duplicate id entries inside arr1 or inside arr2. This removes ambiguity during merging because we only ever need to handle at most one object per id in each array.
Several edge cases are worth identifying upfront. One important case is when the two arrays have completely disjoint id values, in which case the result is effectively a concatenation followed by sorting. Another case is when every id overlaps, meaning every object must be merged and values from arr2 override matching keys. A subtle edge case involves nested objects or arrays, such as in Example 3. Since the merge is shallow, nested values are replaced entirely rather than merged field by field. Finally, one array may be empty, in which case the result is simply the sorted contents of the other array.
Approaches
Brute Force Approach
A straightforward approach is to iterate through every object in arr1 and, for each object, search through arr2 to find a matching id.
If a matching object is found, we merge the two objects and add the result to the output. If no matching object exists, we add the original object unchanged. After processing arr1, we would then iterate through arr2 again and add any objects whose id did not already appear.
This approach produces the correct result because every object is considered and every possible id match is checked. However, the main issue is efficiency. For every object in arr1, we may scan all of arr2, leading to quadratic time complexity.
Given the large input constraints, an O(n × m) solution is too slow.
Optimal Approach, Hash Map by id
The key observation is that object merging is fundamentally an id lookup problem. Instead of repeatedly searching for matching id values, we can store objects in a hash map where the key is the id.
We first insert every object from arr1 into a hash map. Then, as we process arr2, we check whether the id already exists.
If the id does not exist, we simply add the object. If it already exists, we merge the existing object with the current object using dictionary/object merging semantics, ensuring arr2 values override arr1.
After processing both arrays, the hash map contains exactly one merged object for every unique id. We then extract all values and sort them by id.
This works efficiently because hash map lookups and insertions are approximately O(1) on average.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n × m) | O(n + m) | Repeatedly scans arrays to find matching id values |
| Optimal | O(n + m + k log k) | O(k) | Uses hash map for constant-time lookups, sorts final result |
Here, n is the length of arr1, m is the length of arr2, and k is the number of unique id values.
Algorithm Walkthrough
- Create a hash map where the key is an
idand the value is the corresponding object.
We choose a hash map because it provides near constant-time insertion and lookup. Since the problem revolves around matching id values, this data structure naturally fits the requirement.
2. Iterate through arr1.
Insert each object into the hash map using its id as the key. At this stage, the map contains the initial version of all objects.
3. Iterate through arr2.
For each object:
- If the
idis not already in the map, insert it directly. - If the
idalready exists, merge the existing object with the current object.
The merge should preserve keys from both objects, while keys from arr2 override duplicate keys from arr1.
4. Extract all objects from the hash map.
At this point, every unique id has exactly one merged object stored in the map.
5. Sort the objects by id in ascending order.
Since hash maps do not preserve ordering, sorting ensures we satisfy the problem requirement. 6. Return the sorted array.
Why it works
The algorithm maintains an important invariant: after processing both arrays, the hash map contains exactly one correctly merged object for every unique id.
Initially, inserting arr1 ensures every object exists in the map. Processing arr2 either inserts new objects or updates existing ones according to the overwrite rule. Since every id is unique within each input array, there is never ambiguity about which object should be merged. Finally, sorting guarantees the required ascending order by id.
Python Solution
from typing import List, Dict, Any
class Solution:
def join(self, arr1: List[Dict[str, Any]], arr2: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
merged_by_id = {}
# Add all objects from arr1
for obj in arr1:
merged_by_id[obj["id"]] = obj.copy()
# Merge or insert objects from arr2
for obj in arr2:
obj_id = obj["id"]
if obj_id in merged_by_id:
merged_by_id[obj_id].update(obj)
else:
merged_by_id[obj_id] = obj.copy()
# Return sorted result
return sorted(
merged_by_id.values(),
key=lambda item: item["id"]
)
The implementation begins by creating a dictionary called merged_by_id, where each key is an id and each value is the corresponding object.
We first iterate through arr1 and insert copies of each object into the dictionary. Using .copy() prevents accidental mutation of the original input data.
Next, we process arr2. For each object, we extract its id. If that id already exists in the dictionary, we merge the new object into the existing one using .update(). This method naturally implements the overwrite rule because duplicate keys from arr2 replace earlier values from arr1.
If the id does not already exist, we insert a copy of the object directly.
Finally, we extract all dictionary values and sort them using the id key to satisfy the required ascending order.
Go Solution
package main
import "sort"
func join(arr1 []map[string]interface{}, arr2 []map[string]interface{}) []map[string]interface{} {
mergedByID := make(map[int]map[string]interface{})
// Add all objects from arr1
for _, obj := range arr1 {
id := obj["id"].(int)
copied := make(map[string]interface{})
for key, value := range obj {
copied[key] = value
}
mergedByID[id] = copied
}
// Merge or insert objects from arr2
for _, obj := range arr2 {
id := obj["id"].(int)
if existing, exists := mergedByID[id]; exists {
for key, value := range obj {
existing[key] = value
}
} else {
copied := make(map[string]interface{})
for key, value := range obj {
copied[key] = value
}
mergedByID[id] = copied
}
}
// Convert map to slice
result := make([]map[string]interface{}, 0, len(mergedByID))
for _, obj := range mergedByID {
result = append(result, obj)
}
// Sort by id
sort.Slice(result, func(i, j int) bool {
return result[i]["id"].(int) < result[j]["id"].(int)
})
return result
}
The Go implementation follows the same high-level algorithm as the Python version, but there are some language-specific differences.
Go maps are reference types, so copying objects explicitly is important to avoid unintended mutations of the original input data. Unlike Python's .update() method, Go requires manually iterating over keys and assigning values during merging.
The result is stored in a slice and sorted using sort.Slice, where a custom comparison function orders objects by their id.
There are no integer overflow concerns here because the problem guarantees valid integer id values within reasonable bounds. Also, Go treats nil and empty slices differently, but since we always return a constructed slice, behavior remains consistent.
Worked Examples
Example 1
Input
arr1 = [
{"id": 1, "x": 1},
{"id": 2, "x": 9}
]
arr2 = [
{"id": 3, "x": 5}
]
Execution Trace
| Step | Action | Hash Map State |
|---|---|---|
| 1 | Add {"id":1,"x":1} |
{1: {"id":1,"x":1}} |
| 2 | Add {"id":2,"x":9} |
{1: {...}, 2: {"id":2,"x":9}} |
| 3 | Process {"id":3,"x":5} |
{1: {...}, 2: {...}, 3: {"id":3,"x":5}} |
| 4 | Sort by id |
[1, 2, 3] |
Result
[
{"id": 1, "x": 1},
{"id": 2, "x": 9},
{"id": 3, "x": 5}
]
No id values overlap, so all objects are included unchanged.
Example 2
Input
arr1 = [
{"id": 1, "x": 2, "y": 3},
{"id": 2, "x": 3, "y": 6}
]
arr2 = [
{"id": 2, "x": 10, "y": 20},
{"id": 3, "x": 0, "y": 0}
]
Execution Trace
| Step | Action | Hash Map State |
|---|---|---|
| 1 | Insert id=1 |
{1: {"id":1,"x":2,"y":3}} |
| 2 | Insert id=2 |
{1: {...}, 2: {"id":2,"x":3,"y":6}} |
| 3 | Merge id=2 from arr2 |
{"id":2,"x":10,"y":20} |
| 4 | Insert id=3 |
{1: {...}, 2: {...}, 3: {"id":3,"x":0,"y":0}} |
| 5 | Sort by id |
[1, 2, 3] |
Result
[
{"id": 1, "x": 2, "y": 3},
{"id": 2, "x": 10, "y": 20},
{"id": 3, "x": 0, "y": 0}
]
The object with id=2 is merged, and arr2 overrides the values of x and y.
Example 3
Input
arr1 = [
{"id": 1, "b": {"b": 94}, "v": [4, 3], "y": 48}
]
arr2 = [
{"id": 1, "b": {"c": 84}, "v": [1, 3]}
]
Execution Trace
| Step | Action | Hash Map State |
|---|---|---|
| 1 | Insert object from arr1 |
{"id":1,"b":{"b":94},"v":[4,3],"y":48} |
| 2 | Merge object from arr2 |
{"id":1,"b":{"c":84},"v":[1,3],"y":48} |
| 3 | Sort | unchanged |
Result
[
{"id": 1, "b": {"c": 84}, "v": [1, 3], "y": 48}
]
Notice that nested objects are replaced entirely, not recursively merged.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m + k log k) | We process both arrays once and sort k unique objects |
| Space | O(k) | The hash map stores one object per unique id |
The insertion and merging operations each take constant time on average because hash map access is O(1). We process every object exactly once, resulting in O(n + m) work. The only additional cost comes from sorting the final result, which takes O(k log k) where k is the number of unique id values.
Test Cases
solution = Solution()
# Example 1, no overlapping ids
assert solution.join(
[{"id": 1, "x": 1}, {"id": 2, "x": 9}],
[{"id": 3, "x": 5}]
) == [
{"id": 1, "x": 1},
{"id": 2, "x": 9},
{"id": 3, "x": 5}
]
# Example 2, overlapping id with overwrite
assert solution.join(
[{"id": 1, "x": 2, "y": 3}, {"id": 2, "x": 3, "y": 6}],
[{"id": 2, "x": 10, "y": 20}, {"id": 3, "x": 0, "y": 0}]
) == [
{"id": 1, "x": 2, "y": 3},
{"id": 2, "x": 10, "y": 20},
{"id": 3, "x": 0, "y": 0}
]
# Example 3, nested objects replaced entirely
assert solution.join(
[{"id": 1, "b": {"b": 94}, "v": [4, 3], "y": 48}],
[{"id": 1, "b": {"c": 84}, "v": [1, 3]}]
) == [
{"id": 1, "b": {"c": 84}, "v": [1, 3], "y": 48}
]
# Empty arr2
assert solution.join(
[{"id": 1, "x": 10}],
[]
) == [
{"id": 1, "x": 10}
]
# Empty arr1
assert solution.join(
[],
[{"id": 2, "x": 20}]
) == [
{"id": 2, "x": 20}
]
# Complete overlap with multiple keys
assert solution.join(
[{"id": 1, "a": 1, "b": 2}],
[{"id": 1, "b": 10, "c": 20}]
) == [
{"id": 1, "a": 1, "b": 10, "c": 20}
]
# Sorting requirement
assert solution.join(
[{"id": 5, "x": 1}],
[{"id": 2, "x": 2}]
) == [
{"id": 2, "x": 2},
{"id": 5, "x": 1}
]
# Large id values
assert solution.join(
[{"id": 1000000, "x": 1}],
[{"id": 999999, "x": 2}]
) == [
{"id": 999999, "x": 2},
{"id": 1000000, "x": 1}
]
| Test | Why |
|---|---|
| Example 1 | Validates no overlap behavior |
| Example 2 | Validates overwriting from arr2 |
| Example 3 | Confirms shallow merge behavior |
Empty arr2 |
Ensures one empty array is handled |
Empty arr1 |
Ensures symmetric empty handling |
| Complete overlap | Tests merging of multiple keys |
| Sorting requirement | Verifies ascending id ordering |
Large id values |
Ensures large integers are handled correctly |
Edge Cases
Completely Disjoint id Values
One important edge case occurs when the two arrays have no overlapping id values at all. A naive implementation might incorrectly attempt merging or fail to include objects from both arrays. Our implementation handles this naturally because new id values are simply inserted into the hash map without modification.
Full Overlap Between Arrays
Another important case is when every object in arr1 has a matching id in arr2. This can expose bugs where overwriting does not happen correctly or where original values are accidentally preserved. Since we explicitly use .update() in Python and key reassignment in Go, every overlapping field is correctly overridden by arr2.
Nested Objects and Arrays
The problem includes nested JSON structures, which could mislead developers into implementing recursive merging. However, the requirement is a shallow merge only. For example, if both objects contain a b key with nested objects, the entire value from arr2 replaces the one from arr1. Using dictionary updates naturally enforces this behavior.
Sorting Requirement
Hash maps do not maintain sorted ordering. A common mistake is to return objects directly from the hash map, producing unpredictable order. Our implementation explicitly sorts by id, guaranteeing compliance with the problem specification.