LeetCode 1649 - Create Sorted Array through Instructions
The problem asks us to simulate building a sorted array incrementally. We process the instructions array from left to ri
Difficulty: 🔴 Hard
Topics: Array, Binary Search, Divide and Conquer, Binary Indexed Tree, Segment Tree, Merge Sort, Ordered Set
Solution
Problem Understanding
The problem asks us to simulate building a sorted array incrementally. We process the instructions array from left to right, and for each value, we insert it into a container called nums while maintaining sorted order.
However, the actual insertion is not the difficult part. The important part is computing the insertion cost.
For every value instructions[i], the insertion cost is defined as:
- The number of existing elements in
numsthat are strictly smaller thaninstructions[i] - The number of existing elements in
numsthat are strictly greater thaninstructions[i]
We take the minimum of those two values and add it to the total answer.
The container starts empty, so the first insertion always has cost 0.
The final answer is the sum of all insertion costs, modulo 10^9 + 7.
The constraints are very important:
instructions.length <= 10^5instructions[i] <= 10^5
A quadratic solution will not work because 10^5 × 10^5 = 10^10 operations is far too large. We need something close to O(n log n).
The key challenge is that for every insertion, we need to efficiently answer two queries:
- How many previous numbers are smaller than the current value?
- How many previous numbers are greater than the current value?
This is a classic frequency counting and prefix sum problem.
Several edge cases are important:
- Repeated values, because the problem uses strictly less and strictly greater comparisons.
- Already sorted arrays, where every insertion cost becomes
0. - Reverse sorted arrays, where one side of the minimum is always
0. - Large duplicate-heavy inputs, where efficient counting matters.
- Very large answers, which require modulo arithmetic.
Approaches
Brute Force Approach
The simplest solution is to explicitly maintain the sorted container nums.
For every instruction:
- Find the insertion position.
- Count how many elements are smaller.
- Count how many elements are greater.
- Insert the value into the sorted list.
We could use binary search to find the insertion point, but insertion into a Python list or array still requires shifting elements, which costs O(n) time.
Since we perform this operation for every element, the total complexity becomes O(n^2) in the worst case.
This approach is correct because it directly simulates the problem statement, but it is too slow for 10^5 elements.
Optimal Approach
The important observation is that we do not actually need to store the fully sorted array.
We only need frequency information:
- How many previous numbers are smaller than
x - How many previous numbers are greater than
x
This naturally leads to a data structure that supports:
- Point updates
- Prefix sum queries
A Binary Indexed Tree, also called a Fenwick Tree, is ideal for this.
Since instructions[i] <= 10^5, we can use the values directly as indices in the Fenwick Tree.
For each value x:
less = query(x - 1)greater = inserted_so_far - query(x)
Then:
cost += min(less, greater)
Finally, we insert the value into the Fenwick Tree.
Each operation costs O(log M) where M = 10^5.
This gives a total complexity of O(n log M).
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Maintains explicit sorted array and inserts each element |
| Optimal | O(n log M) | O(M) | Uses Fenwick Tree for prefix frequency counting |
Here, M is the maximum possible value in instructions.
Algorithm Walkthrough
Step 1: Initialize the Fenwick Tree
We create a Fenwick Tree large enough to store frequencies for all possible values.
The tree supports:
- Updating the frequency of a value
- Querying how many inserted numbers are less than or equal to a value
Step 2: Process Instructions Left to Right
We iterate through each value x in instructions.
At this moment, the Fenwick Tree contains frequencies of all previously inserted values.
Step 3: Count Smaller Elements
To count numbers strictly smaller than x, we query the prefix sum up to x - 1.
less = query(x - 1)
This tells us how many inserted numbers are smaller than x.
Step 4: Count Greater Elements
To count numbers strictly greater than x, we first compute how many numbers are less than or equal to x.
not_greater = query(x)
If we have already inserted i elements, then:
greater = i - not_greater
Step 5: Add the Minimum Cost
The insertion cost is:
min(less, greater)
We add this value to the running total.
Step 6: Insert the Current Value
We update the Fenwick Tree frequency for x.
update(x, +1)
This ensures future queries include the current number.
Step 7: Return the Result
Since the answer may become large, we return:
answer % (10^9 + 7)
Why it works
The Fenwick Tree always stores the exact frequency distribution of all previously inserted values.
Because prefix sums efficiently count how many numbers are less than or equal to a value, we can derive both:
- Strictly smaller counts
- Strictly greater counts
for every insertion.
Since every step computes the exact insertion cost defined by the problem, the accumulated result is correct.
Python Solution
from typing import List
class FenwickTree:
def __init__(self, size: int):
self.size = size
self.tree = [0] * (size + 1)
def update(self, index: int, delta: int) -> None:
while index <= self.size:
self.tree[index] += delta
index += index & -index
def query(self, index: int) -> int:
total = 0
while index > 0:
total += self.tree[index]
index -= index & -index
return total
class Solution:
def createSortedArray(self, instructions: List[int]) -> int:
MOD = 10**9 + 7
max_value = max(instructions)
fenwick = FenwickTree(max_value)
total_cost = 0
for i, value in enumerate(instructions):
less = fenwick.query(value - 1)
less_or_equal = fenwick.query(value)
greater = i - less_or_equal
total_cost += min(less, greater)
total_cost %= MOD
fenwick.update(value, 1)
return total_cost
The implementation begins with a Fenwick Tree class. The update method increases the frequency of a value, while the query method returns the prefix sum from 1 through index.
Inside createSortedArray, we first determine the maximum value so we know how large the Fenwick Tree must be.
As we iterate through the instructions:
lesscounts numbers strictly smaller than the current value.less_or_equalcounts numbers less than or equal to the current value.greateris derived from how many values have already been inserted.
The insertion cost is added to the running answer, and then the current value is inserted into the Fenwick Tree.
Modulo arithmetic is applied after every addition to avoid overflow.
Go Solution
package main
func createSortedArray(instructions []int) int {
const MOD int = 1_000_000_007
maxValue := 0
for _, value := range instructions {
if value > maxValue {
maxValue = value
}
}
fenwick := make([]int, maxValue+1)
update := func(index int, delta int) {
for index <= maxValue {
fenwick[index] += delta
index += index & -index
}
}
query := func(index int) int {
sum := 0
for index > 0 {
sum += fenwick[index]
index -= index & -index
}
return sum
}
totalCost := 0
for i, value := range instructions {
less := query(value - 1)
lessOrEqual := query(value)
greater := i - lessOrEqual
if less < greater {
totalCost += less
} else {
totalCost += greater
}
totalCost %= MOD
update(value, 1)
}
return totalCost
}
The Go implementation follows the exact same logic as the Python solution.
Instead of defining a separate Fenwick Tree class, the tree is represented as a slice with local helper functions for update and query.
Go uses explicit integer arithmetic, so modulo operations are applied carefully after each addition. Since the constraints fit within 32 bit integers during intermediate operations, standard int is sufficient.
Worked Examples
Example 1
Input:
instructions = [1,5,6,2]
| Step | Value | Less | Greater | Cost | Fenwick Frequencies |
|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | {1:1} |
| 2 | 5 | 1 | 0 | 0 | {1:1,5:1} |
| 3 | 6 | 2 | 0 | 0 | {1:1,5:1,6:1} |
| 4 | 2 | 1 | 2 | 1 | {1:1,2:1,5:1,6:1} |
Total cost:
0 + 0 + 0 + 1 = 1
Example 2
Input:
instructions = [1,2,3,6,5,4]
| Step | Value | Less | Greater | Cost |
|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 |
| 2 | 2 | 1 | 0 | 0 |
| 3 | 3 | 2 | 0 | 0 |
| 4 | 6 | 3 | 0 | 0 |
| 5 | 5 | 3 | 1 | 1 |
| 6 | 4 | 3 | 2 | 2 |
Total cost:
3
Example 3
Input:
instructions = [1,3,3,3,2,4,2,1,2]
| Step | Value | Less | Greater | Cost |
|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 |
| 2 | 3 | 1 | 0 | 0 |
| 3 | 3 | 1 | 0 | 0 |
| 4 | 3 | 1 | 0 | 0 |
| 5 | 2 | 1 | 3 | 1 |
| 6 | 4 | 5 | 0 | 0 |
| 7 | 2 | 1 | 4 | 1 |
| 8 | 1 | 0 | 6 | 0 |
| 9 | 2 | 2 | 4 | 2 |
Total cost:
4
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log M) | Each update and query operation on the Fenwick Tree costs O(log M) |
| Space | O(M) | The Fenwick Tree stores frequencies up to the maximum value |
Here:
nis the number of instructionsMis the maximum value ininstructions
For each element, we perform:
- Two prefix sum queries
- One update
Each operation takes logarithmic time in the value range.
Test Cases
sol = Solution()
assert sol.createSortedArray([1,5,6,2]) == 1
# basic example from statement
assert sol.createSortedArray([1,2,3,6,5,4]) == 3
# increasing then decreasing
assert sol.createSortedArray([1,3,3,3,2,4,2,1,2]) == 4
# duplicates with mixed ordering
assert sol.createSortedArray([1]) == 0
# single element
assert sol.createSortedArray([1,2,3,4,5]) == 0
# already sorted ascending
assert sol.createSortedArray([5,4,3,2,1]) == 0
# descending order, minimum always zero
assert sol.createSortedArray([2,2,2,2]) == 0
# all equal elements
assert sol.createSortedArray([3,1,2]) == 1
# small mixed example
assert sol.createSortedArray([5,1,5,1,5,1]) == 0
# alternating extremes
assert sol.createSortedArray([2,1,3,1,2,3,2]) == 3
# repeated insertions with varying costs
Test Summary
| Test | Why |
|---|---|
[1,5,6,2] |
Validates the first official example |
[1,2,3,6,5,4] |
Tests increasing then decreasing pattern |
[1,3,3,3,2,4,2,1,2] |
Verifies handling of duplicates |
[1] |
Smallest possible input |
[1,2,3,4,5] |
Already sorted ascending case |
[5,4,3,2,1] |
Reverse sorted case |
[2,2,2,2] |
All duplicates |
[3,1,2] |
Small mixed ordering |
[5,1,5,1,5,1] |
Alternating large and small values |
[2,1,3,1,2,3,2] |
Complex repeated counting behavior |
Edge Cases
All Elements Equal
An input like:
[2,2,2,2]
is easy to mishandle because the problem requires strictly less and strictly greater comparisons.
Equal values should not contribute to either count.
The implementation correctly handles this because:
less = query(value - 1)greater = inserted - query(value)
Equal values are excluded from both calculations.
Strictly Increasing Input
For an input like:
[1,2,3,4,5]
every new value has:
- many smaller elements
- zero greater elements
Since the insertion cost uses the minimum, every insertion contributes 0.
The algorithm naturally handles this because greater becomes zero at every step.
Strictly Decreasing Input
For an input like:
[5,4,3,2,1]
every insertion has:
- zero smaller elements
- many greater elements
Again, the minimum is always 0.
This case is important because naive implementations sometimes incorrectly include equal or current elements in the counts.
The Fenwick Tree only stores previously inserted values, so the current element is never counted against itself.
Large Inputs With Maximum Values
The constraints allow:
instructions.length = 100000
instructions[i] = 100000
A naive insertion-based approach would become too slow.
The Fenwick Tree guarantees efficient logarithmic operations even at the maximum constraint size, making the solution scalable enough for all valid inputs.