LeetCode 1649 - Create Sorted Array through Instructions

The problem asks us to simulate building a sorted array incrementally. We process the instructions array from left to ri

LeetCode Problem 1649

Difficulty: 🔴 Hard
Topics: Array, Binary Search, Divide and Conquer, Binary Indexed Tree, Segment Tree, Merge Sort, Ordered Set

Solution

Problem Understanding

The problem asks us to simulate building a sorted array incrementally. We process the instructions array from left to right, and for each value, we insert it into a container called nums while maintaining sorted order.

However, the actual insertion is not the difficult part. The important part is computing the insertion cost.

For every value instructions[i], the insertion cost is defined as:

  • The number of existing elements in nums that are strictly smaller than instructions[i]
  • The number of existing elements in nums that are strictly greater than instructions[i]

We take the minimum of those two values and add it to the total answer.

The container starts empty, so the first insertion always has cost 0.

The final answer is the sum of all insertion costs, modulo 10^9 + 7.

The constraints are very important:

  • instructions.length <= 10^5
  • instructions[i] <= 10^5

A quadratic solution will not work because 10^5 × 10^5 = 10^10 operations is far too large. We need something close to O(n log n).

The key challenge is that for every insertion, we need to efficiently answer two queries:

  1. How many previous numbers are smaller than the current value?
  2. How many previous numbers are greater than the current value?

This is a classic frequency counting and prefix sum problem.

Several edge cases are important:

  • Repeated values, because the problem uses strictly less and strictly greater comparisons.
  • Already sorted arrays, where every insertion cost becomes 0.
  • Reverse sorted arrays, where one side of the minimum is always 0.
  • Large duplicate-heavy inputs, where efficient counting matters.
  • Very large answers, which require modulo arithmetic.

Approaches

Brute Force Approach

The simplest solution is to explicitly maintain the sorted container nums.

For every instruction:

  1. Find the insertion position.
  2. Count how many elements are smaller.
  3. Count how many elements are greater.
  4. Insert the value into the sorted list.

We could use binary search to find the insertion point, but insertion into a Python list or array still requires shifting elements, which costs O(n) time.

Since we perform this operation for every element, the total complexity becomes O(n^2) in the worst case.

This approach is correct because it directly simulates the problem statement, but it is too slow for 10^5 elements.

Optimal Approach

The important observation is that we do not actually need to store the fully sorted array.

We only need frequency information:

  • How many previous numbers are smaller than x
  • How many previous numbers are greater than x

This naturally leads to a data structure that supports:

  • Point updates
  • Prefix sum queries

A Binary Indexed Tree, also called a Fenwick Tree, is ideal for this.

Since instructions[i] <= 10^5, we can use the values directly as indices in the Fenwick Tree.

For each value x:

  • less = query(x - 1)
  • greater = inserted_so_far - query(x)

Then:

  • cost += min(less, greater)

Finally, we insert the value into the Fenwick Tree.

Each operation costs O(log M) where M = 10^5.

This gives a total complexity of O(n log M).

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Maintains explicit sorted array and inserts each element
Optimal O(n log M) O(M) Uses Fenwick Tree for prefix frequency counting

Here, M is the maximum possible value in instructions.

Algorithm Walkthrough

Step 1: Initialize the Fenwick Tree

We create a Fenwick Tree large enough to store frequencies for all possible values.

The tree supports:

  • Updating the frequency of a value
  • Querying how many inserted numbers are less than or equal to a value

Step 2: Process Instructions Left to Right

We iterate through each value x in instructions.

At this moment, the Fenwick Tree contains frequencies of all previously inserted values.

Step 3: Count Smaller Elements

To count numbers strictly smaller than x, we query the prefix sum up to x - 1.

less = query(x - 1)

This tells us how many inserted numbers are smaller than x.

Step 4: Count Greater Elements

To count numbers strictly greater than x, we first compute how many numbers are less than or equal to x.

not_greater = query(x)

If we have already inserted i elements, then:

greater = i - not_greater

Step 5: Add the Minimum Cost

The insertion cost is:

min(less, greater)

We add this value to the running total.

Step 6: Insert the Current Value

We update the Fenwick Tree frequency for x.

update(x, +1)

This ensures future queries include the current number.

Step 7: Return the Result

Since the answer may become large, we return:

answer % (10^9 + 7)

Why it works

The Fenwick Tree always stores the exact frequency distribution of all previously inserted values.

Because prefix sums efficiently count how many numbers are less than or equal to a value, we can derive both:

  • Strictly smaller counts
  • Strictly greater counts

for every insertion.

Since every step computes the exact insertion cost defined by the problem, the accumulated result is correct.

Python Solution

from typing import List

class FenwickTree:
    def __init__(self, size: int):
        self.size = size
        self.tree = [0] * (size + 1)

    def update(self, index: int, delta: int) -> None:
        while index <= self.size:
            self.tree[index] += delta
            index += index & -index

    def query(self, index: int) -> int:
        total = 0

        while index > 0:
            total += self.tree[index]
            index -= index & -index

        return total

class Solution:
    def createSortedArray(self, instructions: List[int]) -> int:
        MOD = 10**9 + 7

        max_value = max(instructions)
        fenwick = FenwickTree(max_value)

        total_cost = 0

        for i, value in enumerate(instructions):
            less = fenwick.query(value - 1)
            less_or_equal = fenwick.query(value)

            greater = i - less_or_equal

            total_cost += min(less, greater)
            total_cost %= MOD

            fenwick.update(value, 1)

        return total_cost

The implementation begins with a Fenwick Tree class. The update method increases the frequency of a value, while the query method returns the prefix sum from 1 through index.

Inside createSortedArray, we first determine the maximum value so we know how large the Fenwick Tree must be.

As we iterate through the instructions:

  • less counts numbers strictly smaller than the current value.
  • less_or_equal counts numbers less than or equal to the current value.
  • greater is derived from how many values have already been inserted.

The insertion cost is added to the running answer, and then the current value is inserted into the Fenwick Tree.

Modulo arithmetic is applied after every addition to avoid overflow.

Go Solution

package main

func createSortedArray(instructions []int) int {
	const MOD int = 1_000_000_007

	maxValue := 0
	for _, value := range instructions {
		if value > maxValue {
			maxValue = value
		}
	}

	fenwick := make([]int, maxValue+1)

	update := func(index int, delta int) {
		for index <= maxValue {
			fenwick[index] += delta
			index += index & -index
		}
	}

	query := func(index int) int {
		sum := 0

		for index > 0 {
			sum += fenwick[index]
			index -= index & -index
		}

		return sum
	}

	totalCost := 0

	for i, value := range instructions {
		less := query(value - 1)
		lessOrEqual := query(value)

		greater := i - lessOrEqual

		if less < greater {
			totalCost += less
		} else {
			totalCost += greater
		}

		totalCost %= MOD

		update(value, 1)
	}

	return totalCost
}

The Go implementation follows the exact same logic as the Python solution.

Instead of defining a separate Fenwick Tree class, the tree is represented as a slice with local helper functions for update and query.

Go uses explicit integer arithmetic, so modulo operations are applied carefully after each addition. Since the constraints fit within 32 bit integers during intermediate operations, standard int is sufficient.

Worked Examples

Example 1

Input:

instructions = [1,5,6,2]
Step Value Less Greater Cost Fenwick Frequencies
1 1 0 0 0 {1:1}
2 5 1 0 0 {1:1,5:1}
3 6 2 0 0 {1:1,5:1,6:1}
4 2 1 2 1 {1:1,2:1,5:1,6:1}

Total cost:

0 + 0 + 0 + 1 = 1

Example 2

Input:

instructions = [1,2,3,6,5,4]
Step Value Less Greater Cost
1 1 0 0 0
2 2 1 0 0
3 3 2 0 0
4 6 3 0 0
5 5 3 1 1
6 4 3 2 2

Total cost:

3

Example 3

Input:

instructions = [1,3,3,3,2,4,2,1,2]
Step Value Less Greater Cost
1 1 0 0 0
2 3 1 0 0
3 3 1 0 0
4 3 1 0 0
5 2 1 3 1
6 4 5 0 0
7 2 1 4 1
8 1 0 6 0
9 2 2 4 2

Total cost:

4

Complexity Analysis

Measure Complexity Explanation
Time O(n log M) Each update and query operation on the Fenwick Tree costs O(log M)
Space O(M) The Fenwick Tree stores frequencies up to the maximum value

Here:

  • n is the number of instructions
  • M is the maximum value in instructions

For each element, we perform:

  • Two prefix sum queries
  • One update

Each operation takes logarithmic time in the value range.

Test Cases

sol = Solution()

assert sol.createSortedArray([1,5,6,2]) == 1
# basic example from statement

assert sol.createSortedArray([1,2,3,6,5,4]) == 3
# increasing then decreasing

assert sol.createSortedArray([1,3,3,3,2,4,2,1,2]) == 4
# duplicates with mixed ordering

assert sol.createSortedArray([1]) == 0
# single element

assert sol.createSortedArray([1,2,3,4,5]) == 0
# already sorted ascending

assert sol.createSortedArray([5,4,3,2,1]) == 0
# descending order, minimum always zero

assert sol.createSortedArray([2,2,2,2]) == 0
# all equal elements

assert sol.createSortedArray([3,1,2]) == 1
# small mixed example

assert sol.createSortedArray([5,1,5,1,5,1]) == 0
# alternating extremes

assert sol.createSortedArray([2,1,3,1,2,3,2]) == 3
# repeated insertions with varying costs

Test Summary

Test Why
[1,5,6,2] Validates the first official example
[1,2,3,6,5,4] Tests increasing then decreasing pattern
[1,3,3,3,2,4,2,1,2] Verifies handling of duplicates
[1] Smallest possible input
[1,2,3,4,5] Already sorted ascending case
[5,4,3,2,1] Reverse sorted case
[2,2,2,2] All duplicates
[3,1,2] Small mixed ordering
[5,1,5,1,5,1] Alternating large and small values
[2,1,3,1,2,3,2] Complex repeated counting behavior

Edge Cases

All Elements Equal

An input like:

[2,2,2,2]

is easy to mishandle because the problem requires strictly less and strictly greater comparisons.

Equal values should not contribute to either count.

The implementation correctly handles this because:

  • less = query(value - 1)
  • greater = inserted - query(value)

Equal values are excluded from both calculations.

Strictly Increasing Input

For an input like:

[1,2,3,4,5]

every new value has:

  • many smaller elements
  • zero greater elements

Since the insertion cost uses the minimum, every insertion contributes 0.

The algorithm naturally handles this because greater becomes zero at every step.

Strictly Decreasing Input

For an input like:

[5,4,3,2,1]

every insertion has:

  • zero smaller elements
  • many greater elements

Again, the minimum is always 0.

This case is important because naive implementations sometimes incorrectly include equal or current elements in the counts.

The Fenwick Tree only stores previously inserted values, so the current element is never counted against itself.

Large Inputs With Maximum Values

The constraints allow:

instructions.length = 100000
instructions[i] = 100000

A naive insertion-based approach would become too slow.

The Fenwick Tree guarantees efficient logarithmic operations even at the maximum constraint size, making the solution scalable enough for all valid inputs.