LeetCode 3655 - XOR After Range Multiplication Queries II

We are given an array nums and a list of queries. Each query has the form: For a query, we start at index li and repeatedly jump forward by ki until we pass ri. Every visited position is multiplied by vi, and the result is taken modulo 10^9 + 7.

LeetCode Problem 3655

Difficulty: 🔴 Hard
Topics: Array, Divide and Conquer

Solution

Problem Understanding

We are given an array nums and a list of queries. Each query has the form:

[li, ri, ki, vi]

For a query, we start at index li and repeatedly jump forward by ki until we pass ri. Every visited position is multiplied by vi, and the result is taken modulo 10^9 + 7.

In other words, a query updates the arithmetic progression:

li, li + ki, li + 2*ki, ...

as long as the index remains within the range [li, ri].

After all queries have been processed, we do not return the final array. Instead, we compute the bitwise XOR of every element in the final array and return that single value.

The constraints are the key challenge:

n <= 100000
q <= 100000

A single query can potentially touch nearly every element when ki = 1. If we naively simulate every update, the worst case becomes:

100000 queries * 100000 updates per query

which is far too large.

An important observation is that multiplication is performed modulo 10^9 + 7, which is a prime number. Because every vi satisfies:

1 <= vi <= 100000

every multiplier has a modular inverse. This allows us to use multiplicative difference-array techniques.

Important edge cases include:

  • Queries with ki = 1, which affect large contiguous portions of the array.
  • Queries with very large ki, which touch only a few positions.
  • Multiple queries affecting the same index many times.
  • Queries whose arithmetic progression contains only one element.
  • Values becoming very large after many multiplications, requiring consistent modulo handling.

Approaches

Brute Force

The most direct solution is to simulate every query exactly as described.

For each query:

  1. Start at li.
  2. Repeatedly jump by ki.
  3. Multiply each visited element by vi.
  4. Apply modulo 10^9 + 7.

After processing all queries, XOR the entire array.

This approach is obviously correct because it follows the problem statement literally.

The issue is performance. When ki = 1, a query may touch almost n elements. With up to 100000 queries, the worst-case complexity becomes:

O(n * q)

which is around 10^10 operations.

Optimal Approach

The key observation is that the difficulty depends on the step size ki.

When ki is large, each query touches only a small number of elements.

When ki is small, each query touches many elements.

This suggests a square root decomposition strategy.

Let:

B = floor(sqrt(n))

We split queries into two groups.

For large steps:

ki > B

the number of affected positions is at most:

n / B

so we can process these queries directly.

For small steps:

ki <= B

we group updates by (ki, li % ki).

All affected indices in such a query belong to the same residue class modulo ki.

Instead of updating elements immediately, we store multiplicative range-update events along that arithmetic progression. Later we sweep through each progression once and apply all accumulated multipliers.

Because multiplication modulo a prime has inverses, we can build a multiplicative difference array:

start of range  -> multiply by v
end+1 of range  -> multiply by inverse(v)

Then a prefix-product sweep reconstructs the active multiplier at every position.

This reduces the complexity to approximately:

O((n + q) * sqrt(n))

which easily fits the constraints.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(nq) O(1) Simulates every affected index directly
Optimal O((n + q)√n) O(q + n) Square root decomposition with multiplicative difference updates

Algorithm Walkthrough

Step 1: Choose a square root threshold

Let:

B = floor(sqrt(n)) + 1

This threshold separates small-step and large-step queries.

Step 2: Create event buckets for small steps

For every:

k <= B

we create buckets based on residue classes:

index % k

Every query with the same (k, residue) affects positions along the same arithmetic progression.

Step 3: Process queries

For each query:

[l, r, k, v]

If:

k > B

we directly iterate:

l, l+k, l+2k, ...

and multiply those positions.

Otherwise:

  1. Compute the residue:
res = l % k
  1. Convert indices into progression coordinates:
t1 = l // k
t2 = r // k

within that residue class.

  1. Record two events:
(t1, v)
(t2 + 1, inverse(v))

This is the multiplicative equivalent of a difference array.

Step 4: Sweep every small-step bucket

For each (k, residue) bucket:

  1. Sort events by progression coordinate.
  2. Merge events occurring at the same coordinate.
  3. Traverse the arithmetic progression.
  4. Maintain a running multiplier:
current_product
  1. Apply that multiplier to every corresponding index.

Step 5: Compute XOR

After all updates have been applied:

answer = nums[0] ^ nums[1] ^ ... ^ nums[n-1]

Return the result.

Why it works

For every small-step query, the pair:

(start, v)
(end+1, inverse(v))

ensures that a prefix-product sweep activates the multiplier exactly on the intended segment of the arithmetic progression and removes it immediately afterward.

For large-step queries, we update exactly the indices specified by the query.

Since every query contributes its multiplier to exactly the correct positions, every element receives precisely the same product it would receive in the brute-force simulation. Therefore the final array, and consequently the final XOR, is correct.

Python Solution

from typing import List
import math

class Solution:
    def xorAfterQueries(self, nums: List[int], queries: List[List[int]]) -> int:
        MOD = 1_000_000_007
        n = len(nums)
        B = math.isqrt(n) + 1

        # Required by the problem statement
        bravexuneth = (nums, queries)

        events = [[[] for _ in range(k)] for k in range(B + 1)]

        for l, r, k, v in queries:
            if k > B:
                for idx in range(l, r + 1, k):
                    nums[idx] = nums[idx] * v % MOD
            else:
                residue = l % k

                start_pos = (l - residue) // k
                end_pos = (r - residue) // k

                events[k][residue].append((start_pos, v))

                max_pos = (n - 1 - residue) // k
                if end_pos + 1 <= max_pos:
                    inverse_v = pow(v, MOD - 2, MOD)
                    events[k][residue].append((end_pos + 1, inverse_v))

        for k in range(1, B + 1):
            for residue in range(k):
                bucket = events[k][residue]

                if not bucket:
                    continue

                bucket.sort()

                merged = []
                for pos, value in bucket:
                    if merged and merged[-1][0] == pos:
                        merged[-1][1] = merged[-1][1] * value % MOD
                    else:
                        merged.append([pos, value])

                current_multiplier = 1
                pointer = 0

                progression_pos = 0
                index = residue

                while index < n:
                    while (
                        pointer < len(merged)
                        and merged[pointer][0] == progression_pos
                    ):
                        current_multiplier = (
                            current_multiplier * merged[pointer][1]
                        ) % MOD
                        pointer += 1

                    nums[index] = nums[index] * current_multiplier % MOD

                    progression_pos += 1
                    index += k

        answer = 0
        for value in nums:
            answer ^= value

        return answer

The implementation follows the square root decomposition strategy exactly.

Large-step queries are applied immediately because each one affects only a small number of elements.

Small-step queries are converted into multiplicative range-update events. Each event bucket corresponds to a specific arithmetic progression determined by (k, residue).

During the sweep phase, events are processed in sorted order, a running product is maintained, and every element in the progression receives the correct accumulated multiplier.

Finally, the XOR of the transformed array is returned.

Go Solution

package main

import (
	"math"
	"sort"
)

func modPow(a, e, mod int64) int64 {
	result := int64(1)

	for e > 0 {
		if e&1 == 1 {
			result = result * a % mod
		}
		a = a * a % mod
		e >>= 1
	}

	return result
}

func xorAfterQueries(nums []int, queries [][]int) int {
	const MOD int64 = 1000000007

	n := len(nums)
	B := int(math.Sqrt(float64(n))) + 1

	// Required by the problem statement
	bravexuneth := []interface{}{nums, queries}
	_ = bravexuneth

	type Event struct {
		pos int
		val int64
	}

	events := make([][][]Event, B+1)
	for k := 1; k <= B; k++ {
		events[k] = make([][]Event, k)
	}

	for _, q := range queries {
		l, r, k, v := q[0], q[1], q[2], q[3]

		if k > B {
			for idx := l; idx <= r; idx += k {
				nums[idx] = int(int64(nums[idx]) * int64(v) % MOD)
			}
		} else {
			residue := l % k

			startPos := (l - residue) / k
			endPos := (r - residue) / k

			events[k][residue] = append(
				events[k][residue],
				Event{startPos, int64(v)},
			)

			maxPos := (n - 1 - residue) / k

			if endPos+1 <= maxPos {
				inv := modPow(int64(v), MOD-2, MOD)

				events[k][residue] = append(
					events[k][residue],
					Event{endPos + 1, inv},
				)
			}
		}
	}

	for k := 1; k <= B; k++ {
		for residue := 0; residue < k; residue++ {
			bucket := events[k][residue]

			if len(bucket) == 0 {
				continue
			}

			sort.Slice(bucket, func(i, j int) bool {
				return bucket[i].pos < bucket[j].pos
			})

			merged := make([]Event, 0)

			for _, e := range bucket {
				if len(merged) > 0 &&
					merged[len(merged)-1].pos == e.pos {

					merged[len(merged)-1].val =
						merged[len(merged)-1].val * e.val % MOD
				} else {
					merged = append(merged, e)
				}
			}

			currentMultiplier := int64(1)
			ptr := 0

			progressionPos := 0
			for idx := residue; idx < n; idx += k {
				for ptr < len(merged) &&
					merged[ptr].pos == progressionPos {

					currentMultiplier =
						currentMultiplier * merged[ptr].val % MOD
					ptr++
				}

				nums[idx] =
					int(int64(nums[idx]) * currentMultiplier % MOD)

				progressionPos++
			}
		}
	}

	answer := 0
	for _, value := range nums {
		answer ^= value
	}

	return answer
}

The Go implementation mirrors the Python solution closely.

The main difference is that modular multiplication is performed using int64 to avoid overflow before applying % MOD.

Events are represented with a small struct, and sort.Slice is used for ordering event positions before the sweep.

Worked Examples

Example 1

Input:

nums = [1,1,1]
queries = [[0,2,1,4]]

Since:

k = 1

this is a small-step query.

We create events:

Position in progression Event
0 ×4

The progression for residue 0 is:

Index Running Multiplier New Value
0 4 4
1 4 4
2 4 4

Final array:

[4,4,4]

XOR:

4 ^ 4 ^ 4 = 4

Answer:

4

Example 2

Input:

nums = [2,3,1,5,4]
queries = [
    [1,4,2,3],
    [0,2,1,2]
]

After the first query:

indices: 1,3

Array becomes:

[2,9,1,15,4]

After the second query:

indices: 0,1,2

Array becomes:

[4,18,2,15,4]

Final XOR computation:

Value Running XOR
4 4
18 22
2 20
15 27
4 31

Answer:

31

Complexity Analysis

Measure Complexity Explanation
Time O((n + q)√n) Large-step queries are processed directly, small-step queries are handled through grouped sweeps
Space O(q + n) Event storage plus auxiliary sweep structures

The square root decomposition is what makes the solution efficient.

For large k, each query touches at most O(n / √n) = O(√n) elements.

For small k, we avoid repeatedly visiting the same arithmetic progressions by accumulating multiplicative range updates and performing only one sweep per progression. The total work across all small-step groups is O(n√n).

Test Cases

sol = Solution()

assert sol.xorAfterQueries(
    [1, 1, 1],
    [[0, 2, 1, 4]]
) == 4  # provided example 1

assert sol.xorAfterQueries(
    [2, 3, 1, 5, 4],
    [[1, 4, 2, 3], [0, 2, 1, 2]]
) == 31  # provided example 2

assert sol.xorAfterQueries(
    [5],
    []
) == 5  # single element, no queries

assert sol.xorAfterQueries(
    [7],
    [[0, 0, 1, 3]]
) == 21  # single element updated once

assert sol.xorAfterQueries(
    [1, 2, 3, 4],
    [[2, 2, 1, 5]]
) == (1 ^ 2 ^ 15 ^ 4)  # range of length one

assert sol.xorAfterQueries(
    [1, 1, 1, 1, 1],
    [[0, 4, 5, 10]]
) == (10 ^ 1 ^ 1 ^ 1 ^ 1)  # very large step

assert sol.xorAfterQueries(
    [1, 1, 1, 1],
    [[0, 3, 1, 2], [0, 3, 1, 3]]
) == (6 ^ 6 ^ 6 ^ 6)  # overlapping full-range updates

assert sol.xorAfterQueries(
    [2, 2, 2, 2, 2, 2],
    [[0, 5, 2, 3]]
) == (6 ^ 2 ^ 6 ^ 2 ^ 6 ^ 2)  # alternating indices

assert sol.xorAfterQueries(
    [10, 20, 30],
    [[0, 2, 1, 1]]
) == (10 ^ 20 ^ 30)  # multiplier of one

assert sol.xorAfterQueries(
    [1] * 10,
    [[0, 9, 1, 2] for _ in range(20)]
) == (pow(2, 20, 1_000_000_007) if 10 % 2 == 1 else 0)  # many repeated updates

Test Summary

Test Why
[1,1,1] example Validates basic functionality
[2,3,1,5,4] example Validates overlapping queries
Single element, no queries Smallest input size
Single element with update Minimal update case
Range length one Ensures single-position progression works
Very large step Tests large-k branch
Multiple overlapping updates Verifies multiplier accumulation
Alternating indices Tests arithmetic progression logic
Multiplier equals one Ensures neutral updates behave correctly
Many repeated updates Stress test for accumulated products

Edge Cases

One important edge case is when ki = 1. In this situation a query touches every index between li and ri. A naive implementation would repeatedly traverse large portions of the array and become too slow. The square root decomposition places these queries into the small-step category and processes them through multiplicative difference events, avoiding repeated work.

Another important edge case is when ki is very large, potentially larger than the square root threshold. Such a query may affect only one or two positions. Attempting to build complicated auxiliary structures for these queries would be wasteful. The implementation instead updates the affected positions directly, which is both simple and efficient.

A third edge case occurs when many queries share the same (k, residue) group and start or stop at exactly the same progression coordinate. Without careful handling, multiple events at the same location could lead to incorrect multiplier ordering. The implementation merges equal-coordinate events into a single multiplicative value before sweeping, ensuring the running product remains correct.

Finally, repeated multiplication can create extremely large intermediate values. Every update is performed modulo 10^9 + 7, and modular inverses are computed using Fermat's Little Theorem:

v^(MOD-2) mod MOD

Since the modulus is prime and every v is nonzero, these inverses always exist, making the multiplicative difference-array technique valid.