Rupasai Rangaraju
Rupasai Rangaraju

Reputation: 1

Why we need two fenwick tree to get range sum of [l, r] given there are array of numbers and range updates and queries are performed on the array?

Given array of numbers of size n. We will perform 2 operations such as:

Update: update range of numbers (l, r)
Query: sum of numbers (l, r).
To compute range update, range queries can we use only one Fenwick tree:
Update: [l,r,v] : update(l,v) and update(r+1,-v)
Query: [l,r] : query(r)-query(l-1)
But the usual implementation uses 2 fenwick trees.
Can some one point out What am I missing here?

Upvotes: 0

Views: 291

Answers (1)

Jakube
Jakube

Reputation: 3565

Definition of a Fenwick Tree

The update function of a Fenwick Tree increases one value in the corresponding array. E.g. update(i, v) increases the value of the ith element by v.

And with the query function you get the sum of a prefix, e.g. query(r) is the sum of the first r values of the array. That allows also range queries via query(r) - query(l-1).

What your updates/queries compute

But exactly that property also goes the other way, by the definition of a Fenwick tree, if query(r) should give the prefix sum of the first r elements, then every element must be set exactly by their position. I.e. you increase the value of the ith element with query(i, v).

So all that your "wrong range" update does, is increase the value of the lth value and decreases the value of the (r+1)th value.

E.g. look at an example, an empty array of length 5 with the update [1, 4, 10] and the query [2, 3].

At the beginning we have [0, 0, 0, 0, 0]. After both updates we have [10, 0, 0, 0, -10]. And the query gives query(3) - query(2-1) = 10 - 10 = 0, even though it should be 20.

If you want that query(r) - query(l), because what you wanted would have been the array [10, 10, 10, 10, 0] after the update.

Trick 1: Range queries and point updates

What you are maybe mixing up is the following trick:

Given the original array a, instead of storing the value a[i] using update(i, a[i]), which gets us the prefix query(i) = a[1] + a[2] + ... + a[i], you could do the opposite.

You define a second array b such that b[i] := a[i] - a[i-1] (with a[0] = 0).

That trick gives us a couple of interesting properties.

  • We have query(i) = a[i]. It's no longer the range sum, it's the value of the original array.

    Proof: query(i) = b[1] + b[2] + ... + b[i] = (a[1] - a[0]) + (a[2] - a[1]) + ... + (a[i] - a[i-1]) = a[i] - a[0] = a[i].

  • We can suddenly increase all values of a suffix at once with just one update. If we perform update(i, v), and look at a query(j) with j <= i, we still get the old value:

    query(j) = b[1] + b[2] + ... + b[j] = ... = a[j].

    But if we look at a value a query(j) with j <= i, we get the increase value:

    query(j) = b[1] + ... + b[i] + v + ... + b[j] = a[j] + v.

  • And by performing two updates, update(l, v) and update(r+1, -v), you can increase an arbitrary range [l, r].

But notice, that query(i) gives us the ith element of the original array. And query(r) - query(l-1) is then the difference between the rth and the (l-1)th element of the array, nothing more. It doesn't give us any range sums any more.

Trick 2: range updates and range queries

There are tricks, on how to perform range updates and range queries.

But it's more complicated. The big problem is, that if you update a range of numbers [l, r], and you want to query another point, you no longer know how much of those numbers you increased. If you want the update to work, you can only change a constant number of values in the corresponding array, i.e. one per update call. But if you look at an array like [x, 0, 0, 0, -y], and make a query like query(3), then you only look at a subset of values, and determine how much of that -y should be considered into the range of the first 3 numbers. If you store y = sum of range updates, you can't split it any more, if you only store the single update, you don't know how much you multiply it with. There's no way.

The trick is to use two trees. Imagine we start with more simplified versions of the problem.

Part 1

We want the following two operations:

  1. We want to every value of the array a by v.
  2. We want to compute the prefix of the array: a[1] + ... + a[i].

We could just sum up all vs, and multiply them by i.

Part 2

Now let's make it a bit harder. Instead of increasing every value of the array a by v, we increase every value of the suffix a[j], a[j+1], ... by v.

What happens with the sum, if we still compute v * i.

  • For i < j we want 0.
  • For i >= j we want v * i - v * (j-1).

In other words we want two functions foo and bar, such that if we call foo(j, v) we get:

  • For i < j we want bar(i) = 0
  • For i >= j we want bar(j) = v.

Because then our sums are already almost correct.

We can define the prefix sum of the first i elements as bar(i) * i - some_constant, where some_constant = 0 for i < j and some_constant = v * (j-1) for i >= j. I'll call the second constant excess. Notice that this is indeed a constant, it doesn't depend on i.

How can we find such two functions? Well, it's just Trick 1. You create a Fenwick tree, and make range updates on the suffix, and then perform point queries.

As summary: a prefix query is basically the (sum of all suffix updates that you performed from i or some value before) multiplied with the prefix length (to get the range) and subtract some of the constant excesses from before. If a suffix updates changed all but the first 2 values by v, you subtract 2*v. If another suffix update changed all but the first 5 values by w, you subtract 5*w.

Not, how can you add up all those excesses? Via a normal Fenwick tree that sum them all up.

And that's how you get two Fenwick trees.

(And once you can perform arbitrary suffix range updates, you can obviously also perform arbitrary range updates).

Upvotes: 2

Related Questions