dhruvsharma
dhruvsharma

Reputation: 31

Multiplication in a range of numbers while ignoring duplicates

Given an array and some q range queries .

In a single query i want to print the product the numbers in a range of indexes while ignoring duplicates , indexing start from 1 .

For example provided the numbers are :

8 1 3 2 4 1 3 3 1

For a range query from index 1 to 8 : product will be 1*2*3*4 .i.e 24 . For a range query from index 6 to 7 : product will be 3 .

Is it possible to answer range queries in log(n) using segment tree or any pre processing or are there any tricks for this using xor or anyother method .

Upvotes: 0

Views: 285

Answers (1)

ronalchn
ronalchn

Reputation: 12335

Yes, it is possible to get O(log N) per query, but the explanation is going to be a bit long...

This problem can be described as a "unique range product query". This is similar to the "unique range sum query" problem, where you want the sum, rather than the product or unique elements.

The unique range sum query is similar to the "count of distinct elements in range" problem, a solution which is described here. The post isn't easy to understand (particularly if you haven't done many algorithmic programming contests at a high level), so I will explain the solution to the "count of distinct elements in range" in more detail here.

Count distinct elements in range

Suppose you have values = [1, 2, 3, 1, 2, 1, 3]. For each value, you want to find the next position with the same value. This is nextposes = [4, 5, 7, 6, inf, inf, inf]. Suppose you have query = [2, 6], nextposes = [5, 6, 7, inf, inf], 3 of them > 6, so answer is 3.

That's the easy part where I've copied the example from the post. This gives the idea of what we want to work out, but not in an easy to understand way. The next idea is putting it in a 2D tree. The post doesn't explain how this works, so I'll explain it here.

The 2D tree is a segment tree of segment trees. Therefore, you need to build a segment tree, and in each vertex of the first segment tree, build another segment tree. You also have to work out what to put in the segment tree. From the example above, the 2D segtree describes a 2D plane. You have many points in the plane, and you want to insert them into the structure. The points from the example are (1, 4), (2, 5), (3, 7), (4, 6), (5, inf), (6, inf), (7, inf). That is (x,y) = (array position, next array position) . If you plot these on the plane, you will realize that for each query, you have to count the number of points in a rectangular region. In the example given, you want to count the number of points where 2 <= x <= 6, y > 6. If you traverse your segment tree, it costs O(log^2 N) to do this. To update the tree, you also traverse the segment tree, and the cost is the same.

Optimizations

If you do a naive 2D segment tree, it is going to cost you O(N*max(V)) memory (where max(V) is the maximum possible integer value), and possibly the same about of time to get this memory. In fact, you don't need so much memory. You can build a lazily initialized 2D segment tree. This means that you start with a single node at the top of the tree to represent the whole range, and only split it when you need to (because you are inserting something in the tree). This will cut down the memory cost to O(N log^2 (N)).

Unique range sum query

To get the sum, you can weight each vertex inserted into the 2D tree. Don't think you need the treap as mentioned in the post.

Unique range product query

Instead of additions, each time you want to add, just multiply instead.

Finally, the algorithmic complexity is O((N + Q) log^2 N). At O(log^2 N) per query, this is not quite as good as log(N) per query that you asked for. I will now explain how you can get O(log N) per query.

O(log N) per query

To achieve this, we can use persistent data structures. A persistent data structure is one where you can modify it (ie. make an update), and you will get a new structure with the update, and you also get to keep the old structure before the update (a bit like immutable strings). Persistent structures, however, do this in an optimized way, because the old and new data structures will share most of their contents. For a segment tree, suppose you want to make an update to a position. You only need to create new nodes on the path to the new position (ie. O(log N) nodes need to be re-created). The remaining nodes can be the same ones as in the old data structure.

Now we know what persistent data structures are, the idea is to take the (x,y) coordinates we inserted into the 2D tree earlier. This time, we want to sort them by the y-coordinate descending. Then we insert the points into a 1D persistent segment tree in this order. The cost of each insertion is O(log N).

We keep an array of pointers to the root nodes of the segment tree after each insertion. Therefore, we have 1 segment tree for each constraint where y > some number.

Now to query the rectangular region 2 <= x <= 6, y > 6 as before, we first find the segment tree representing y > 6 (just before we insert the point (4,6)). Then, we do the range-product-query of the range [2,6] which costs O(log N) time.

Since building the N persistent segment trees cost O(N log N), and each query is O(log N), the total time complexity is O((N + Q) log N)!

The implementation of this solution is quite involved, because you will need to make a persistent, lazily initialized, segment tree. However, I do think that coding up a 1D persistent segment tree is easier than a 2D segment tree.

Upvotes: 1

Related Questions