Reputation: 3089
I have a problem which after some modifications reduces to "Finding the least index of number greater than x in a range[l, r] "
For example: Suppose an array A = {1, 2, 3, 6, 9, 8, 4, 3, 7, 6, 2}
And the query is "find least index of element in array A in range [2, 6] which is greater or equal to 5"
Answer for the above query is 4(value for this index is 6)(Indices are 1 based)
There are multiple queries, array is not sorted(consider that input is already in memory)
Is there an algorithm in which query is possible in O(logN) where N is no. of elements in array A.
Upvotes: 6
Views: 3989
Reputation: 59253
There are actually a bunch of ways to support queries in O(log N) time after building a data structure that takes O(N) space.
To make the above algorithm really efficient, you can encode the tree into an array, like we do for heaps. In this representation (using 1-based indexes), you have an array containing the maximum values for N-1 internal nodes, followed by the N leaves in order. Call that array H
. Then the children of H[i]
are at H[i*2]
and H[i*2+1]
. The parent of H[i]
is at H[i>>1]
In pseudocode, using 1-based indexes, we are given:
A[] = input array, N = input array size
We build H like this:
H = new array with size N*2-1, indexed from 1 to N*2-1
for (int i=1; i<=N; ++i)
H[i+N-1]=A[i];
for (int i=N-1; i>0; --i)
H[i] = max(H[2*i],H[2*i+1]);
Note that we create the children before the parents so that the children are there when we need to get the maximum of their values.
Now, the query function:
//get the index of the first element with val >= minval, index >= minindex, and index <= maxindex
//returns -1 if there is no such element
firstAtLeast(minval, minindex, maxindex)
if (maxindex < minindex)
return -1;
node = minindex+N-1; //find minindex in the tree
//go up and right until we find a subtree that has a value >= minval
while(H[node] < minval)
//if we are a right child of our parent, go up until
//we have a right sibling
while( (node&1) == 1 ) //node is odd
node = node>>1; //same as floor(node/2);
if (node <= 1)
//we went up to the root
//there is no such element
return -1;
//now node is a left child. try its right sibling
++node;
//We found a subtree. get the first valid leaf
while(node < N) //while it's an internal node
node = 2*node; //left child of the node
if (H[node] < minval)
++node; //left child not valid - move to right child
//Found leaf. get index in A[i] and check against maxindex
index = node-(N-1);
return (index <= maxindex ? index : -1);
This satisfies the requirement for queries in O(log N) time. It would be nice (and not too difficult) to exit early when you know there won't be an answer less than maxindex
, but that would make the pseudocode a little less clear, so I'll leave it as an exercise
Upvotes: 5
Reputation: 21278
If you have lots of queries, with a moderate amount of data, you may be able to get a decent speed-up on your queries by O(N) extra storage.
Create an aray of tuples (a[i], i)
(that is, value in the array, index of that value), sorted by the first (and in cases of conflict, the second) in ascending order. Then, use binary search to find your starting point. If the index is outside your range, continue to traverse your sorted list, until you find an index that falls within your range of interest.
I suspect this algorithm is O(N), worst case, so I guess it is possible to do better.
Upvotes: 0
Reputation: 10151
O(logN) seams to be impossible. You need at least to read the input until the first element which is greater (this might be the last one or none at all). So in the worst case you need to read the whole input which means O(N).
Improvements are only possible if there is some extra structure of your input like sorted than you could improve the algorithm to O(logN).
If there are multible queries you still need O(logN). You can check for multible queries at once and also cache the results for the case that the same queries comes again.
Upvotes: 4
Reputation: 22932
If the number of possible elements is small (say K) and can be readily enumerated, for an array of N elements, you can sort them in order N+K using a counting sort. You can then use a binary search for your query, which will be order log N. Note the counting sort will also require order K memory, and hence is only useful where a relatively small number of discrete keys are in play. If you have Q queries, the complexity is O((N+K) + Q(log(N))
Upvotes: 1