Reputation: 1065
Suppose an array arr[1..N] of integers. The array is changed in two ways:
-1
, for example)I am looking for a data structure that gives answer to "What is the k-th valid item?" quickly.
With single invalid value, this is trivial. With many such values, it becomes a bit of a tangle.
I tried keeping an auxiliary array, which would contain 1
for invalid indices and 0
otherwise. I can keep prefix sums for this array and query them in O(log n) time. Then I know that there are sum(k) invalid items between 1 and k, so that the k-th item is actually at k + sum(k). But, between k and k + sum(k), there may be some invalid items again - and it gets ugly.
Is there a better solution to this problem, presumably in O(log n) time?
Have a nice day!
Upvotes: 3
Views: 75
Reputation: 95298
I tried keeping an auxiliary array, which would contain 1 for invalid indices and 0 otherwise. I can keep prefix sums for this array and query them in O(log n) time. Then I know that there are sum(k) invalid items between 1 and k, so that the k-th item is actually at k + sum(k). But, between k and k + sum(k), there may be some invalid items again - and it gets ugly.
You can use a binary-indexed tree to maintain prefix sums and do select queries in the auxiliary data structure fast:
int tree[N+1] = {0}; // initially everything is invalid
void validate(int i, bool v=1) { // v = 1: validate, v = 0: invalidate
for (; i <= N; i += i & -i)
tree[i] += v ? 1 : -1;
}
int kth(int k) { // find kth valid index (argument and result 1-based)
int ans = 0;
for (int i = logN; i >= 0; --i) // logN = largest i s.t. (1<<i) <= N
if (ans + (1<<i) <= N && tree[ans + (1<<i)] < k) {
ans += 1<<i;
k -= tree[ans];
}
return ans+1;
}
// validate everything
for (int i = 1; i <= N; ++i)
validate(i);
Both operations are O(log n) and considerably faster than binary search trees due to a lower constant factor. It's magic.
Upvotes: 1
Reputation: 55599
If you were to store just the valid indices (the actual indices, not the elements at those indices) in an order statistics tree (in addition to your array), that would allow getting the k-th item in O(log n) time.
An order statistics tree is basically a binary search tree (specifically a self-balancing one to get the required performance), but each node stores one additional value, which is the size of the subtree rooted at that node (i.e. the number of nodes below it).
This would similarly allow invalidation in O(log n) time with a simple BST delete operation.
Overwriting would take O(1) time - no changes to the tree are required.
Having both an array and the order statistics tree isn't strictly necessary - you could have the tree node contain the elements as well, but this would make overwriting take O(log n) instead of O(1).
Upvotes: 1
Reputation: 80187
You can use order statistics tree (for example, based on RB or other balanced search tree) containing indexes. Both deletion and search will take O(logN) time
Upvotes: 1