Reputation: 632
Devise a data structure that supports the following operations on an n-element set:
O(lg n)
O(1)
time. Note that delete max takes O(lg n)
in a conventional heap.My approach:
I decided that I will keep a seperate array, which will keep track of potential successors that will overtake the root(which is the largest value, in the max heap); once it is deleted. so if you delete the max, in O(1)
time I will look in my array to find the next suitable successor(which I assume I will have intelligently set up).
Anybody have any better approaches? Try to stick to using heaps. This is NOT a Home-Work question, I am preparing for an interview, and its from Skiena's book.
Upvotes: 9
Views: 4387
Reputation: 608
The easiest solution is to maintain always sorted collection using skiplist with insertion time O(logn), then you can remove either max or min element in O(1), since it would be either head or tail of the list.
Upvotes: 0
Reputation: 1495
Your requirements are very specific -- you're interested in only these two operations: insert O(lg n) and deleteMin O(1).
None of the known heap structures satisfies your specific constraints. Instead, the best structure (albeit theoretically -- Galactic structures as some would call them) is the Brodal Queue, which performs all the heap operations in O(1) worst-case time, except deleteMin which still takes O(lg n) worst-case time. All other known structures are no better.
Since you're only interested in only these two operations, you might be able to get away with a custom structure that only handles these two operations well, since you don't have to worry about decrease-key, merge, etc. that more general heap structures have to worry about.
Maintaining a dual structure, DL, containing:
Further, maintain a link between each entry in L and its corresponding entry in D or T and vice versa. Further, you'll need to maintain a bit for each entry of D indicating whether it's been deleted in L or not. In order to later on do a mass synchronization between D and L, you might also want to keep track of number of deletions, d, from L since the last sync. You might do a sync after the following invariant:
is violated. That way sync time remains linear in n and you can always guarantee |T| and d are within manageable limits.
So, to insert a new element e into DL, you first do a find(e) in D for its successor (predecessor) and another search for its successor in T and grab the reference of the bigger successor (smaller predecessor) and use that to do an insert into L and add e to T and maintain references. After inserting e, we check if the Invariant 1 is violated. If so, we trigger a sync.
A sync essentially merges the contents of T and D, while removing elements marked as deleted into a new D structure. This can be done in time linear in |T| + |D| = O(n). In another linear time, references between L and D can be updated. The cost of this mass sync can be distributed (amortized) over the insertions and deletions. For this reason, these costs are only amortized costs.
To do deleteMin, you simply delete the head of L and use its backward link to D to mark its corresponding entry in D as deleted and increment d.
Observation 1: Note that deleteMin will always delete the minimum element since L is always up-to-date.
Observation 2: D is not always in sync with L. It might have some deleted elements (so marked) and some inserted elements to be found only in the T.
By Observation 2, we'll need to schedule a sync of D with L at some point, in order to maintain the O(lg n) find and insert costs. This is done whenever Invariant 1 is violated.
One pesky issue I've sort of glossed over is the following: how do you insert into T in logarithmic time, while still being able to do a linear scan during sync. Only a balanced tree of some sort can achieve this. I'd toyed a bit with the idea of limiting T's size to logarithmic size, but that would increase the amortized cost of the sync when enough number of insertions are done to trigger a sync. It seems that a sort of threaded balanced tree or even a skip list should help out here.
Feel free to critique and suggest improvements as I've not proved or implemented all the assertions here.
Update: As pointed out by @delnan, these costs are amortized. I've updated the description to highlight this fact and revised for clarity. Perhaps, with some trickery the amortization can be removed, but we'll end up with another Galactic structure in that case.
Upvotes: 5
Reputation: 71009
I suggest that you use a skip list as it is the easiest to implement data structure I can think of that supports this operations with the requested complexity. Fibonacci heap will also do the job but it is waaay harder to implement.
EDIT: I take my word back for the Fibonacci heap - it supports constant insert and O(log(n))
delete_min which is the opposite to what you want. Sorry for that.
Upvotes: 4