enaJ
enaJ

Reputation: 1655

Python - hash heap implementation

I have a stream data coming, and I am maintaining them by pushing them one by one into a heap (priority queue), the resulting heap looks like:

[(a,1), (b,2), (c, 7), (d, 2), ...]

Because I need to update the items (e.g., change (a,1) to (a, 2), or delete (c,7)) continously through out the time. To effciently find and remove an items in a heap, I want to construct a hash table with the location of every items in the heap stored in the hash table.

So that each time I want to update an item, I can use the hashtable to find it and make changes easily in the heap, simutinouly updating the position of every item in the hash table.

The same question has been asked in this post: How to implement O(1) deletion on min-heap with hashtable with c++ code as following:

template<typename state, typename CmpKey, class dataStructure>
bool AStarOpenClosed<state, CmpKey, dataStructure>::HeapifyUp(unsigned int index)
{
        if (index == 0) return false;
        int parent = (index-1)/2;
        CmpKey compare;

        if (compare(elements[theHeap[parent]], elements[theHeap[index]]))
        {
                // Perform normal heap operations
                unsigned int tmp = theHeap[parent];
                theHeap[parent] = theHeap[index];
                theHeap[index] = tmp;
                // Update the element location in the hash table
                elements[theHeap[parent]].openLocation = parent;
                elements[theHeap[index]].openLocation = index;
                HeapifyUp(parent);
                return true;
        }
        return false;
}

I have little experience with c++, wondering if anyone can help me explain the idea or provide a python version code of such an implementation?

Upvotes: 3

Views: 2781

Answers (1)

Leon
Leon

Reputation: 32544

My understanding is that the first item in your pair serves as the key, and the second item as the data payload. Then I would propose an approach the other way around, somewhat similar to this answer, but simpler.

  1. Let the hashtable be your primary data structure for data storage and the min-heap be an auxiliary data structure for maintaining the current smallest key in your data set.

  2. Insert new item: add the data into both hashtable and min-heap.

  3. Update the value for the given key: update the value in the hashtable only.

  4. Delete the item with the given key: delete the entry with the given key from the hashtable only.

  5. Access the smallest key: if the element at the top of the heap is not found in the hashtable, drop it; repeat, until the top key is present in the hashtable.

Upvotes: 2

Related Questions