Reputation: 1631
I posted a rather confusing question, so I rewrote it from scratch...
This is actually a purely theoretical question.
Say we have a binary heap. Let the heap be a max-heap, so the root node has the biggest value and every node has a bigger value than its children. We can perform some common low-level operations on this heap: "swap two nodes" and "compare two nodes".
Using those low-level operations, we can implement the usual higher-level recursive operations: "sift-up" and "sift-down".
Using sift-up and sift-down, we can implement "insert", "repair" and "update". I am interested in the "update" function. Let's assume that I already have the position of the node to be changed. The update function is then very simple:
function update (node_position, new_value){
    heap[node_position] = new_value;
    sift_up(node_position);
    sift_down(node_position);
}
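For concreteness, here is a minimal sketch of the two sift primitives this update relies on, written in C++ over a 0-based array where the children of node i sit at 2*i + 1 and 2*i + 2 (the layout and names are just one common convention, not anything fixed by the question):

#include <cstddef>
#include <utility>
#include <vector>

// Max-heap sift primitives over a 0-based array:
// the children of node i live at 2*i + 1 and 2*i + 2.
void sift_up(std::vector<int>& heap, std::size_t i) {
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] >= heap[i]) break;   // parent dominates: done
        std::swap(heap[parent], heap[i]);     // the low-level "swap two nodes"
        i = parent;
    }
}

void sift_down(std::vector<int>& heap, std::size_t i) {
    for (;;) {
        std::size_t largest = i;
        std::size_t l = 2 * i + 1, r = 2 * i + 2;
        if (l < heap.size() && heap[l] > heap[largest]) largest = l;
        if (r < heap.size() && heap[r] > heap[largest]) largest = r;
        if (largest == i) break;              // both children are smaller: done
        std::swap(heap[i], heap[largest]);
        i = largest;
    }
}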
My question is: is it (mathematically) possible to make a more advanced "update" function that could update several nodes at once, in such a way that all nodes first change their values to their new values, and only after that are their positions corrected? Something like this:
function double_update (node1_pos, node2_pos, node1_newVal, node2_newVal){
    heap[node1_pos] = node1_newVal;
    heap[node2_pos] = node2_newVal;
    sift_up(node1_pos);
    sift_down(node1_pos);
    sift_up(node2_pos);
    sift_down(node2_pos);
}
I ran some tests with this "double_update" and it worked, although that doesn't prove anything.
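One way to probe this beyond a handful of manual tests is to fuzz it: build random max-heaps, apply random double updates, and check the heap property afterwards. Here is a sketch in C++ that reuses the sift_up/sift_down sketched above (the heap size, value range, and trial count are arbitrary choices):

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <vector>

// Write both values first, then repair both positions, exactly as in
// the double_update pseudocode above.
void double_update(std::vector<int>& heap, std::size_t p1, std::size_t p2,
                   int v1, int v2) {
    heap[p1] = v1;
    heap[p2] = v2;
    sift_up(heap, p1); sift_down(heap, p1);
    sift_up(heap, p2); sift_down(heap, p2);
}

int main() {
    std::srand(42);
    for (int trial = 0; trial < 100000; ++trial) {
        std::vector<int> heap(15);
        for (int& x : heap) x = std::rand() % 100;
        std::make_heap(heap.begin(), heap.end());  // random valid max-heap

        std::size_t p1 = std::rand() % heap.size();
        std::size_t p2 = std::rand() % heap.size();
        double_update(heap, p1, p2, std::rand() % 100, std::rand() % 100);

        if (!std::is_heap(heap.begin(), heap.end())) {
            std::cout << "heap property violated at trial " << trial << "\n";
            return 1;
        }
    }
    std::cout << "no violation found\n";
}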
What about "triple updates", and so on...
I did some other tests with "multi updates", where I changed the values of all nodes and then called { sift_up(); sift_down(); } once for each of them in random order. This didn't work, but the result wasn't far from correct.
I know this doesn't sound useful, but I am interested in the theory behind it. And if I make it work, I actually do have one use for it.
Upvotes: 1
Views: 756
Reputation: 241791
Here's a tricky, possibly funky algorithm, for some definition of funky:
(Lots of stuff left out, just to give the idea):
#include <algorithm>
#include <vector>

template<typename T> class pseudoHeap {
private:
    using iterator = typename std::vector<T>::iterator;
    iterator max_node;
    std::vector<T> heap;
    bool heapified = false;

    // Linear scan; only needed when the current maximum is invalidated.
    void find_max() {
        max_node = std::max_element(heap.begin(), heap.end());
    }

public:
    void update(iterator node, T new_val) {
        if (node == max_node) {
            if (new_val < *max_node) {
                // The maximum shrank, so some other node may now be the max.
                heapified = false;
                *max_node = new_val;
                find_max();
            } else {
                *max_node = new_val;
            }
        } else {
            if (new_val > *max_node) max_node = node;  // remember the new max
            *node = new_val;
            heapified = false;
        }
    }

    T& front() { return *max_node; }

    void pop_front() {
        if (!heapified) {
            // Park the known maximum at the back, then heapify the rest.
            std::iter_swap(heap.end() - 1, max_node);
            std::make_heap(heap.begin(), heap.end() - 1);
            heapified = true;
        } else {
            std::pop_heap(heap.begin(), heap.end());
        }
        heap.pop_back();          // discard the old maximum
        max_node = heap.begin();
    }
};
Keeping a heap is expensive. If you do n updates before you start popping the heap, you've done the same amount of work as just sorting the vector when you need it to be sorted (O(n log n)). If it's useful to know what the maximum value is all the time, then there is some reason to keep a heap, but if the maximum value is no more likely to be modified than any other value, then you can keep the maximum value always handy at amortized cost O(1) (that is, 1/n of the time it costs O(n) and the rest of the time it's O(1)). That's what the above code does, but it might be even better to be lazy about computing the max as well, making front() amortized O(1) instead of worst-case O(1). Depends on your requirements.
As yet another alternative, if the modifications normally don't cause the values to move very far, just do a simple "find the new home and rotate the subvector" loop. Although that's O(n) instead of O(log n), it is still faster on short moves because the constant is smaller.
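One concrete reading of that idea, sketched here under the assumption that the vector is kept fully sorted in descending order so its front is always the maximum (the function name and comparator are illustrative, not part of the answer above):

#include <algorithm>
#include <cstddef>
#include <vector>

// Keep v sorted in descending order; after changing one element, rotate
// the subrange between its old and new positions to restore order.
void update_sorted(std::vector<int>& v, std::size_t pos, int new_val) {
    auto greater = [](int a, int b) { return a > b; };
    auto it = v.begin() + pos;
    *it = new_val;
    if (it != v.begin() && *(it - 1) < new_val) {
        // The value grew: rotate it leftwards among the larger elements.
        auto dest = std::upper_bound(v.begin(), it, new_val, greater);
        std::rotate(dest, it, it + 1);
    } else {
        // The value shrank (or stayed put): rotate it rightwards.
        auto dest = std::lower_bound(it + 1, v.end(), new_val, greater);
        std::rotate(it, it + 1, dest);
    }
}

The rotate is what makes short moves cheap: its cost is proportional to how far the element travels, not to the size of the whole vector.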
In other words, don't use priority heaps unless you're constantly required to find the top k values. When there are lots of modifications between reads, there is usually a better approach.
Upvotes: 1
Reputation: 372814
It's definitely possible to do this, but if you're planning on changing a large number of keys in a binary heap, you might want to look at other heap structures like the Fibonacci heap or the pairing heap, which can do this much faster than the binary heap. Changing k keys in a binary heap with n nodes takes O(k log n) time, while in a Fibonacci heap it takes amortized time O(k). This is asymptotically optimal, since you can't even touch k nodes without doing at least Ω(k) work.
Another thing to consider is that if you change more than Ω(n / log n) keys at once, you are going to do at least Ω(n) work. In that case, it's probably faster to implement updates by just rebuilding the heap from scratch in Θ(n) time using the standard heapify algorithm.
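Here is what that rebuild-from-scratch approach can look like (a sketch; the bulk_update name and the list-of-(position, value) interface are illustrative, and std::make_heap runs the standard bottom-up heapify):

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Apply every (position, new_value) change first, then restore the heap
// property over the whole array in Theta(n) with one bottom-up heapify.
void bulk_update(std::vector<int>& heap,
                 const std::vector<std::pair<std::size_t, int>>& changes) {
    for (const auto& change : changes)
        heap[change.first] = change.second;
    std::make_heap(heap.begin(), heap.end());
}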
Hope this helps!
Upvotes: 1