SigTerm
SigTerm

Reputation: 26419

priority queue with limited space: looking for a good algorithm

This is not a homework.

I'm using a small "priority queue" (implemented as array at the moment) for storing last N items with smallest value. This is a bit slow - O(N) item insertion time. Current implementation keeps track of largest item in array and discards any items that wouldn't fit into array, but I still would like to reduce number of operations further.

looking for a priority queue algorithm that matches following requirements:

  1. queue can be implemented as array, which has fixed size and _cannot_ grow. Dynamic memory allocation during any queue operation is strictly forbidden.
  2. Anything that doesn't fit into array is discarded, but queue keeps all smallest elements ever encountered.
  3. O(log(N)) insertion time (i.e. adding element into queue should take up to O(log(N))).
  4. (optional) O(1) access for *largest* item in queue (queue stores *smallest* items, so the largest item will be discarded first and I'll need them to reduce number of operations)
  5. Easy to implement/understand. Ideally - something similar to binary search - once you understand it, you remember it forever.
  6. Elements need not to be sorted in any way. I just need to keep N smallest value ever encountered. When I'll need them, I'll access all of them at once. So technically it doesn't have to be a queue, I just need N last smallest values to be stored.

I initially thought about using binary heaps (they can be easily implemented via arrays), but apparently they don't behave well when array can't grow anymore. Linked lists and arrays will require extra time for moving things around. stl priority queue grows and uses dynamic allocation (I may be wrong about it, though).

So, any other ideas?

--EDIT--
I'm not interested in STL implementation. STL implementation (suggested by a few people) works a bit slower than currently used linear array due to high number of function calls.

I'm interested in priority queue algorithms, not implemnetations.

Upvotes: 13

Views: 10293

Answers (8)

ami
ami

Reputation: 31

It's better to implement your own class using std::array and heap algorithms.

  `template<class T, int fixed_size = 5>
   class fixed_size_arr_pqueue_v2
   {
     std::array<T, fixed_size> _data;
     int _size = 0;

     int parent(int i)
     {
       return (i - 1)/2;
     }

     void heapify(int i, bool downward = false)
     {
       int l = 2*i + 1;
       int r = 2*i + 2;
       int largest = 0;
       if (l < size() && _data[l] > _data[i])
        largest = l;
       else
        largest = i;

       if (r < size() && _data[r] > _data[largest])
        largest = r;   

       if (largest != i)
       {
         std::swap(_data[largest], _data[i]);
         if (!downward)
           heapify(parent(i));
         else
           heapify(largest, true);
       }
    }

    public:

    void push(T &d)
    {
       if (_size == fixed_size)
       {
          //min elements in a max heap lies at leaves only. 
          auto minItr = std::min_element(begin(_data) + _size/2, end(_data));
          auto minPos {minItr - _data.begin()};
          auto min { *minItr};

          if (d > min)
          {
             _data.at(minPos) = d;
             if (_data[parent(minPos)] > d)
             { 
                //this is unlikely to happen in our case? as this position is  a leaf?
                heapify(minPos, true);
             }  
             else
                heapify(parent(minPos));
           }           

          return ;
       }

       _data.at(_size++) = d;
       std::push_heap(_data.begin(), _data.begin() + _size);
    }

    T pop()
    {
       T d = _data.front();
       std::pop_heap(_data.begin(), _data.begin() + _size);
       _size--;
       return d;
    }

    T top()
    {
       return _data.front();
    }

    int size() const
    {
       return _size;
    }
 };`

Upvotes: 0

SigTerm
SigTerm

Reputation: 26419

Found a solution ("difference" means "priority" in the code, and maxRememberedResults is 255 (could be any (2^n - 1)):

template <typename T> inline void swap(T& a, T& b){
    T c = a;
    a = b;
    b = c;
}


struct MinDifferenceArray{
    enum{maxSize = maxRememberedResults};
    int size;
    DifferenceData data[maxSize];
    void add(const DifferenceData& val){
        if (size >= maxSize){
            if(data[0].difference <= val.difference)
                return;

            data[0] = val;

            for (int i = 0; (2*i+1) < maxSize; ){
                int next = 2*i + 1;
                if (data[next].difference < data[next+1].difference)
                    next++;
                if (data[i].difference < data[next].difference)
                    swap(data[i], data[next]);
                else
                    break;
                i = next;
            }
        }
        else{
            data[size++] = val;
            for (int i = size - 1; i > 0;){
                int parent = (i-1)/2;
                if (data[parent].difference < data[i].difference){
                    swap(data[parent], data[i]);
                    i = parent;
                }
                else
                    break;
            }
        }
    }

    void clear(){
        size = 0;
    }

    MinDifferenceArray()
        :size(0){
    }
};
  1. build max-based queue (root is largest)
  2. until it is full, fill up normally
  3. when it is full, for every new element
    1. Check if new element is smaller than root.
    2. if it is larger or equal than root, reject.
    3. otherwise, replace root with new element and perform normal heap "sift-down".

And we get O(log(N)) insert as a worst case scenario.

It is the same solution as the one provided by user with nickname "Moron". Thanks to everyone for replies.

P.S. Apparently programming without sleeping enough wasn't a good idea.

Upvotes: 0

Aryabhatta
Aryabhatta

Reputation:

Array based heaps seem ideal for your purpose. I am not sure why you rejected them.

You use a max-heap.

Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.

When an element comes in you check against the max (O(1) time), and reject if it is greater.

If the value coming in is lower, you modify the root to be the new value and sift-down this changed value - worst case O(log N) time.

The sift-down process is simple: Starting at root, at each step you exchange this value with it's larger child until the max-heap property is restored.

So, you will not have to do any deletes which you probably will have to, if you use std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.

So you can have the code as follows:

  • Allocated Array of size N.
  • Fill it up with the first N elements you see.
  • heapify (you should find this in standard text books, it uses sift-down). This is O(N).
  • Now any new element you get, you either reject it in O(1) time or insert by sifting-down in worst case O(logN) time.

On an average, though, you probably will not have to sift-down the new value all the way down and might get better than O(logn) average insert time (though I haven't tried proving it).

You only allocate size N array once and any insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.

Check out the wiki page which has pseudo code for heapify and sift-down: http://en.wikipedia.org/wiki/Heapsort

Upvotes: 19

helpermethod
helpermethod

Reputation: 62185

Matters Computational see page 158. The implementation itself is quite well, and you can even tweak it a little without making it less readable. For example, when you compute the left child like:

int left = i / 2;

You can compute the rightchild like so:

int right = left + 1;

Upvotes: 0

Sparky
Sparky

Reputation: 14057

Most priority queues I work are based on linked lists. If you have a pre-determined number of priority levels, you can easily create a priority queue with O(1) insertion by having an array of linked lists--one linked list per priority level. Items of the same priority will of course degenerate into either a FIFO, but that can be considered acceptable.

Adding and removal then becomes something like (your API may vary) ...

listItemAdd (&list[priLevel], &item);      /* Add to tail */
pItem = listItemRemove (&list[priLevel]);  /* Remove from head */

Getting the first item in the queue then becomes a problem of finding the non-empty linked-list with the highest priority. That may be O(N), but there are several tricks you can use to speed it up.

  1. In your priority queue structure, keep a pointer or index or something to the linked list with the current highest priority. This would need to be updated each time an item is added or removed from the priority queue.
  2. Use a bitmap to indicate which linked lists are not empty. Combined with a find most significant bit, or find least significant bit algorithm you can usually test up to 32 lists at once. Again, this would need to be updated on each add / remove.

Hope this helps.

Upvotes: 1

Chris
Chris

Reputation: 1333

If you construct an STL priority queue at the maximum size (perhaps from a vector initialized with placeholders), and then check the size before inserting (removing an item if necessary beforehand) you'll never have dynamic allocation during insert operations. The STL implementation is quite efficient.

Upvotes: 0

ony
ony

Reputation: 13223

If amount of priorities is small and fixed than you can use ring-buffer for each priority. That will lead to waste of the space if objects is big, but if their size is comparable with pointer/index than variants with storing additional pointers in objects may increase size of array in the same way.
Or you can use simple single-linked list inside array and store 2*M+1 pointers/indexes, one will point to first free node and other pairs will point to head and tail of each priority. In that cases you'll have to compare in avg. O(M) before taking out next node with O(1). And insertion will take O(1).

Upvotes: 0

Marcelo Cantos
Marcelo Cantos

Reputation: 185862

Use std::priority_queue with the largest item at the head. For each new item, discard it if it is >= the head item, otherwise pop the head item and insert the new item.

Side note: Standard containers will only grow if you make them grow. As long as you remove one item before inserting a new item (after it reaches its maximum size, of course), this won't happen.

Upvotes: 8

Related Questions