Priority queue with limited space: looking for a good algorithm

SigTerm · May 29, 2010 · Viewed 7.2k times

This is not homework.

I'm using a small "priority queue" (implemented as an array at the moment) to store the N smallest items seen so far. This is a bit slow: item insertion takes O(N) time. The current implementation keeps track of the largest item in the array and discards any item that wouldn't fit into the array, but I would still like to reduce the number of operations further.

I'm looking for a priority queue algorithm that meets the following requirements:

  1. The queue can be implemented as an array, which has a fixed size and _cannot_ grow. Dynamic memory allocation during any queue operation is strictly forbidden.
  2. Anything that doesn't fit into the array is discarded, but the queue keeps all of the smallest elements ever encountered.
  3. O(log(N)) insertion time (i.e. adding an element to the queue should take at most O(log(N))).
  4. (optional) O(1) access to the *largest* item in the queue (the queue stores the *smallest* items, so the largest item will be discarded first, and I'll need it to reduce the number of operations)
  5. Easy to implement/understand. Ideally - something similar to binary search - once you understand it, you remember it forever.
  6. Elements need not be sorted in any way. I just need to keep the N smallest values ever encountered. When I need them, I'll access all of them at once. So technically it doesn't have to be a queue; I just need the N smallest values to be stored.

I initially thought about using binary heaps (they can be easily implemented via arrays), but apparently they don't behave well when the array can't grow anymore. Linked lists and arrays will require extra time for moving things around. The STL priority queue grows and uses dynamic allocation (I may be wrong about that, though).

So, any other ideas?

--EDIT--
I'm not interested in an STL implementation. The STL implementation (suggested by a few people) works a bit slower than the linear array I currently use, due to the high number of function calls.

I'm interested in priority queue algorithms, not implementations.

Answer

Aryabhatta · May 29, 2010

Array-based heaps seem ideal for your purpose. I am not sure why you rejected them.

You use a max-heap.

Say you have an N element heap (implemented as an array) which contains the N smallest elements seen so far.

When an element comes in, you check it against the max (O(1) time) and reject it if it is greater.

If the value coming in is lower, you replace the root with the new value and sift this changed value down - worst case O(log N) time.

The sift-down process is simple: starting at the root, at each step you exchange this value with its larger child, until the max-heap property is restored (that is, until the value is at least as large as both of its children or has reached a leaf).
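As a minimal sketch of that step, assuming the heap holds plain `int` values in a zero-based array (so the children of index `i` sit at `2*i + 1` and `2*i + 2`), sift-down could look roughly like this. The function name `sift_down` and its parameters are illustrative, not taken from any library.

```cpp
#include <cstddef>
#include <utility>

// Restore the max-heap property for the subtree rooted at index i,
// assuming both child subtrees already satisfy it.
// heap[0..n) is a binary max-heap stored in a plain array.
void sift_down(int* heap, std::size_t n, std::size_t i) {
    for (;;) {
        std::size_t largest = i;
        const std::size_t left = 2 * i + 1;
        const std::size_t right = 2 * i + 2;

        // Pick the largest of the node and its (up to two) children.
        if (left < n && heap[left] > heap[largest]) largest = left;
        if (right < n && heap[right] > heap[largest]) largest = right;

        // Node is already at least as large as its children: done.
        if (largest == i) return;

        // Otherwise swap with the larger child and continue from there.
        std::swap(heap[i], heap[largest]);
        i = largest;
    }
}
```

The loop only swaps elements that are already in the array, so it needs no extra memory, and it visits at most one node per level of the heap.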

So you will not have to do any deletes, which you probably would have to do if you used std::priority_queue. Depending on the implementation of std::priority_queue, this could cause memory allocation/deallocation.

So the code can be structured as follows:

  • Allocate an array of size N.
  • Fill it up with the first N elements you see.
  • Heapify it (you should find this in standard textbooks; it uses sift-down). This is O(N).
  • Now, for any new element you get, you either reject it in O(1) time or insert it by sifting down in worst case O(log N) time.
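The steps above can be sketched as a small fixed-capacity container. This is only an illustration under some assumptions: the class name `SmallestN` and its members are made up for this example, the element type must be default-constructible and comparable with `operator<`, and the capacity is a compile-time constant so the storage lives inside the object and nothing is allocated dynamically.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical helper: keeps the N smallest values offered so far.
template <typename T, std::size_t N>
class SmallestN {
    static_assert(N > 0, "capacity must be positive");

public:
    // Offer a value; returns true if it was stored.
    bool offer(const T& value) {
        if (count_ < N) {                       // still filling the array
            data_[count_++] = value;
            if (count_ == N) heapify();         // build the max-heap once, O(N)
            return true;
        }
        if (!(value < data_[0])) return false;  // not below current max: reject, O(1)
        data_[0] = value;                       // overwrite the max ...
        sift_down(0);                           // ... and restore the heap, O(log N)
        return true;
    }

    std::size_t size() const { return count_; }

    // Largest stored value; meaningful only once N values have been offered.
    const T& largest() const { return data_[0]; }

    // Stored values in heap order (not sorted).
    const T* begin() const { return data_.data(); }
    const T* end() const { return data_.data() + count_; }

private:
    void heapify() {
        // Sift down every non-leaf node, from the last one up to the root.
        for (std::size_t i = N / 2; i-- > 0;) sift_down(i);
    }

    void sift_down(std::size_t i) {
        for (;;) {
            std::size_t largest = i;
            const std::size_t left = 2 * i + 1, right = 2 * i + 2;
            if (left < N && data_[largest] < data_[left]) largest = left;
            if (right < N && data_[largest] < data_[right]) largest = right;
            if (largest == i) return;
            std::swap(data_[i], data_[largest]);
            i = largest;
        }
    }

    std::array<T, N> data_{};
    std::size_t count_ = 0;
};
```

A caller would declare something like `SmallestN<int, 3> q;`, call `q.offer(v)` for every incoming value, and iterate over `q` at the end to read all retained values at once. Values equal to the current maximum are rejected here; whether to keep or reject ties is a design choice the original description leaves open.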

On average, though, you probably will not have to sift the new value all the way down, so you might get better than O(log N) average insert time (though I haven't tried proving it).

You allocate the size-N array only once, and every insertion is done by exchanging elements of the array, so there is no dynamic memory allocation after that.

Check out the Wikipedia page, which has pseudocode for heapify and sift-down: http://en.wikipedia.org/wiki/Heapsort