John

Reputation: 13699

How is the order of items managed by Python's heapq library determined?

I was under the impression that the first value of each item determined its position in the heap, but that doesn't seem to be the case.

from __future__ import print_function
import heapq

q = []
heapq.heappush(q, (10, 11))
heapq.heappush(q, (11, 12))
heapq.heappush(q, (9, 10))
print(q)

This gives me an output of

[(9, 10), (11, 12), (10, 11)]

However, I was expecting an output like

[(9, 10), (10, 11), (11, 12)]

Upvotes: 3

Views: 1724

Answers (1)

MariusSiuram

Reputation: 3644

heapq does not give a "sort guarantee" over the provided list. Instead, it guarantees q[k] <= q[2*k+1] and q[k] <= q[2*k+2] for every k where those indices exist (using q as in your example).

This is because the list is managed internally as a binary tree: the children of the element at index k are stored at indices 2*k+1 and 2*k+2.
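To make the condition concrete, here is a minimal sketch that checks it directly on the list from the question. The helper name is_min_heap is made up for illustration and is not part of heapq.

```python
import heapq

def is_min_heap(q):
    """Hypothetical helper (not part of heapq): check that
    q[k] <= q[2*k+1] and q[k] <= q[2*k+2] wherever those indices exist."""
    n = len(q)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and q[k] > q[child]:
                return False
    return True

q = []
for item in [(10, 11), (11, 12), (9, 10)]:
    heapq.heappush(q, item)

print(q)               # [(9, 10), (11, 12), (10, 11)]
print(is_min_heap(q))  # True: the heap condition holds
print(q == sorted(q))  # False: the list is a valid heap but not sorted
```

The last two lines show the distinction: the list satisfies the heap condition even though it is not in sorted order.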

If you simply want the items in sorted order, you can use heappop as shown here. In your specific example you could write:

sorted_q = [heapq.heappop(q) for _ in range(len(q))]

and the result, as you expected, will be:

>>> print(sorted_q)
[(9, 10), (10, 11), (11, 12)]
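Note that popping every item leaves q empty. If you need the heap afterwards, one possible alternative (a sketch, not the only way) is to build a sorted copy with the built-in sorted or with heapq.nsmallest, neither of which modifies q:

```python
import heapq

q = []
for item in [(10, 11), (11, 12), (9, 10)]:
    heapq.heappush(q, item)

# Both calls return a new list and leave q untouched.
ordered = sorted(q)
also_ordered = heapq.nsmallest(len(q), q)

print(ordered)       # [(9, 10), (10, 11), (11, 12)]
print(also_ordered)  # [(9, 10), (10, 11), (11, 12)]
print(q)             # [(9, 10), (11, 12), (10, 11)]
```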

The theory is explained in the docs. The relevant line is the following:

The interesting property of a heap is that a[0] is always its smallest element.

This is a direct result of the heap condition, q[k] <= q[2*k+1] and q[k] <= q[2*k+2].

However, there are no further guarantees about the order of the rest of the array. In fact, both of the following trees are valid heaps:

    0
 1     2
2 5   3 4

and

    0
 2     1
5 3   4 2

Which are stored, respectively, as

[0, 1, 2, 2, 5, 3, 4]

and

[0, 2, 1, 5, 3, 4, 2]
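As a quick check, the sketch below verifies that both lists satisfy the heap condition and that popping from either one yields the same sorted sequence. The helpers is_min_heap and drain are hypothetical names used only for this illustration.

```python
import heapq

def is_min_heap(a):
    """Hypothetical helper: True if a[k] <= a[2*k+1] and a[k] <= a[2*k+2]
    for every k where those indices exist."""
    n = len(a)
    return all(
        a[k] <= a[child]
        for k in range(n)
        for child in (2 * k + 1, 2 * k + 2)
        if child < n
    )

def drain(heap):
    """Hypothetical helper: pop every item from a copy of the heap."""
    heap = list(heap)  # copy so the caller's list is not emptied
    return [heapq.heappop(heap) for _ in range(len(heap))]

first = [0, 1, 2, 2, 5, 3, 4]
second = [0, 2, 1, 5, 3, 4, 2]

print(is_min_heap(first), is_min_heap(second))  # True True
print(drain(first))   # [0, 1, 2, 2, 3, 4, 5]
print(drain(second))  # [0, 1, 2, 2, 3, 4, 5]
```

Two different layouts of the same values are both valid heaps, and heappop returns the items in ascending order from either one.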

Upvotes: 4
