Julien Rousé
Julien Rousé

Reputation: 1113

Heapq module implementation

I was reading the heapq module source because I reviewed a question on CodeReview and I cannot understand something.

In the wikipedia article about heap it says:

sift-up: move a node up in the tree, as long as needed; used to restore heap condition after insertion. Called "sift" because node moves up the tree until it reaches the correct level, as in a sieve.

 sift-down: move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.

But the code of heappush (source code) is :

def heappush(heap, item):
    """Push item onto heap, maintaining the heap invariant."""
    heap.append(item)
    _siftdown(heap, 0, len(heap)-1)

If I read wikipedia right, when inserting an element I was expecting to see a siftup call, not a siftdown one.

Similarly for heappop (source here):

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
        return returnitem
return lastelt

From the wikipedia article I was expecting a siftdown call but got a siftup one.

Is it a mistake in Wikipedia or on the heapq module? Or is my understand wrong?

Upvotes: 2

Views: 452

Answers (1)

Jim Mischel
Jim Mischel

Reputation: 133950

As noted in comments, it's a nomenclature issue. The most common terminology calls the root the "top" of the tree, and nodes at other levels are "below" the root. We draw the tree in that orientation. That is:

        1
    2       3
  4   5   6   7

It makes sense, then, to say that to move an item from the root to a lower level is "sifting down."

You could make the argument, as somebody did in comments, that moving something to a lower level is increasing its index in the backing array, so it makes sense to speak of that as "sifting up". But people are visualizing the tree model, not the array implementation. When speaking of the model, your terminology should be consistent with the model.

I've always found it a bit annoying that the author of heapq decided to use the non-standard terminology. One could argue that he's talking about the implementation, but I dispute that. The comment says, "sift-up: move a node up in the tree ..." Clearly, he's referring to the tree model.

Wikipedia, https://en.wikipedia.org/wiki/Tree_structure, says:

A tree structure or tree diagram is a way of representing the hierarchical nature of a structure in a graphical form. It is named a "tree structure" because the classic representation resembles a tree, even though the chart is generally upside down compared to an actual tree, with the "root" at the top and the "leaves" at the bottom.

This topic was discussed to death in the early days, perhaps most famously by Donald Knuth in The Art of Computer Programming. See https://www.quora.com/Why-are-trees-in-computer-science-generally-drawn-upside-down-from-how-trees-are-in-real-life.

Upvotes: 2

Related Questions