Loc-Tran
Loc-Tran

Reputation: 160

How to find input that gives a specific max heap structure

I understand how heaps work but there is a problem I have no idea on how to solve.

Let's say you're given a max heap (not a BST),

[149 , 130 , 129 , 107 , 122 , 124 , 103 , 66 , 77 , 91 , 98 , 10 , 55 , 35 , 72]

Find a list of inputs that would give you the same heap structure such that each successive value would be the largest it can possibly which would be:

[66 , 91 , 10 , 107 , 122 , 35 , 55 , 77 , 130 , 98 , 149 , 124 , 129 , 72 , 103]

So in other words, if you were going to insert 66 first then 91 then 10 then 107 and so on into an empty max heap, you would end up with the given heap structure after all of the bubbling up and so forth. How would you even find this input in the first place?

Can anyone suggest any ideas?

Thanks

Upvotes: 1

Views: 158

Answers (1)

Paul Hankin
Paul Hankin

Reputation: 58221

Consider this max-heap (which I'll draw as a tree, but represents [7, 6, 5, 4, 3, 1, 2].

    7
 6     5
4 3   1 2

What's the last element that can be inserted? The last slot filled in the heap must be the bottom-right of the tree, and the bubbling-up procedure can only have touched elements along the route from that node to the top. So the previous element inserted must be 7, 5 or 2. Not all of these are possible. If it was 7, then the tree must have looked like this before insertion (with _ representing the slot where we're going to insert before bubbling up):

    5
 6     2
4 3   1 _

which violates the heap constraint. If 5 were the last element to be inserted, then the heap would have looked like this:

    7
 6     2
4 3   1 _

This works, so 5 could have been the last thing inserted. Similarly, 2 could also have been the last thing inserted.

In general, an element along the path to the bottom-right-most node could have been the last thing inserted if all the nodes below it along the path are at least as large as the other child of its parent. In our example: 7 can't be the last thing inserted because 5 < 6. 5 can be the last thing inserted because 2 > 1. 2 can be the last thing inserted because it doesn't have any children.

With this observation, one can generate all input sequences (in reverse order) that could have resulted in the heap by recursion.

Here's some code that runs on the example you gave, and verifies that each input sequence it generates actually does generate the given heap. There's 226696 different inputs, but the program only takes a few seconds to run.

# children returns the two children of i. The first
# is along the path to n.
# For example: children(1, 4) == 4, 3
def children(i, n):
    i += 1
    n += 1
    b = 0
    while n > i:
        b = n & 1
        n //= 2
    return 2 * i + b - 1, 2 * i - b

# try_remove tries to remove the element i from the heap, on
# the assumption is what the last thing inserted.
# It returns a new heap without that element if possible,
# and otherwise None.
def try_remove(h, i):
    h2 = h[:-1]
    n = len(h) - 1
    while i < n:
        c1, c2 = children(i, n)
        h2[i] = h[c1]
        if c2 < len(h) and h[c1] < h[c2]:
            return None
        i = c1
    return h2

# inputs generates all possible input sequences that could have
# generated the given heap.
def inputs(h):
    if len(h) <= 1:
        yield h
        return
    n = len(h) - 1
    while True:
        h2 = try_remove(h, n)
        if h2 is not None:
            for ins in inputs(h2):
                yield ins + [h[n]]
        if n == 0: break
        n = (n - 1) // 2

import heapq

# assert_inputs_give_heap builds a max-heap from the
# given inputs, and asserts it's equal to cs.
def assert_inputs_give_heap(ins, cs):
    # Build a heap from the inputs.
    # Python heaps are min-heaps, so we negate the items to emulate a max heap.
    h = []
    for i in ins:
        heapq.heappush(h, -i)
    h = [-x for x in h]
    if h != cs:
        raise AssertionError('%s != %s' % (h, cs))

cs = [149, 130, 129, 107, 122, 124, 103, 66, 77, 91, 98, 10, 55, 35, 72]

for ins in inputs(cs):
    assert_inputs_give_heap(ins, cs)
    print ins

Upvotes: 2

Related Questions