daniglezad

Reputation: 101

heapq python - how to modify values for which heap is sorted

I transform an empty list called UNVISITED into a heap, such that:

UNVISITED = []
heapq.heappush(UNVISITED, (a.f, a))

The object a that I push, which is instantiated from a class, has the following fields:

class UNVISITEDNode():
    def __init__(self, value1, value2, value3, value4, value5):
        self.q = value1
        self.h = value2
        self.g = value3
        self.f = value4
        self.p = value5

Throughout my algorithm, I keep modifying any valueX of the objects already in the heap whenever needed, like this:

for i in range(len(UNVISITED)):
    UNVISITED[i][1].q = newvalue1
    UNVISITED[i][1].h = newvalue2
    UNVISITED[i][1].g = newvalue3
    UNVISITED[i][1].f = newvalue4
    UNVISITED[i][1].p = newvalue5

Because (or so I think, please correct me if I am wrong) modifying the value f this way does not change the value the heap is sorted by, I tried to modify UNVISITED[i][0] directly (this is the a.f above, passed as the first element of the tuple that was pushed onto the heap).

[THE PROBLEM] -> Then I am told that this value does not admit modification:

UNVISITED[i][0] = newvalue4

Traceback (most recent call last):
  File "/home/daniel/pycharm-2017.3.3/helpers/pydev/_pydevd_bundle/pydevd_exec.py", line 3, in Exec
    exec exp in global_vars, local_vars
  File "<input>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

I really need to modify the value f of the object a so that the change affects the ordering of the heap whenever needed, and (apparently) you cannot do this through UNVISITED[i][1].f = newvalue4. Is there any way to do this, or any workaround?

EDIT (WORKAROUND PERFORMED)

Eventually I defined a simple manual heap as heap = [] and heap.append() the objects to it. Calling heap.sort(key=lambda x: x.f, reverse=True) orders the list by the attribute in descending order, so heap.pop() then removes the last element, which is the one with the smallest f. This gets close to the behavior of heapq while letting you modify the values the heap is sorted by. It is important to say that this is significantly slower than using heapq.

Nonetheless, I am marking @Raymond Hettinger's answer as the accepted one because it details other possible workarounds.

Also, @Davis Yoshida has made a valid point in that maybe a heap as it is defined might not be the best way to store the data.

Upvotes: 1

Views: 5622

Answers (2)

Raymond Hettinger

Reputation: 226296

Invalidate and Reinsert

The usual solution is to mark the object as invalid and to reinsert a new value. When popping off values, just ignore the invalid entries.

This technique is very efficient as long as there are not a large number of invalidated entries. The invalidation step runs in constant time and the subsequent pops run in logarithmic time.
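A minimal sketch of the invalidate-and-reinsert idea, assuming each item is hashable and appears at most once. The names push_or_update, pop_smallest, entry_finder and REMOVED are illustrative, not part of heapq; the counter only exists so that two entries with equal priority never fall back to comparing the items themselves.

```python
import heapq
import itertools

heap = []                    # entries are [priority, tie_breaker, item]
entry_finder = {}            # maps item -> its live entry in the heap
REMOVED = object()           # sentinel marking an invalidated entry
counter = itertools.count()  # tie-breaker so items are never compared

def push_or_update(item, priority):
    """Add an item, or change its priority by invalidating the old entry."""
    if item in entry_finder:
        entry_finder.pop(item)[2] = REMOVED   # constant-time invalidation
    entry = [priority, next(counter), item]
    entry_finder[item] = entry
    heapq.heappush(heap, entry)

def pop_smallest():
    """Pop the live item with the lowest priority, skipping invalid entries."""
    while heap:
        priority, _, item = heapq.heappop(heap)
        if item is not REMOVED:
            del entry_finder[item]
            return priority, item
    raise KeyError("pop from an empty priority queue")

push_or_update("a", 5)
push_or_update("b", 3)
push_or_update("a", 1)       # "a" now sorts ahead of "b"
print(pop_smallest())        # (1, 'a')
print(pop_smallest())        # (3, 'b')
```

The stale entry for "a" stays in the list until a later pop reaches it and discards it, which is why the approach degrades if invalidated entries pile up.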

Reheapify

After adjusting one or more values, run the heapify() function to restore the heap invariant.

This uses a public function that is guaranteed to run in linear time.
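As a rough illustration applied to the question's (f, node) layout: the Node class below is a hypothetical stand-in for UNVISITEDNode, and the extra index in each tuple is an assumption added so that equal f values never cause two nodes to be compared.

```python
import heapq

class Node:
    """Hypothetical stand-in for the question's UNVISITEDNode."""
    def __init__(self, name, f):
        self.name = name
        self.f = f

nodes = [Node("a", 7), Node("b", 3), Node("c", 5)]

# Entries are (f, index, node); the index breaks ties between equal f values.
heap = [(n.f, i, n) for i, n in enumerate(nodes)]
heapq.heapify(heap)

# Change several keys, rebuild the tuples from the current f values,
# then restore the heap invariant in one linear-time pass.
nodes[0].f = 1
nodes[1].f = 9
heap[:] = [(n.f, i, n) for _, i, n in heap]
heapq.heapify(heap)

order = [heapq.heappop(heap)[2].name for _ in range(len(heap))]
print(order)  # ['a', 'c', 'b']
```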

Direct Heap Adjustment

Another way is to locate the object in the heap's list, using list.index(). After changing the value, run the internal _siftup() or _siftdown() functions depending on whether the value is being increased or decreased.

Increasing case:

>>> from heapq import _siftup, _siftdown, heapify, heappop

>>> data = [10, 5, 18, 2, 37, 3, 8, 7, 19, 1]
>>> heapify(data)
>>> old, new = 8, 22              # increase the 8 to 22
>>> i = data.index(old)
>>> data[i] = new
>>> _siftup(data, i)
>>> [heappop(data) for i in range(len(data))]
[1, 2, 3, 5, 7, 10, 18, 19, 22, 37]

Decreasing case:

>>> data = [10, 5, 18, 2, 37, 3, 8, 7, 19, 1]
>>> heapify(data)
>>> old, new = 8, 4              # decrease the 8 to 4
>>> i = data.index(old)
>>> data[i] = new
>>> _siftdown(data, 0, i)
>>> [heappop(data) for i in range(len(data))]
[1, 2, 3, 4, 5, 7, 10, 18, 19, 37]

This technique uses linear time list indexing and a logarithmic time heap update. It is likely to use fewer comparisons than the reheapifying technique, but this isn't entirely satisfying because it uses non-public functions.
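The two cases can be folded into one helper. This is only a sketch: replace_value is a made-up name, and _siftup and _siftdown are private functions of CPython's heapq module, so they may change without notice.

```python
from heapq import _siftdown, _siftup, heapify, heappop  # _sift* are private helpers

def replace_value(heap, old, new):
    """Replace one occurrence of old with new and repair the heap in place."""
    i = heap.index(old)          # linear-time search
    heap[i] = new
    if new > old:
        _siftup(heap, i)         # larger value: move it toward the leaves
    else:
        _siftdown(heap, 0, i)    # smaller value: move it toward the root

data = [10, 5, 18, 2, 37, 3, 8, 7, 19, 1]
heapify(data)
replace_value(data, 8, 22)       # increase
replace_value(data, 19, 4)       # decrease
result = [heappop(data) for _ in range(len(data))]
print(result)  # [1, 2, 3, 4, 5, 7, 10, 18, 22, 37]
```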

Re-sorting

Lastly, you can re-sort the data:

>>> data.sort()

This technique likely makes more comparisons than reheapifying or direct heap adjustment. The reason it works is that "if the data is sorted, then it is already a heap".

The running time can be O(n log n) in the worst case; however, the sort implementation applies the Timsort algorithm which can be very efficient with partially sorted inputs.
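A short check of the "sorted implies heap" claim, written as a sketch: the is_heap expression simply tests the min-heap invariant (every parent is less than or equal to its children) on the sorted list.

```python
import heapq

data = [10, 5, 18, 2, 37, 3, 8, 7, 19, 1]
heapq.heapify(data)

data[data.index(8)] = 22     # the heap invariant may now be broken
data.sort()                  # a sorted list is always a valid min-heap

# Check the invariant directly: every parent is <= each of its children.
is_heap = all(data[(k - 1) // 2] <= data[k] for k in range(1, len(data)))
print(is_heap)               # True
smallest = heapq.heappop(data)
print(smallest)              # 1
```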

Upvotes: 10

Davis Yoshida

Reputation: 1785

Even if this code ran, I believe it would not do what you are intending. Specifically, if you did

tup = (a.f, a) # a.f = 7

and could then execute

tup[0] = 3

You would have tup set to (3, a), but a.f would still be 7.
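To make this concrete, here is a small sketch in which the tuple is rebuilt instead of assigned to, since item assignment is not allowed. Node is a hypothetical stand-in for UNVISITEDNode.

```python
class Node:
    """Hypothetical stand-in for the question's UNVISITEDNode."""
    def __init__(self, f):
        self.f = f

a = Node(7)
tup = (a.f, a)               # stores the number 7, not a link to a.f

tup = (3,) + tup[1:]         # build a new tuple with a different key
after_rebuild = (tup[0], a.f)
print(after_rebuild)         # (3, 7): the node itself is unchanged

a.f = 5                      # changing the attribute...
after_attr = (tup[0], a.f)
print(after_attr)            # (3, 5): ...does not touch the tuple either
```

The key in the tuple and the attribute on the object are independent copies, so both have to be kept in sync by hand.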

One thing you could do is allow direct comparisons by adding an __lt__ (less than) method to your UNVISITEDNode class, like so:

class UNVISITEDNode:
    ...
    def __lt__(self, other):
        return self.f < other.f

Then, instead of putting tuples into the heapq, just directly put the node objects in.

However, if you modify objects that are not at the root of the heap, you are no longer guaranteed that the heap is valid, so you will need to re-heapify UNVISITED.
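Putting both points together, a minimal sketch might look like this. Node is a trimmed-down, hypothetical version of UNVISITEDNode with a name field added only for printing.

```python
import heapq

class Node:
    """Hypothetical, trimmed-down version of UNVISITEDNode."""
    def __init__(self, name, f):
        self.name = name
        self.f = f

    def __lt__(self, other):
        return self.f < other.f   # heapq only needs "<" to order items

a, b, c = Node("a", 7), Node("b", 3), Node("c", 5)
unvisited = []
for node in (a, b, c):
    heapq.heappush(unvisited, node)

a.f = 1                      # modifies a node that is already in the heap
heapq.heapify(unvisited)     # restore the invariant before the next pop

first = heapq.heappop(unvisited)
print(first.name)            # a
```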

Upvotes: 1
