Reubend

Reputation: 664

Python function returns different result with memoization

My Python pathfinding function returns different results if I use a memoizing decorator. It returns a correct value on its own, but after being memoized, it returns an incorrect value.

The function I'm talking about looks like this:

import functools

@functools.lru_cache(maxsize=None)
def recursiveTraversal(originIndex, target, steps):
    startingVertex = data[originIndex]

    if startingVertex["ID"] == target:
        if steps == 0:
            return Path(0, [])
        else:
            return None
    else:
        if steps == 0:
            return None
        else:
            potentialPaths = []
            for i in startingVertex["Edges"]:
                nextVertex = data[i]
                nextPath = recursiveTraversal(i, target, steps - 1)
                if nextPath == None:
                    continue
                nextPath.weight += int(nextVertex["Weight"])
                nextPath.vertices.append(i)

                potentialPaths.append(nextPath)
            if len(potentialPaths) > 0:
                minPath = min(potentialPaths, key=lambda x: x.weight)
                return minPath
            else:
                return None

A complete runnable example can be found here. The upper part of the file is all data, and the code is at the bottom. To reproduce this, just comment out line 15, and observe that the output is different.

How can I get the memoized version to output the same thing as the normal version?

Upvotes: 0

Views: 298

Answers (1)

a_guest

Reputation: 36249

The problem is that you are modifying attributes of the return values of recursiveTraversal. The function returns Path objects, and you then modify their weight and vertices attributes. In the non-cached version, each call with arguments (x, y, z) walks all the way down the recursion and builds a fresh Path(0, []) object, whose attributes are then modified in the for loop, so every call starts from a fresh object. In the cached version, instead of building a fresh object by descending the recursive tree, the cache wrapper hands you the instance of a previously created Path object, whose weight and vertices have already been modified, and you modify it even further (i.e. you modify the cache itself). This can be seen in the following example:

# Augment `Path` class with `__repr__`.
class Path:
    # Other methods go here.

    def __repr__(self):
        return '{}({}, {})'.format(self.__class__.__name__, repr(self.weight), repr(self.vertices))

data = [
    {'ID': '2', 'Weight': 1, 'Edges': [1]},
    {'ID': '1', 'Weight': 1, 'Edges': []}
]

print(recursiveTraversal(0, '1', 1))  # Prints "Path(1, [1])".
print(recursiveTraversal(1, '1', 0))  # Prints "Path(1, [1])".

Looking at recursiveTraversal, for steps=0 it should return Path(0, []) when the target matches. Nevertheless it returns Path(1, [1]). This happens because the previous call to recursiveTraversal already invoked recursiveTraversal(1, '1', 0) internally and modified the result's weight and vertices attributes. When you then call recursiveTraversal(1, '1', 0) explicitly, you get back the cached reference to that already modified object.
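
The same effect can be reproduced in isolation. Here is a minimal standalone sketch (make_list is a hypothetical function used only for illustration, not part of your code) showing that functools.lru_cache hands back the very same object on repeated calls, so mutating a returned value mutates the cache:

import functools

@functools.lru_cache(maxsize=None)
def make_list(n):
    return list(range(n))

first = make_list(3)
first.append(99)               # mutates the list object stored in the cache
print(make_list(3))            # prints "[0, 1, 2, 99]" instead of "[0, 1, 2]"
print(make_list(3) is first)   # prints "True" -- the cache returns the same instance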

Possible solution

A possible solution is to create copies of the cached objects before modifying them further, so the cache itself is never modified.

from copy import deepcopy

# Within `recursiveTraversal`:
# ...
nextPath = deepcopy(recursiveTraversal(i, target, steps - 1))
# ...
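
Applied to the same kind of toy example as above (again with a hypothetical make_list, not part of the original code), copying before mutating keeps the cache intact:

from copy import deepcopy
import functools

@functools.lru_cache(maxsize=None)
def make_list(n):
    return list(range(n))

safe = deepcopy(make_list(3))  # work on a copy of the cached value
safe.append(99)                # only the copy is modified
print(safe)                    # prints "[0, 1, 2, 99]"
print(make_list(3))            # prints "[0, 1, 2]" -- the cache is untouched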

Upvotes: 1
