Python Memoization with dict - Performance Issue

Question

I am using a genetic algorithm to solve an optimization problem. I use memoization to speed up computation since fitness evaluation is time consuming. It is implemented in the following way:

def memoize(f):
    memo = {}
    def helper(my_input):
        if my_input not in memo:   
            if len(memo)%100000==0:
                print('increased memo size:', len(memo))

            memo[my_input] = f(my_input)
        return memo[my_input]
    return helper

@memoize
def eval_fitness(individual):
    #time consuming calc
    return fitness

I have noticed that the size of the memo dict increases quickly in the first generations and then increases more slowly (e.g. reaching 14M keys at generation 500).

On the other hand, the elapsed time for each generation is high (i.e. 40s) in the first generations and then decreases, as memoization pays off. Nevertheless, as shown in the image above, i have noticed a non monotonic behavior of the elapsed time data series: the computation slows down and the overall computational time dramatically increases.

I use single process. Memory usage is safely under 20%.

Any ideas about the underlying cause of this behaviour?
How can I avoid this slow-down?

Python Memoization with dict - Performance Issue

Answers (1)

Related Questions