Reputation: 407
During a minimisation process, I repeatedly call a function that requires big arrays. Here is a dummy example:
import numpy as np

def foo(N, a):
    big_array = np.mgrid[0:N, 0:N]
    b = np.fft.fft2(big_array[0]**a)  # some heavy computation
    return b
During the minimisation process the array size N
doesn't change, so I would like to reuse the same array and avoid useless recomputation and memory allocation.
I would also like the function foo
to be self-contained, meaning that I don't want another function to create the array and pass it to foo
during the minimisation process.
Given these requirements, I was thinking of using a callable object with the array as an attribute. What do you think about this? Is there a more Pythonic way to do it?
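For concreteness, the callable-object idea I have in mind would look something like this (a minimal sketch; the class name Foo and passing N to the constructor are just illustrative choices):

```python
import numpy as np

class Foo:
    """Callable that allocates the grid once and reuses it on every call."""
    def __init__(self, N):
        self.N = N
        self.big_array = np.mgrid[0:N, 0:N]  # allocated once, kept as an attribute

    def __call__(self, a):
        # reuse the cached array for the heavy computation
        return np.fft.fft2(self.big_array[0] ** a)

foo = Foo(64)
b = foo(2)  # repeated calls reuse foo.big_array, no reallocation
```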
Upvotes: 2
Views: 136
Reputation: 140168
A self-contained approach (without a global variable) is to use a mutable default argument, which you shouldn't pass explicitly when calling the function, as a dictionary that memoizes the previously allocated arrays, keyed by their size: if the array size isn't in the dictionary, create the array and add it.
def foo(N, a, dict_container={}):
    if N in dict_container:
        big_array = dict_container[N]
    else:
        big_array = np.mgrid[0:N, 0:N]
        dict_container[N] = big_array
    b = np.fft.fft2(big_array[0]**a)  # some heavy computation
    return b
The main problem with this approach is that the cached arrays can never be garbage collected, so if N
varies too much you can exhaust memory. The same technique with an LRU cache solves that issue:
from functools import lru_cache

@lru_cache(maxsize=32)  # keep at most 32 arrays in the cache
def get_matrix(N):
    return np.mgrid[0:N, 0:N]

def foo(N, a):
    big_array = get_matrix(N)
    b = np.fft.fft2(big_array[0]**a)  # some heavy computation
    return b
(Don't define get_matrix
inside foo
, or the cache will be reinitialized at each call.)
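You can check that the cache really avoids reallocation: a second call with the same N returns the very same array object, and cache_info() reports one hit and one miss (a small self-contained check, with N=8 chosen arbitrarily):

```python
import numpy as np
from functools import lru_cache

@lru_cache(maxsize=32)
def get_matrix(N):
    return np.mgrid[0:N, 0:N]

a1 = get_matrix(8)   # miss: allocates the array
a2 = get_matrix(8)   # hit: returns the cached object
assert a1 is a2      # same object, no new allocation
print(get_matrix.cache_info())  # hits=1, misses=1
```

Note that because the cached array is shared, callers must not modify it in place.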
Upvotes: 5