Prokop Hapala

Reputation: 2444

Numpy and matplotlib garbage collection

I have a Python script which runs many simulations for different parameters ( Q, K ), plots the results and stores them to disk.

Each set of parameters ( Q, K ) produces a 3D volumetric grid of 200x200x80 data points, which requires ~100 MB of memory. A part of this volumetric grid is then plotted, layer by layer, producing ~60 images.

The problem is that Python apparently does not release memory during this process. I'm not sure where the memory leak is, or what rules govern which objects Python deallocates. I'm also not sure whether the memory is lost in numpy arrays or in matplotlib figure objects.

  1. Is there a simple way to analyze which objects in Python persist in memory and which were automatically deallocated?
  2. Is there a way to force Python to deallocate all arrays and figure objects which were created in a particular loop iteration or in a particular function call?

The relevant part of the code is here ( however, it will not run on its own: the bulk of the simulation code, including the ctypes C++/Python interface, is omitted because it is too complicated ):

import os
import numpy as np
import matplotlib.pyplot as plt
import ProbeParticle as PP # this is my C++/Python simulation library, take it as blackbox

def relaxedScan3D( xTips, yTips, zTips ):
    ntips = len(zTips); 
    print " zTips : ",zTips
    rTips = np.zeros((ntips,3)) # is this array deallocated when exiting the function?
    rs    = np.zeros((ntips,3)) # and this?
    fs    = np.zeros((ntips,3)) # and this?
    rTips[:,0] = 1.0
    rTips[:,1] = 1.0
    rTips[:,2] = zTips 
    fzs    = np.zeros(( len(zTips), len(yTips ), len(xTips ) )); # and this?
    for ix,x in enumerate( xTips  ):
        print "relax ix:", ix
        rTips[:,0] = x
        for iy,y in enumerate( yTips  ):
            rTips[:,1] = y
            itrav = PP.relaxTipStroke( rTips, rs, fs ) / float( len(zTips) )
            fzs[:,iy,ix] = fs[:,2].copy()
    return fzs


def plotImages( prefix, F, slices ):
    for ii,i in enumerate(slices):
        print " plotting ", i
        plt.figure( figsize=( 10,10 ) ) # Is this figure deallocated when exiting the function ?
        plt.imshow( F[i], origin='image', interpolation=PP.params['imageInterpolation'], cmap=PP.params['colorscale'], extent=extent )
        z = zTips[i] - PP.params['moleculeShift' ][2]
        plt.colorbar();
        plt.xlabel(r' Tip_x $\AA$')
        plt.ylabel(r' Tip_y $\AA$')
        plt.title( r"Tip_z = %2.2f $\AA$" %z  )
        plt.savefig( prefix+'_%3.3i.png' %i, bbox_inches='tight' )

Ks = [ 0.125, 0.25, 0.5, 1.0 ]
Qs = [ -0.4, -0.3, -0.2, -0.1, 0.0, +0.1, +0.2, +0.3, +0.4 ]

for iq,Q in enumerate( Qs ):
    FF = FFLJ + FFel * Q
    PP.setFF_Pointer( FF )
    for ik,K in enumerate( Ks ):
        dirname = "Q%1.2fK%1.2f" %(Q,K)
        os.makedirs( dirname )
        PP.setTip( kSpring = np.array((K,K,0.0))/-PP.eVA_Nm )
        fzs = relaxedScan3D( xTips, yTips, zTips ) # is memory of "fzs" recycled or does it consume more memory each cycle of the loop ?
        PP.saveXSF( dirname+'/OutFz.xsf', headScan, lvecScan, fzs )
        dfs = PP.Fz2df( fzs, dz = dz, k0 = PP.params['kCantilever'], f0=PP.params['f0Cantilever'], n=int(PP.params['Amplitude']/dz) ) # is memory of "dfs" recycled?
        plotImages( dirname+"/df", dfs, slices = range( 0, len(dfs) ) )

Upvotes: 8

Views: 2586

Answers (1)

tillsten

Reputation: 14878

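pyplot keeps a reference to every figure created with plt.figure() until it is closed, so the figures created in plotImages are never freed. Try to reuse a single figure: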

plt.figure(0, figsize=(10, 10))
plt.clf() #clears figure

or close your figure after saving:

...
plt.savefig(...)
plt.close()
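
As a rough illustration of the second option, here is a minimal sketch of a plotting loop that closes each figure right after saving it and then checks how many figures pyplot still tracks. It is written for Python 3 and is not the asker's actual code: the function name plot_slices, the random test volume, the temporary output directory and the Agg backend are all assumptions made so that the example runs on its own.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for a batch script
import matplotlib.pyplot as plt
import numpy as np


def plot_slices(prefix, volume, slices):
    """Save one PNG per requested slice, closing each figure after saving.

    Hypothetical stand-in for plotImages() from the question.
    """
    paths = []
    for i in slices:
        fig = plt.figure(figsize=(4, 4))
        plt.imshow(volume[i], origin="lower")
        plt.colorbar()
        plt.title("slice %d" % i)
        path = "%s_%03d.png" % (prefix, i)
        fig.savefig(path, bbox_inches="tight")
        plt.close(fig)  # drop pyplot's reference so the figure can be freed
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
volume = np.random.rand(5, 20, 20)  # small stand-in for the 200x200x80 grid
saved = plot_slices(os.path.join(out_dir, "df"), volume, range(len(volume)))

# plt.get_fignums() lists the figures pyplot still holds on to.
open_figures = plt.get_fignums()
print(len(saved), "images written;", len(open_figures), "figures still open")
```

Printing len(plt.get_fignums()) inside the original loop, before adding plt.close(), is a quick way to see whether figures are piling up. It says nothing about the numpy arrays, which are normally freed once the last reference to them goes away.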

Upvotes: 11
