chang thenoob

Reputation: 113

Python - Memory allocation error for an array

I would like to process a large database of .tdms files (read a file, store its values, go to the next file, then plot/process the values). However, my program keeps crashing with the same error:

MemoryError: Unable to allocate 24.4 GiB for an array with shape (399229, 4096) and data type complex128

As I would like to read hundreds of files, process them and plot the result, I understand this takes a lot of memory, but how can I get around it? I'm guessing there might be a way to "clean" some of the stuff that Python keeps in memory, from time to time in the program?

Below I paste a minimal example showing the way I'm reading the data. If I run this loop on just a few files it works fine, but as soon as it's more than a few hundred files I get this error after some hours and it stops.

import glob
import os

import matplotlib.pyplot as plt
import numpy as np
from nptdms import TdmsFile

print('start')

sourdir = 'C:/.../raw_data'

# list the .tdms files and process them in chronological order
listTdmsFiles = glob.glob(sourdir + '/*.tdms')
nbTdmsFiles = len(listTdmsFiles)
sorted_by_mtime_ascending = sorted(listTdmsFiles, key=lambda t: os.stat(t).st_mtime)

fichii = []
Vrot = []
AXMO = []
T_OFF = []
T_ON = []

for fich in range(nbTdmsFiles):
    plt.close('all')
    print(fich)
    if os.path.isfile(sorted_by_mtime_ascending[fich]):

        # derive the test number from the file name, e.g. "essais12.tdms" -> "12"
        filename = os.path.basename(sorted_by_mtime_ascending[fich])
        filename = filename.replace(".tdms", "")
        testnumber = filename.replace("essais", "")

        # the first group holds the header values, the second the channel data
        tdms_file = TdmsFile(sorted_by_mtime_ascending[fich])
        header = tdms_file.groups()[0].as_dataframe()
        data = tdms_file.groups()[1].as_dataframe()
        tdms_file = []  # drop the file object to try to free memory

        # accumulate the per-file header values
        Vrot = np.append(Vrot, header['VitesseMoteur'])
        AXMO = np.append(AXMO, header['IncrementAXMO'])
        T_OFF = np.append(T_OFF, header['T_OFF'])
        T_ON = np.append(T_ON, header['T_ON'])

        fichii = np.append(fichii, testnumber)

#%% data visualisation
print("end")

Thanks

Upvotes: 0

Views: 902

Answers (1)

You can try these:

  • Recent Python versions have improved memory management, so use the most recent one you can.
  • Make sure you close all files after use.
  • Use the del keyword to remove every remaining reference to the data, since Python won't free the RAM while anything still links to it.
  • Call gc.collect() to force garbage collection after using del.
  • Use pickle or another serialiser to dump your data to files if you still need it later (see the sketch after this list).
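
Putting the last three points together, here is a minimal sketch of how the loop from the question could apply them. The nptdms calls and the 'VitesseMoteur' header field mirror the question's own code; the .pkl file name is just an illustrative choice, not something the question prescribes.

import gc
import glob
import os
import pickle

import numpy as np
from nptdms import TdmsFile

sourdir = 'C:/.../raw_data'

Vrot = []

for path in sorted(glob.glob(sourdir + '/*.tdms'),
                   key=lambda t: os.stat(t).st_mtime):
    tdms_file = TdmsFile(path)
    header = tdms_file.groups()[0].as_dataframe()
    data = tdms_file.groups()[1].as_dataframe()

    # keep only the small per-file header values in RAM
    Vrot = np.append(Vrot, header['VitesseMoteur'])

    # park the bulky channel data on disk if it is still needed later
    # (illustrative file name; choose any location you like)
    with open(os.path.basename(path) + '.pkl', 'wb') as f:
        pickle.dump(data, f)

    # drop every reference to the big objects, then force a collection
    # so the memory can be reused for the next file
    del tdms_file, header, data
    gc.collect()

The point is that only the small header scalars survive each iteration; everything proportional to the (399229, 4096) channel data is dereferenced and collected before the next file is opened.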

Upvotes: 2
