Reputation: 574
I am doing a molecular dynamics simulation in Python with a large number of particles. I have to keep track of the x-position, y-position, z-position, x-velocity, y-velocity, z-velocity, x-acceleration, y-acceleration, z-acceleration, x-force, y-force, z-force, and potential for all 500 or 1000 particles, along with the kinetic energy, potential energy, and total energy of the system at every time interval. To save these data I am presently writing them to a file in the following way:
from numpy import *
...
f = open('myMD.dat', 'w')
while t <= tmax:
    s = '%3.3f\t\t' % (t)  # writing the time to the file
    f.write(s)
    for i in range(TotalNumberofParticles):
        UpdateParameters()
        s = '%8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e %8.3e\t\t' % (xpos1[i], ypos1[i], zpos1[i], xvel1[i], yvel1[i], zvel1[i], xacc1[i], yacc1[i], zacc1[i], xforc[i], yforc[i], zforc[i], potn[i])
        f.write(s)
    ...
    s = '%8.3e %8.3e %8.3e\n' % (KE, PE, TE)
    f.write(s)
    t = t + delt
f.close()
So if there are 500 particles, each line in the data file will have around 7000 columns. It somehow seems a bad way of saving data. Moreover, the simulation runs for several thousand time steps, and the program gets slower as it proceeds. Can someone give me an idea of how I can save my data in a better way? I would also like to know how I can retrieve those data for further analysis.
Upvotes: 1
Views: 2621
Reputation: 43505
Instead of writing each xpos[i] separately in a loop, write the whole array in one go with numpy.savez:
import numpy as np

# savez appends the .npz extension if the filename does not already have it
np.savez('myMD.npz',
         xpos1=xpos1, ypos1=ypos1, zpos1=zpos1,
         xvel1=xvel1, yvel1=yvel1, zvel1=zvel1,
         xacc1=xacc1, yacc1=yacc1, zacc1=zacc1,
         xforc=xforc, yforc=yforc, zforc=zforc,
         potn=potn)
Using keyword arguments like this ensures that the arrays are saved under their proper names.
Since you want to write data for each particle and each iteration, I propose that you add an extra axis to the arrays to hold all iterations.
So instead of
xpos1 = np.zeros(TotalNumberofParticles)
do
xpos1 = np.zeros((number_of_iterations, TotalNumberofParticles))
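A minimal sketch of that bookkeeping, assuming the current per-step values are available as a 1-D array xpos_now and a scalar KE_now (hypothetical names; adapt them to whatever your UpdateParameters actually produces):

import numpy as np

number_of_steps = int(tmax / delt)           # assumed; use your own step count

# one row per time step, one column per particle
xpos1 = np.zeros((number_of_steps, TotalNumberofParticles))
KE = np.zeros(number_of_steps)               # one value per time step

for step in range(number_of_steps):
    UpdateParameters()                       # your existing update routine
    xpos1[step, :] = xpos_now                # current positions (hypothetical name)
    KE[step] = KE_now                        # current kinetic energy (hypothetical name)
    # ... same for the other per-particle and per-system quantities

# write everything to disk once, after the loop
np.savez('myMD.npz', xpos1=xpos1, KE=KE)

This way the expensive disk write happens once instead of at every time step, which should also help with the slowdown you observe.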
You can use numpy.load to read the data again:
import numpy as np
data = np.load('myMD.npz')
xpos1 = data['xpos1']
# et cetera
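If you are not sure which arrays an .npz file contains, the object returned by np.load has a files attribute listing the saved names:

print(data.files)   # lists the names that were passed to savez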
BTW, from PEP 8 (the Python style guide):
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
Upvotes: 3
Reputation: 6259
You can use pickle to write and read arbitrary data:
import pickle

a = [4, 6, 3, 7]
f = open("testfile", "wb")
pickle.dump(a, f)
f.close()

f = open("testfile", "rb")
b = pickle.load(f)
f.close()
print(b)
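For the simulation you could dump one object per time step into a single open file and read them back one by one. A minimal sketch, reusing the variable names from your question (t, tmax, delt, KE, PE, TE, UpdateParameters); the dictionary layout and the name 'state' are my own assumptions:

import pickle

with open('myMD.pkl', 'wb') as f:
    while t <= tmax:
        UpdateParameters()
        state = {'t': t, 'xpos1': xpos1, 'KE': KE, 'PE': PE, 'TE': TE}
        pickle.dump(state, f)        # one object appended per time step
        t = t + delt

# read the snapshots back, one per dump() call
snapshots = []
with open('myMD.pkl', 'rb') as f:
    while True:
        try:
            snapshots.append(pickle.load(f))
        except EOFError:
            break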
Upvotes: 1