namedunframed
namedunframed

Reputation: 31

Optimization of time average msd

I am tried to perform a simulation of nT = 100 tracks, each of N = 10*10**6 steps of dt = 0.02 and then compute the time average MSD defined in the following:

def calc_msd_1D(x, nLags):
  
    
    N = len(x)
    inv_sq_np = 1./np.sqrt(N)
    msd = np.zeros(nLags)

    
    for delta in range(0, nLags):
        r = 0;
    
        #msd_array = np.zeros(N)
        for i in range(N-(delta)):
            r += (x[i+delta] - x[i])**2
        msd[delta] = 1/(N-delta) * r
    
    
   # msd[0] -= 2*np.random.normal(0,1)**2   
    #msd[1:] += 2*np.random.normal(0,1)**2   
        


    return msd

The MSD are hence computed for each trajectories using a class type of structure:

class Trajectory_Analysis_MSD: 
    
    def __init__(self,X, Y, nP, dT):
       
        
        # save parameters
        self.dT = dT
        self.X = X
        self.Y = Y
        self.nP = nP
        
  

    
    def getMSD(self,nLags): 
    
        # initialize memory
        self.MSD_x = np.zeros(nLags)
        self.MSD_y = np.zeros(nLags)
        
            
        # calculate the correlations for components
        self.MSD_x = calc_msd_1D(self.X, nLags)
        self.MSD_y= calc_msd_1D(self.Y, nLags)
            
        # calculate the msd
        self.msd= (self.MSD_x + self.MSD_y)

Unfortunately the computation is very expensive in time, at the moment I have to sample the trajectories up to 50000 points to store the msd (average time \approx 26 min). Is there a way I can compute the time average msd for the entire track of each trajectory? Probably without saving each data points?

Upvotes: 0

Views: 192

Answers (1)

maxy
maxy

Reputation: 5457

This for loop:

r = 0
for i in range(N-delta):
    r += (x[i+delta] - x[i])**2

Will be very slow for large N because it iterates in pure Python. I'm guessing this (or similar code you have elsewhere) is your bottleneck.

Try to vectorize your code, so all the inner loops run inside numpy, not Python:

r = np.sum((x[delta:N] - x[0:N-delta])**2)

You don't even need the N variable:

msd[delta] = np.mean((x[delta:] - x[:-delta])**2)

And maybe you can use a ready-made function like np.correlate in this case.

Upvotes: 1

Related Questions