Reputation: 31
I am trying to perform a simulation of nT = 100 tracks, each of N = 10*10**6 steps of dt = 0.02, and then compute the time-averaged MSD defined as follows:
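In formula form, the estimator implemented by the function below is, for lag \delta:

MSD(\delta) = \frac{1}{N - \delta} \sum_{i=0}^{N - \delta - 1} \left( x_{i+\delta} - x_i \right)^2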
import numpy as np

def calc_msd_1D(x, nLags):
    N = len(x)
    msd = np.zeros(nLags)
    for delta in range(nLags):
        r = 0.0
        # accumulate squared displacements over all windows of lag delta
        for i in range(N - delta):
            r += (x[i + delta] - x[i])**2
        msd[delta] = r / (N - delta)
    return msd
The MSD is then computed for each trajectory using a class-based structure:
class Trajectory_Analysis_MSD:
    def __init__(self, X, Y, nP, dT):
        # save parameters
        self.dT = dT
        self.X = X
        self.Y = Y
        self.nP = nP

    def getMSD(self, nLags):
        # calculate the per-component time-averaged MSDs
        self.MSD_x = calc_msd_1D(self.X, nLags)
        self.MSD_y = calc_msd_1D(self.Y, nLags)
        # calculate the total 2D msd
        self.msd = self.MSD_x + self.MSD_y
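For context, I call it per track roughly like this (array names illustrative; X and Y are the 1D position arrays of one trajectory):

track = Trajectory_Analysis_MSD(X, Y, nP=len(X), dT=0.02)
track.getMSD(nLags=50000)
msd = track.msd  # total 2D time-averaged MSD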
Unfortunately the computation is very expensive in time: at the moment I have to subsample the trajectories down to 50000 points to store the MSD (average time ≈ 26 min). Is there a way I can compute the time-averaged MSD over the entire track of each trajectory, preferably without storing every data point?
Upvotes: 0
Views: 192
Reputation: 5457
This for loop:
r = 0
for i in range(N - delta):
    r += (x[i + delta] - x[i])**2
will be very slow for large N, because it iterates in pure Python. I'm guessing this (or similar code you have elsewhere) is your bottleneck.
Try to vectorize your code, so that the inner loops run inside NumPy, not Python:
r = np.sum((x[delta:N] - x[0:N-delta])**2)
You don't even need the N variable:

msd[delta] = np.mean((x[delta:] - x[:-delta])**2)

(Watch out for delta == 0, though: x[:-0] is an empty slice, so handle lag zero separately, e.g. msd[0] = 0.0.)
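Putting this together, a drop-in replacement for your calc_msd_1D could look like this (calc_msd_1D_vec is my name for it; lag 0 is handled separately for the reason above):

import numpy as np

def calc_msd_1D_vec(x, nLags):
    # same result as calc_msd_1D, but each lag is computed in one
    # vectorized NumPy expression instead of a pure-Python inner loop
    msd = np.zeros(nLags)
    for delta in range(1, nLags):  # msd[0] is 0 by definition
        msd[delta] = np.mean((x[delta:] - x[:-delta])**2)
    return msd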
And maybe you can use a ready-made function like np.correlate in this case.
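Along those lines, for very long tracks there is a well-known O(N log N) route: decompose MSD(m) = S1(m) - 2*S2(m), where S2 is the position autocorrelation (computable with an FFT) and S1 follows from running sums of x**2. A minimal sketch of that idea, assuming x is a 1D float array (msd_fft_1d is my name, not a library function):

import numpy as np

def msd_fft_1d(x):
    # time-averaged MSD for all lags m = 0..N-1 in O(N log N), using
    #   S2(m) = (1/(N-m)) * sum_k x[k]*x[k+m]          (autocorrelation)
    #   S1(m) = (1/(N-m)) * sum_k (x[k]**2 + x[k+m]**2)
    N = len(x)
    m = np.arange(N)
    # linear (non-circular) autocorrelation via zero-padded FFT;
    # padding to a "fast" length instead of 2*N would be faster still
    fx = np.fft.rfft(x, n=2 * N)
    s2 = np.fft.irfft(fx * np.conj(fx), n=2 * N)[:N] / (N - m)
    # S1 from prefix sums of the squared positions, fully vectorized
    csum = np.cumsum(x**2)
    head = csum[N - 1 - m]                                # sum of first N-m squares
    tail = csum[-1] - np.concatenate(([0.0], csum[:-1]))  # sum of last N-m squares
    s1 = (head + tail) / (N - m)
    return s1 - 2.0 * s2

You would compute the x and y components separately and sum them, exactly as in your getMSD; this gives every lag at once instead of O(N) work per lag, at the cost of a few length-N temporary arrays.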
Upvotes: 1