Read .dat file using python

I've got a simulation involving internal waves and moving particles in the water, using the MITgcm. The output of this looks something like this for each time step:

   -9999 0.0000000000000000000  #Time step (0.00000000000000 seconds)
 1308.2021183321899       -14.999709364517091 # Particle 1 (X,Z)
 1308.2020142528656       -24.999521595698688 # Particle 2 (X,Z)
 1308.2018600072618       -34.999345597877536 # .
 1308.2016593336587       -44.999185870669805 # .
 1308.2014165588744       -54.999046508237896 # .
 1308.2011370083103       -64.998931076248894
 1308.2008269116873       -74.998842490305705
 1308.2004933548124       -84.998782925797485
 1308.2001441978532       -94.998753764086956
 1308.1997879652938       -104.99875557384759
 1308.1994336881464       -114.99878812280582
 1308.1990906721119       -124.99885041328211
 1308.1987681881285       -134.99894073461562
 1308.1984750963150       -144.99905672694641
 1308.1982194336249       -154.99919545294702
 1308.1980080134056       -164.99935347476733
 1308.1978461242272       -174.99952693694112
 1308.1977378137256       -184.99971163492469
 1308.2000000000000       -195.00000000000000
 5232.8000000000002       -15.000038916290352
 5232.8000000000002       -25.000064153684303
 5232.8000000000002       -35.000089286157163
 5232.8000000000002       -45.000114270293523
 5232.8000000000002       -55.000139061712051 # Particle 57

Where -9999 #number is the time step (in seconds), left column is X position and right column is Z position (in meters); and every line is a different particle (except the -9999 one). So we'll have an enormous amount of lines with something like this for every time step and every particle.

I would like to plot the time-evolution of the position of my particles. How can I do it? If that's too hard, I would be happy with static plots of different time-steps with all particles position.

Thank you so much.

Edit1: What I tried to do is this, but I didn't show it before because it is far from proper:

 from matplotlib import numpy
 import matplotlib.pyplot as plot
 plot.plot(*np.loadtxt('data.dat',unpack=True), linewidth=2.0)

or this:

 plot.plotfile('data.dat', delimiter=' ', cols=(0, 1), names=('col1', 'col2'), marker='o')

Upvotes: 1

Views: 13803

Answers (1)

I would use numpy.loadtxt for reading input, but only because post-processing would also need numpy. You can read all your data to memory, then find the separator lines, then reshape the rest of your data to fit your number of particles. The following assumes that none of the particles ever reach exactly x=-9999, which should be a reasonable (although not foolproof) assumption.

import numpy as np
filename = 'input.dat'
indata = np.loadtxt(filename, usecols=(0,1)) # make sure the rest is ignored
tlines_bool = indata[:,0]==-9999
Nparticles = np.diff(np.where(tlines_bool)[0][:2])[0] - 1
# TODO: error handling: diff(np.where(tlines_bool)) should be constant
times = indata[tlines_bool,1]
positions = indata[np.logical_not(tlines_bool),:].reshape(-1,Nparticles,2)

The above code produces an Nt-element array times and an array position of shape (Nt,Nparticles,2) for each particle's 2d position at each time step. By computing the number of particles, we can let numpy determine the size of the first dimension (this iswhat the -1 index in reshape() is for).

For plotting you just have to slice into your positions array to extract what you exactly need. In case of 2d x data and 2d y data, matplotlib.pyplot.plot() will automatically try to plot the columns of the input arrays as a function of each other. Here's an example of how you can visualize, using your actual input data:

import matplotlib.pyplot as plt
t_indices = slice(None,None,500)  # every 500th time step
particle_indices = slice(None)    # every particle
#particle_indices = slice(None,5)   # first 5 particles
#particle_indices = slice(-5,None)  # last 5 particles

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,0])
plt.xlabel('t')
plt.ylabel('x')

plt.figure()
_ = plt.plot(times[myslice],positions[myslice,particle_indices,1])
plt.xlabel('t')
plt.ylabel('z')

plt.figure()
_ = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.show()

x(t) plot z(t) plot z(x) plot

Each line corresponds to a single particle. The first two plots show the time-evolution of the x and z components, respectively, and the third plot shows the z(x) trajectories. Note that there are a lot of particles in your data that don't move at all:

>>> sum([~np.diff(positions[:,k,:],axis=0).any() for k in range(positions.shape[1])])
15

(This computes the time-oriented difference of both coordinates for each particle, one after the other, and counts the number of particles for which every difference in both dimensions is 0, i.e. the particle doesn't move.). This explains all those horizontal lines in the first two plots; these stationary particles don't show up at all in the third plot (since their trajectory is a single point).

I intentionally introduced a bit fancy indexing which makes it easier to play around with your data. As you can see, indexing looks like this: times[myslice], positions[myslice,particle_indices,0], where both slices are defined in terms of...well, a slice. You should look at the documentation, but the short story is that arr[slice(from,to,stride)] is equivalent to arr[from:to:stride], and if any of the variables is None, then the corresponding index is empty: arr[slice(-5,None)] is equivalent to arr[-5:], i.e. it will slice the final 5 elements of the array.

So, in case you use a reduced number of trajectories for plotting (since 57 is a lot), you might consider adding a legend (this only makes sense as long as the default color cycle of matplotlib lets you distinguish between particles, otherwise you have to either set manual colors or change the default color cycle of your axes). For this you will have to keep the handles that are returned from plot:

particle_indices = slice(None,5)   # first 5 particles
plt.figure()
lines = plt.plot(positions[myslice,particle_indices,0],positions[myslice,particle_indices,1])
plt.xlabel('x')
plt.ylabel('z')
plt.legend(lines,['particle {}'.format(k) for k in range(len(t))])
plt.show()

example with legend

Upvotes: 5

Related Questions