How can I use a Pandas data structure to calculate autocorrelation?

Question

I have data in text files that I have successfully parsed into a pandas Series with a MultiIndex. However, I don't know whether this structure will let me do what I want.

What I have is a lot of time series data with many identifiers (index levels). I ultimately need to calculate autocorrelation times and other time series statistics for each time series.

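#!/usr/bin/python

from pandas import Series, DataFrame, MultiIndex
...
# value holds the measurements; smear, block and obser become the index levels
data = Series(value, index=[smear, block, obser])
print(data)

# select one time series: smear = 0.07, block = 1, obser = 0
# (.ix has been removed from current pandas; .loc does the same label lookup)
print(data.loc[('0.07', '1', '0')])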

This produces output like this for the data structure:

0.07  0  0     1.5802561
         1    0.82228274
         2    0.70917131
         3    0.90707599
         4     0.8517223
         5    0.26346815
      1  0     1.8163109
         1     0.9972372
         2     1.0872181
         3     1.2459765
         4     1.1500478
         5    0.35668446
      2  0     2.0734421
         1     1.2863641
         2     1.4033583
...
0.34  2  3     1.9047537
         4     1.8193612
         5    0.77739654
      3  0     2.2757423
         1     1.5499509
         2     1.6623247
         3     1.8330889
         4     1.7484187
         5    0.72914635
      4  0     2.3269071
         1     1.7137621
         2     1.7359068
         3     1.9162268
         4     1.9714984
         5     1.2095218
Length: 32100

The time series I am interested in exists at a specified value of smear, block, and obser. Here is an example for smear = 0.07, block = 1, obser = 0. The rightmost column is my time series data.

0.07  1  0    1.8163109
         0    1.8191682
         0     1.816836
         0    1.8172168
         0    1.8169705
...
         0    1.8184542
         0    1.8170772
         0    1.8159326
         0    1.8161826
Length: 107

How do I reshape the data so that I can write functions that calculate autocorrelation times?

Dan · Accepted Answer

First, use the values attribute of the object returned by data.ix[...] (data.loc[...] in current pandas, where .ix has been removed) to get the raw NumPy array of the time series. Then use numpy.correlate to calculate the autocorrelation, using the method described in the answers to this question.
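As a rough sketch of those two steps, something like the following should work. The toy data, the helper name autocorr, and the level names are my own assumptions for illustration, not taken from your files. The normalization (subtract the mean, divide by the lag-0 value) is one common convention and may differ from what you need.

```python
import numpy as np
import pandas as pd


def autocorr(x):
    """Normalized autocorrelation of a 1-D array for lags 0..len(x)-1.

    Hypothetical helper: assumes x is not constant, otherwise the
    lag-0 value is zero and the division below is undefined.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # mode="full" returns lags -(n-1)..(n-1); keep the non-negative half.
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    return acf / acf[0]


# Toy stand-in for the parsed data: repeated (smear, block, obser) keys,
# one row per measurement, like the Series shown in the question.
rng = np.random.default_rng(0)
keys = [("0.07", "1", "0"), ("0.07", "1", "1")]
n = 50
index = pd.MultiIndex.from_tuples(
    [k for k in keys for _ in range(n)], names=["smear", "block", "obser"]
)
data = pd.Series(rng.normal(size=len(index)), index=index)

# Step 1: pull one time series out as a raw NumPy array.
series = data.loc[("0.07", "1", "0")].to_numpy()

# Step 2: autocorrelation via numpy.correlate.
acf = autocorr(series)

# The same function applied to every (smear, block, obser) combination.
acf_by_series = {
    key: autocorr(group.to_numpy())
    for key, group in data.groupby(level=["smear", "block", "obser"])
}
```

With this layout no reshaping is needed: the groupby over the index levels hands each time series to the function in turn, and an autocorrelation time estimate can then be built on top of each returned array.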
