Abhay Nainan

Reputation: 4234

Computing Rolling autocorrelation using Pandas.rolling

I am attempting to calculate the rolling autocorrelation of a Series object using Pandas (0.23.3).

Setting up the example:

import numpy as np
import pandas as pd

dt_index = pd.date_range('2018-01-01', '2018-02-01', freq='B')  # business days
data = np.random.rand(len(dt_index))
s = pd.Series(data, index=dt_index)

Creating a Rolling object with window size = 5:

r = s.rolling(5)

Getting:

Rolling [window=5,center=False,axis=0]

Now when I try to calculate the correlation (Pretty sure this is the wrong approach):

r.corr(other=r)

I get only NaNs

I tried another approach based on the documentation:

df = pd.DataFrame()
df['a'] = s
df['b'] = s.shift(-1)
df.rolling(window=5).corr()

Getting something like:

...
2018-03-01 a NaN NaN
           b NaN NaN

I'm really not sure where I'm going wrong here. Any help would be immensely appreciated! The docs use float64 as well. Could it be that the correlation is very close to zero and so it shows up as NaN? Somebody raised a bug report about this here, but I think jreback solved the problem in an earlier bug fix.

There is another relevant answer, but it uses pd.rolling_apply, which does not seem to be supported in Pandas 0.23.3.

Upvotes: 9

Views: 7741

Answers (2)

BrunoF

Reputation: 3523

This is a lot faster than Pandas' autocorr, but the results are different. On my dataset, the results of the two methods have a Pearson correlation of 0.87. There is a discussion about why the results differ here.

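from statsmodels.tsa.stattools import acf
# Lag-1 autocorrelation of each 5-value window ([1] picks lag 1 from the acf output).
# Newer statsmodels releases name this keyword `adjusted` instead of `unbiased`.
s.rolling(5).apply(lambda x: acf(x, unbiased=True, fft=False)[1], raw=True)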

Note that the input cannot contain null values; otherwise it returns all nulls.
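
For what it's worth, the difference seems to come down to the estimator. Here is a minimal NumPy sketch, assuming the usual textbook definitions: Series.autocorr is the Pearson correlation between the window and its lagged copy (each slice is centred on its own mean), while an acf-style estimate centres everything on the single whole-window mean and divides by the lag-0 autocovariance. The helper names pearson_lag1 and acf_style_lag1 are made up for illustration, and the second one is only my reading of what acf does with unbiased=True, not a copy of the statsmodels code.

```python
import numpy as np

def pearson_lag1(x):
    """Pearson correlation between x[1:] and x[:-1] (what Series.autocorr computes)."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[1:], x[:-1])[0, 1]

def acf_style_lag1(x):
    """Lag-1 autocovariance over lag-0 autocovariance, both about the overall mean.

    Assumption: the lag-1 sum is divided by (n - 1) and the lag-0 sum by n,
    which is how I understand the "unbiased" option.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()                         # deviations from the whole-window mean
    acov1 = np.dot(d[1:], d[:-1]) / (n - 1)  # lag-1 autocovariance
    acov0 = np.dot(d, d) / n                 # lag-0 autocovariance (variance)
    return acov1 / acov0

window = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_lag1(window))    # close to 1.0: the lagged pairs lie on a straight line
print(acf_style_lag1(window))  # 0.5: deviations are taken from the overall mean of 3
```

On a 5-value window the two definitions can disagree a lot, as the toy example shows, so the gap is probably not a bug in either library.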

Upvotes: 4

rafaelc

Reputation: 59274

IIUC,

>>> s.rolling(5).apply(lambda x: x.autocorr(), raw=False)

2018-01-01         NaN
2018-01-02         NaN
2018-01-03         NaN
2018-01-04         NaN
2018-01-05   -0.502455
2018-01-08   -0.072132
2018-01-09   -0.216756
2018-01-10   -0.090358
2018-01-11   -0.928272
2018-01-12   -0.754725
2018-01-15   -0.822256
2018-01-16   -0.941788
2018-01-17   -0.765803
2018-01-18   -0.680472
2018-01-19   -0.902443
2018-01-22   -0.796185
2018-01-23   -0.691141
2018-01-24   -0.427208
2018-01-25    0.176668
2018-01-26    0.016166
2018-01-29   -0.876047
2018-01-30   -0.905765
2018-01-31   -0.859755
2018-02-01   -0.795077
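
If the apply call is too slow on a long series, a vectorised version might be worth a try. This is only a sketch, assuming Series.autocorr() is the Pearson correlation of the series with itself shifted by one step: a window of 5 values then contains 4 lagged pairs, so a rolling correlation against s.shift(1) with a window of 4 should cover the same pairs. The seed and the names via_apply and via_corr are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same setup as in the question, with a fixed seed (an assumption, for repeatability).
rng = np.random.default_rng(0)
dt_index = pd.date_range('2018-01-01', '2018-02-01', freq='B')
s = pd.Series(rng.random(len(dt_index)), index=dt_index)

# Approach from this answer: lag-1 autocorrelation inside each 5-value window.
via_apply = s.rolling(5).apply(lambda x: x.autocorr(), raw=False)

# Vectorised sketch: 4 (value, previous value) pairs span the same 5 observations.
via_corr = s.rolling(4).corr(s.shift(1))

# Both start with four NaNs and should agree up to floating-point error.
print(np.allclose(via_apply, via_corr, equal_nan=True))
```

Note the window of 4 rather than 5 in the second call: with a window of 5 it would pull in one extra pair from before the 5-value window, and the numbers would no longer line up with the apply version.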

Upvotes: 14
