Reputation: 604
Taking the following dataframe as an example:
a
2011-01-01 00:00:00 9
2011-01-01 01:00:00 4
2011-01-01 02:00:00 2
2011-01-01 03:00:00 5
2011-01-01 04:00:00 3
2011-01-01 05:00:00 7
2011-01-01 06:00:00 8
2011-01-01 07:00:00 4
I want to process chunks of "a" column with a moving window, using the function autoCorr() below to generate "b" column accordingly.
def autoCorr(x):
return np.corrcoef(x[:-1], x[1:])[0,1]
For a moving window of length 5, the final result should then be:
a b
2011-01-01 00:00:00 9 0.0000
2011-01-01 01:00:00 4 0.0000
2011-01-01 02:00:00 2 0.0000
2011-01-01 03:00:00 5 0.0000
2011-01-01 04:00:00 3 -0.0877 # autoCorr([9,4,2,5,3])
2011-01-01 05:00:00 7 -0.6404 # autoCorr([4,2,5,3,7])
2011-01-01 06:00:00 8 0.2880 # autoCorr([2,5,3,7,8])
2011-01-01 07:00:00 4 -0.1580 # autoCorr([5,3,7,8,4])
Upvotes: 1
Views: 214
Reputation: 863421
Use Rolling.apply
+ fillna
:
df = df.rolling(5).apply(autoCorr).fillna(0)
print (df)
a
2011-01-01 00:00:00 0.000000
2011-01-01 01:00:00 0.000000
2011-01-01 02:00:00 0.000000
2011-01-01 03:00:00 0.000000
2011-01-01 04:00:00 -0.087706
2011-01-01 05:00:00 -0.640445
2011-01-01 06:00:00 0.288136
2011-01-01 07:00:00 -0.157877
df = df.rolling(5).apply(lambda x: pd.Series(x).autocorr(1)).fillna(0)
print (df)
a
2011-01-01 00:00:00 0.000000
2011-01-01 01:00:00 0.000000
2011-01-01 02:00:00 0.000000
2011-01-01 03:00:00 0.000000
2011-01-01 04:00:00 -0.087706
2011-01-01 05:00:00 -0.640445
2011-01-01 06:00:00 0.288136
2011-01-01 07:00:00 -0.157877
EDIT:
df1 = df.join(df.rolling(5).apply(autoCorr).fillna(0)['a'].rename('b'))
print (df1)
a b
2011-01-01 00:00:00 9 0.000000
2011-01-01 01:00:00 4 0.000000
2011-01-01 02:00:00 2 0.000000
2011-01-01 03:00:00 5 0.000000
2011-01-01 04:00:00 3 -0.087706
2011-01-01 05:00:00 7 -0.640445
2011-01-01 06:00:00 8 0.288136
2011-01-01 07:00:00 4 -0.157877
Upvotes: 1