Reputation: 67
I have one large DataFrame (Pair3):
DatetimeIndex: 4062 entries, 1997-06-06 00:00:00 to 2013-09-13 00:00:00
Data columns (total 3 columns):
A 4062 non-null values
G 4062 non-null values
S 4062 non-null values
etc.
I would like to calculate the correlation and the rolling correlation of different pairs of columns. To do that, I built:
pairs = ([Pair3.A, Pair3.G], [Pair3.A, Pair3.S])
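I calculated the correlation of those pairs with this function:
import numpy as np

tresults = []
def correlation(pairs):
    # Append the Pearson correlation of each (x, y) pair to tresults
    for x, y in pairs:
        tresults.append(np.corrcoef(x, y)[0, 1])
correlation(pairs)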
obtaining:
tresults
Out[161]: [0.94756275037713467, 0.91061348701825506]
(Correlation AG , Correlation AS)
My question: how can I get a table like this?
Correlation AG, Correlation AS
0.94756275037713467, 0.91061348701825506
Do I need to do this by hand?
Upvotes: 0
Views: 85
Reputation: 129028
This computes the rolling correlation for every pair of columns and returns a Panel of the results. See the docs.
In [18]: df = pd.DataFrame(np.random.randn(100,3),columns=list('ABC'),index=pd.date_range('20130101',periods=100))
In [19]: pd.rolling_corr_pairwise(df,50,10)
Out[19]:
<class 'pandas.core.panel.Panel'>
Dimensions: 100 (items) x 3 (major_axis) x 3 (minor_axis)
Items axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Major_axis axis: A to C
Minor_axis axis: A to C
In [20]: pd.rolling_corr_pairwise(df,50,10).loc[:,'A','C']
Out[20]:
2013-01-01 NaN
2013-01-02 NaN
2013-01-03 NaN
2013-01-04 NaN
2013-01-05 NaN
2013-01-06 NaN
2013-01-07 NaN
2013-01-08 NaN
2013-01-09 NaN
2013-01-10 -0.380174
2013-01-11 -0.368027
2013-01-12 -0.256105
2013-01-13 -0.208781
2013-01-14 -0.209550
2013-01-15 -0.188442
...
2013-03-27 -0.147510
2013-03-28 -0.130810
2013-03-29 -0.139143
2013-03-30 -0.149664
2013-03-31 -0.117451
2013-04-01 -0.129279
2013-04-02 -0.119471
2013-04-03 -0.040025
2013-04-04 -0.045022
2013-04-05 -0.025215
2013-04-06 -0.048226
2013-04-07 -0.048213
2013-04-08 -0.046223
2013-04-09 -0.060886
2013-04-10 -0.032557
Freq: D, Name: C, Length: 100, dtype: float64
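Note that `pd.rolling_corr_pairwise` and `Panel` were removed in later pandas releases. A rough sketch of the same idea with the `.rolling()` API follows, which also builds the one-row table asked for in the question. The frame `pair3` here is a hypothetical stand-in filled with random data (the real `Pair3` is not available), and the column-label format is just one possible choice:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Pair3: random data with columns A, G, S
rng = np.random.default_rng(0)
idx = pd.date_range("2013-01-01", periods=100, freq="D")
pair3 = pd.DataFrame(rng.standard_normal((100, 3)),
                     columns=list("AGS"), index=idx)

# Full-sample correlations laid out as a one-row table
pairs = [("A", "G"), ("A", "S")]
table = pd.DataFrame(
    {f"Correlation {a}{b}": [pair3[a].corr(pair3[b])] for a, b in pairs}
)

# Rolling pairwise correlation: 50-row window, at least 10 observations.
# The result is a DataFrame whose rows carry a (date, column) MultiIndex.
rolling = pair3.rolling(window=50, min_periods=10).corr()

# Pick out A vs. S: take column "S", then the rows labelled "A"
a_s = rolling["S"].xs("A", level=1)

# The same series for a single pair, without computing all pairs
a_s_direct = pair3["A"].rolling(window=50, min_periods=10).corr(pair3["S"])
```

If only a few pairs are needed, the single-pair form on the last line avoids computing the full pairwise result; `pair3.corr()` gives the full-sample correlation matrix for all columns at once.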
Upvotes: 3