user21988

Reputation: 67

Creating a DataFrame with columns named after the input data

I have one large DataFrame (Pair3):

    DatetimeIndex: 4062 entries, 1997-06-06 00:00:00 to 2013-09-13 00:00:00
    Data columns (total 3 columns):
    A         4062  non-null values
    G         4062  non-null values
    S         4062  non-null values
    etc.

I would like to calculate the correlation and the rolling correlation of different pairs of columns. To do that, I defined:

    pairs = ([Pair3.A, Pair3.G], [Pair3.A, Pair3.S])

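I calculated the correlation of those pairs with this function:

    import numpy as np

    tresults = []
    def correlation(pairs):
        # Append the off-diagonal entry of each pair's 2x2 correlation matrix.
        for i in pairs:
            tresults.append(np.corrcoef(i)[1][0])

    correlation(pairs)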

obtaining:

    tresults
    Out[161]: [0.94756275037713467, 0.91061348701825506]
                    (Correlation AG , Correlation AS)

My question:

I would like to create a DataFrame named Correlation whose columns are named automatically after the pair considered (Correlation A-G, Correlation A-S, etc.) and hold the corresponding tresults values.

A table like this:

    Correlation AG     ,  Correlation AS
    0.94756275037713467,  0.91061348701825506

Do I need to do this by hand?
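
One way to avoid naming the columns by hand is to describe each pair by its column names rather than by the Series themselves, so the labels can be derived from the same tuples. The following is only a minimal sketch under assumptions: the random `Pair3` stands in for the real data, and the `"Correlation X-Y"` label format is just one possible choice.

```python
import numpy as np
import pandas as pd

# Stand-in for the real Pair3 (assumption: random data, same column names).
rng = np.random.default_rng(0)
Pair3 = pd.DataFrame(
    rng.standard_normal((200, 3)),
    columns=list("AGS"),
    index=pd.date_range("1997-06-06", periods=200, freq="D"),
)

# Pairs are given as column-name tuples so labels can be built from them.
pairs = [("A", "G"), ("A", "S")]

# One-row DataFrame: one column per pair, labelled from the pair's names.
Correlation = pd.DataFrame(
    {f"Correlation {x}-{y}": [Pair3[x].corr(Pair3[y])] for x, y in pairs}
)
print(Correlation)
```

`Series.corr` computes the Pearson correlation by default, so each value should agree with the off-diagonal entry of `np.corrcoef` for the same pair.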

Upvotes: 0

Views: 85

Answers (1)

Jeff

Reputation: 129028

This computes the rolling correlation for every pair of columns and returns a Panel of the results. See the pandas documentation on moving window functions for details.

In [18]: df = DataFrame(randn(100,3),columns=list('ABC'),index=date_range('20130101',periods=100))

In [19]: pd.rolling_corr_pairwise(df,50,10)
Out[19]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 100 (items) x 3 (major_axis) x 3 (minor_axis)
Items axis: 2013-01-01 00:00:00 to 2013-04-10 00:00:00
Major_axis axis: A to C
Minor_axis axis: A to C

In [20]: pd.rolling_corr_pairwise(df,50,10).loc[:,'A','C']
Out[20]: 
2013-01-01         NaN
2013-01-02         NaN
2013-01-03         NaN
2013-01-04         NaN
2013-01-05         NaN
2013-01-06         NaN
2013-01-07         NaN
2013-01-08         NaN
2013-01-09         NaN
2013-01-10   -0.380174
2013-01-11   -0.368027
2013-01-12   -0.256105
2013-01-13   -0.208781
2013-01-14   -0.209550
2013-01-15   -0.188442
...
2013-03-27   -0.147510
2013-03-28   -0.130810
2013-03-29   -0.139143
2013-03-30   -0.149664
2013-03-31   -0.117451
2013-04-01   -0.129279
2013-04-02   -0.119471
2013-04-03   -0.040025
2013-04-04   -0.045022
2013-04-05   -0.025215
2013-04-06   -0.048226
2013-04-07   -0.048213
2013-04-08   -0.046223
2013-04-09   -0.060886
2013-04-10   -0.032557
Freq: D, Name: C, Length: 100, dtype: float64
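
Note that `pd.rolling_corr_pairwise` and `Panel` have since been removed from pandas. In recent versions the same computation can be sketched with the `.rolling()` accessor, which returns a DataFrame with a (date, column) MultiIndex instead of a Panel. The window of 50 and `min_periods` of 10 mirror the positional arguments above; the random data is again only a stand-in.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((100, 3)),
    columns=list("ABC"),
    index=pd.date_range("2013-01-01", periods=100),
)

# All pairwise rolling correlations; rows are indexed by (date, column).
pairwise = df.rolling(window=50, min_periods=10).corr()

# Select the A-C series: take the "A" rows at each date, then column "C".
a_c = pairwise.xs("A", level=1)["C"]

# The same single pair computed directly, without the full pairwise result.
direct = df["A"].rolling(window=50, min_periods=10).corr(df["C"])
```

With `min_periods=10` the first nine values are NaN, as in the output above. For a single pair the direct form is usually the cheaper choice.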

Upvotes: 3
