Generate Covariance in Pandas as a New Value

Question

My dataframe has 4 columns. I would like to compute the covariance between every row and its 1st lag row (the row immediately before it). How can I do this? A subset of my dataframe, along with the desired column 'Covariance', is given below:

 A1      A2      A3      A4      Covariance 
24.54   63.32   47.12   60.66      nan
16.33   64.39   49.72   56.94     73.43
46.38   55.94   42.35   73.43     50.27

Basically, the Covariance column should contain the value that the COVARIANCE.P formula in Microsoft Excel would return when applied to a row and its lag. Right now I am generating new columns with the lag variables and trying to obtain the covariance along each row of the data frame. This is the code that I have:

import pandas as pd

df = pd.DataFrame({'A1': [19, 20, 21],
                   'A2': [24.54, 16.33, 46.38],
                   'A3': [63.32, 64.39, 55.94],
                   'A4': [47.12, 49.72, 42.35]})
df['Lag_1_A1'] = df['A1'].shift(-1)
df['Lag_1_A2'] = df['A2'].shift(-1)
df['Lag_1_A3'] = df['A3'].shift(-1)
df['Lag_1_A4'] = df['A4'].shift(-1)
Main = ['A1', 'A2', 'A3', 'A4']
Lag_1 = ['Lag_1_A1', 'Lag_1_A2', 'Lag_1_A3', 'Lag_1_A4']
Lag_1_Cov = df.apply(lambda x: x[Main].T.cov(x[Lag_1].T))

However, this code does not work. Please help.

BENY · Accepted Answer

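If I understand correctly, you can take the covariance of each row with the previous row (shift(1)) in a list comprehension: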

df['Cov']=[df.iloc[:,:4].iloc[x].cov(df.iloc[:,:4].shift(1).iloc[x]) for x in range(len(df))]
df
Out[147]: 
      A1     A2     A3     A4  Covariance         Cov
0  24.54  63.32  47.12  60.66         NaN         NaN
1  16.33  64.39  49.72  56.94       73.43  369.981333
2  46.38  55.94  42.35  73.43       50.27  143.071233
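
A note on the numbers above: Series.cov defaults to the sample covariance (ddof=1, dividing by n-1), while Excel's COVARIANCE.P is the population covariance (dividing by n). The following is a minimal sketch, not part of the accepted answer, of how the population version could be computed. It assumes the four values per row from the table in the question, and the column names Cov_P, Cov_S and Cov_P_loop are made up for illustration. The ddof argument of Series.cov is assumed to be available (pandas 1.1 or later).

```python
import numpy as np
import pandas as pd

# Values copied from the table in the question (columns A1..A4 only).
df = pd.DataFrame({
    'A1': [24.54, 16.33, 46.38],
    'A2': [63.32, 64.39, 55.94],
    'A3': [47.12, 49.72, 42.35],
    'A4': [60.66, 56.94, 73.43],
})
cols = ['A1', 'A2', 'A3', 'A4']

cur = df[cols]
prev = cur.shift(1)  # previous row (1st lag); the first row has no lag

# Vectorised: centre each row on its own mean, then average the
# products of deviations across the row (division by n = population).
cur_dev = cur.sub(cur.mean(axis=1), axis=0)
prev_dev = prev.sub(prev.mean(axis=1), axis=0)
df['Cov_P'] = (cur_dev * prev_dev).mean(axis=1)

# Sample covariance (division by n - 1), for comparison with 'Cov' above.
n = len(cols)
df['Cov_S'] = df['Cov_P'] * n / (n - 1)

# Row-by-row alternative in the style of the answer, with ddof=0.
df['Cov_P_loop'] = [cur.iloc[i].cov(prev.iloc[i], ddof=0)
                    for i in range(len(df))]

print(df[['Cov_P', 'Cov_S', 'Cov_P_loop']])
```

With this data, Cov_S should reproduce the 369.981333 and 143.071233 shown in the output above, and Cov_P should be three quarters of those values. The first row stays NaN because it has no preceding row.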
