Vichtor

Reputation: 197

Is it possible to do running correlation with one fixed series in Python?

I'm wondering if there is a fast way to compute a running correlation in Python against one fixed series. I've tried Pandas, for example df1.rolling(4).corr(df2), but that requires the two DataFrames to have the same length. Is there a way to do something similar to the Pandas example above, but with one DataFrame held fixed?

To clarify, I want to calculate the correlation coefficient between df2 below and each successive window of values in df1.

Example:

First correlation: between df2 and df1.loc[0:3]
Second correlation: between df2 and df1.loc[1:4]

etc.

I've managed to do this with a loop, but I find it inefficient when working with larger DataFrames.

import pandas as pd

df1 = pd.DataFrame([1,3,2,4,5,6,3,4])
df2 = pd.DataFrame([1,2,3,2])
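The asker's loop is not shown. For reference, a minimal sketch of what such a loop might look like is given below, assuming the data sits in column 0 of each DataFrame. The names `fixed` and `loop_corr` are illustrative and not taken from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([1, 3, 2, 4, 5, 6, 3, 4])
df2 = pd.DataFrame([1, 2, 3, 2])

# The fixed series, as a plain NumPy array
fixed = df2[0].to_numpy()
n = len(fixed)

# One Pearson correlation per window of df1 that has the same length as df2
loop_corr = []
for start in range(len(df1) - n + 1):
    window = df1[0].iloc[start:start + n].to_numpy()
    loop_corr.append(np.corrcoef(window, fixed)[0, 1])
```

With these inputs the loop produces five values, one for each full window (df1.loc[0:3] through df1.loc[4:7]).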

Upvotes: 2

Views: 646

Answers (1)

Niko Fohr

Reputation: 33770

You can use pandas.DataFrame.rolling, which returns a pandas.core.window.Rolling object that has an apply method. You can then pass apply() any function that calculates the correlation you want.

Example

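import pandas as pd
from scipy.stats import pearsonr


df1 = pd.DataFrame([1,3,2,4,5,6,3,4,1,2,3,2,2,3,2,5,1,2,1,2,8,8,8,8,8,8,8])
df2 = pd.DataFrame([1,2,3,2])

# The fixed series that every window of df1 is correlated against
CORR_VALS = df2[0].values

def get_correlation(vals):
    # pearsonr returns (coefficient, p-value); keep only the coefficient
    return pearsonr(vals, CORR_VALS)[0]

# Select column 0 so apply() only sees the data column;
# raw=True passes each window as a NumPy array, which avoids Series overhead
df1['correlation'] = df1[0].rolling(window=len(CORR_VALS)).apply(get_correlation, raw=True)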

  • Note that the window argument passed to rolling() must equal the length of the array you are calculating the correlation against.

This outputs:

In [5]: df1['correlation'].values
Out[5]:
array([        nan,         nan,         nan,  0.31622777,  0.31622777,
        0.71713717,  0.63245553, -0.63245553, -0.39223227, -0.63245553,
       -0.63245553,  1.        ,  0.        , -0.70710678,  0.81649658,
        0.        ,  0.47809144, -0.23570226, -0.64699664,  0.        ,
        0.        ,  0.7570333 ,  0.76509206,  0.11043153, -0.77302068,
       -0.11043153,  0.86164044])

which would look like this when plotted:

[image: plot of the correlation values]
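Since rolling().apply() still calls a Python function once per window, a fully vectorized variant may be faster on large inputs. The sketch below is an alternative, not part of the approach above: it builds every window at once with numpy.lib.stride_tricks.sliding_window_view (which needs NumPy 1.20 or later) and evaluates the Pearson formula directly. The function name rolling_corr_fixed is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_corr_fixed(series, pattern):
    """Pearson correlation of `pattern` with every full window of `series`."""
    x = np.asarray(series, dtype=float)
    y = np.asarray(pattern, dtype=float)

    # View of shape (len(x) - len(y) + 1, len(y)); no data is copied here
    windows = sliding_window_view(x, y.size)

    # Centre each window and the fixed pattern around their own means
    xc = windows - windows.mean(axis=1, keepdims=True)
    yc = y - y.mean()

    # Covariance term and the product of the two norms, per window
    num = xc @ yc
    den = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum())

    # A constant window has zero variance, so 0/0 yields NaN there
    with np.errstate(divide="ignore", invalid="ignore"):
        return num / den


corr = rolling_corr_fixed([1, 3, 2, 4, 5, 6, 3, 4], [1, 2, 3, 2])
```

Unlike the rolling() result, the returned array has no leading NaN entries: element i corresponds to the window starting at position i. To line it up with the DataFrame index, it would need to be padded with len(pattern) - 1 NaN values at the front.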

Upvotes: 2
