TPM

Reputation: 894

Correlation between two dataframes

Similar questions have been asked, but I haven't seen a clear answer, so forgive me for asking again. I have two dataframes, and I simply want the correlation of the single column in the first dataframe with each column in the second. Here is code that does exactly what I want:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Y': np.random.randn(10)})
df2 = pd.DataFrame({'X1': np.random.randn(10), 'X2': np.random.randn(10), 'X3': np.random.randn(10)})
for col in df2:
    print(df1['Y'].corr(df2[col]))

but it doesn't seem like I should be looping through the dataframe. I was hoping that something as simple as

df1.corr(df2) 

ought to get the job done. Is there a clear way to perform this function without looping?

Upvotes: 12

Views: 28912

Answers (1)

Alexander

Reputation: 109546

You can use corrwith, which correlates each column of df2 with the Series df1.Y:

>>> df2.corrwith(df1.Y)
X1    0.051002
X2   -0.339775
X3    0.076935
dtype: float64
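
As a quick sanity check, here is a minimal sketch comparing the loop from the question with corrwith. The seeded generator and the names looped and vectorized are my own additions for illustration; the actual correlation values depend on the random data, so none are shown.

```python
import numpy as np
import pandas as pd

# Seeded generator so the comparison is repeatable (an assumption, not in the question).
rng = np.random.default_rng(0)
df1 = pd.DataFrame({'Y': rng.standard_normal(10)})
df2 = pd.DataFrame(rng.standard_normal((10, 3)), columns=['X1', 'X2', 'X3'])

# Loop from the question, collected into a Series keyed by column name.
looped = pd.Series({col: df1['Y'].corr(df2[col]) for col in df2})

# Same result without an explicit loop.
vectorized = df2.corrwith(df1['Y'])

print(np.allclose(looped, vectorized))
```

Note that a Series (df1['Y'] or df1.Y) is passed rather than df1 itself. When given a DataFrame, corrwith matches columns by label, and here 'Y' shares no label with 'X1', 'X2', 'X3'.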

Upvotes: 20
