MJS

Reputation: 1623

Python pandas correlation corr() TypeError: Could not compare ['pearson'] with block values

import pandas as pd

one = pd.DataFrame(data=[1,2,3,4,5], index=[1,2,3,4,5])

two = pd.DataFrame(data=[5,4,3,2,1], index=[1,2,3,4,5])

one.corr(two)

I expected this to return a float, -1.00, but instead it raises the following error:

TypeError: Could not compare ['pearson'] with block values

Thanks in advance for your help.

Upvotes: 2

Views: 10076

Answers (2)

Brian

Reputation: 138

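You are operating on a DataFrame when you should be operating on a Series. The first positional parameter of DataFrame.corr is method (default 'pearson'), so one.corr(two) passes two as the method rather than as the data to correlate against, which is consistent with the error message complaining about a comparison with 'pearson'.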

In [1]: import pandas as pd

In [2]: one = pd.DataFrame(data=[1,2,3,4,5], index=[1,2,3,4,5])

In [3]: two = pd.DataFrame(data=[5,4,3,2,1], index=[1,2,3,4,5])

In [4]: one
Out[4]:
   0
1  1
2  2
3  3
4  4
5  5

In [5]: two
Out[5]:
   0
1  5
2  4
3  3
4  2
5  1

In [6]: one[0].corr(two[0])
Out[6]: -1.0

Why subscript with [0]? Because 0 is the label pandas assigned to the column, since you didn't give it a name. Selecting a single column from a DataFrame returns a Series, which is 1-dimensional, and Series.corr takes another Series and returns a single float. See the pandas documentation for Series.corr.
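To make the column selection easier to read, the same thing can be sketched with explicitly named columns. The names "a" and "b" below are hypothetical and chosen only for illustration; the original frames use the default label 0.

```python
import pandas as pd

# Hypothetical column names "a" and "b"; the question's frames use the default label 0.
one = pd.DataFrame({"a": [1, 2, 3, 4, 5]}, index=[1, 2, 3, 4, 5])
two = pd.DataFrame({"b": [5, 4, 3, 2, 1]}, index=[1, 2, 3, 4, 5])

# Selecting a single column by label returns a Series, not a DataFrame.
col_one = one["a"]
col_two = two["b"]

# Series.corr aligns the two Series on their index and returns a scalar.
result = col_one.corr(col_two)
print(type(col_one).__name__, round(result, 6))
```

Naming the columns up front avoids having to remember that an unnamed column gets the integer label 0.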

Upvotes: 2

zero323

Reputation: 330063

pandas.DataFrame.corr computes pairwise correlations between the columns of a single DataFrame. To correlate the columns of one DataFrame with the matching columns of another, what you need is pandas.DataFrame.corrwith:

>>> one.corrwith(two)
0   -1
dtype: float64
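As a rough side-by-side of the two methods, here is a minimal sketch using the frames from the question. The combined frame and its column names "one" and "two" are assumptions made for this illustration only.

```python
import pandas as pd

one = pd.DataFrame(data=[1, 2, 3, 4, 5], index=[1, 2, 3, 4, 5])
two = pd.DataFrame(data=[5, 4, 3, 2, 1], index=[1, 2, 3, 4, 5])

# DataFrame.corr works within ONE frame: put both columns in a single
# DataFrame (hypothetical names "one" and "two") to get a correlation matrix.
both = pd.DataFrame({"one": one[0], "two": two[0]})
pairwise = both.corr()

# DataFrame.corrwith works ACROSS two frames: columns are matched by label
# (both frames have a column labelled 0) and the result is a Series.
across = one.corrwith(two)

print(pairwise)
print(across)
```

With corr, the value of interest is the off-diagonal entry of the matrix; with corrwith, the result has one entry per column label shared by the two frames.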

Upvotes: 7
