Sharon Asayag

Reputation: 89

Find correlation between pandas time series

I have two pandas DataFrames. From each I took a single column and set the dates column as the index, so I now have two Series instead. I need to find the correlation between those Series.

Here are a few rows from dfd:

index      change
2018-12-31  -0.86
2018-12-30  0.34
2018-12-27  -0.94
2018-12-26  -1.26
2018-12-25  3.30
2018-12-24  -4.17

and from dfp:

index      change
2018-12-31  0.55
2018-12-30  0.81
2018-12-27  -2.99
2018-12-26  0.50
2018-12-25  3.59
2018-12-24  -3.43

I tried:

correlation=dfp.corr(dfd)

and got the following error:

TypeError: unsupported operand type(s) for /: 'str' and 'int'

Upvotes: 3

Views: 601

Answers (2)

wwnde

Reputation: 26676

You can merge the two DataFrames on their date index and then correlate the two columns:

import pandas as pd

# Assumes each frame still has a 'date' column next to 'change'
dfd['date'] = pd.to_datetime(dfd['date'])
dfd = dfd.set_index('date')

dfp['date'] = pd.to_datetime(dfp['date'])
dfp = dfp.set_index('date')

# The suffixes rename the overlapping 'change' columns to change(dfp) and change(dfd)
df = pd.merge(dfp, dfd, left_index=True, right_index=True, suffixes=('(dfp)', '(dfd)')).reset_index()
df

Then correlate the two columns, change(dfp) and change(dfd):

df['change(dfp)'].corr(df['change(dfd)'])
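
For reference, here is a self-contained sketch of the merge approach using the sample rows from the question. The extra 2018-12-28 row in dfp is a made-up value, added only to show that the default inner merge keeps just the dates present in both frames. Passing suffixes is one way to end up with the column names change(dfp) and change(dfd); without it, pandas names the overlapping columns change_x and change_y.

```python
import pandas as pd

# Sample rows from the question
dfd = pd.DataFrame({
    "date": ["2018-12-31", "2018-12-30", "2018-12-27",
             "2018-12-26", "2018-12-25", "2018-12-24"],
    "change": [-0.86, 0.34, -0.94, -1.26, 3.30, -4.17],
})
# Same dates plus one hypothetical extra row (2018-12-28) that dfd lacks
dfp = pd.DataFrame({
    "date": ["2018-12-31", "2018-12-30", "2018-12-28", "2018-12-27",
             "2018-12-26", "2018-12-25", "2018-12-24"],
    "change": [0.55, 0.81, 1.00, -2.99, 0.50, 3.59, -3.43],
})

# Parse the dates and move them into the index
for frame in (dfd, dfp):
    frame["date"] = pd.to_datetime(frame["date"])
dfd = dfd.set_index("date")
dfp = dfp.set_index("date")

# Inner merge on the index; suffixes label the two 'change' columns
df = pd.merge(
    dfp, dfd,
    left_index=True, right_index=True,
    suffixes=("(dfp)", "(dfd)"),
).reset_index()

print(df.columns.tolist())  # ['date', 'change(dfp)', 'change(dfd)']
print(len(df))              # 6, the unmatched 2018-12-28 row is dropped

correlation = df["change(dfp)"].corr(df["change(dfd)"])
print(round(correlation, 4))  # 0.8625
```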

Upvotes: 1

jezrael

Reputation: 862711

The problem is that dfp is filled with string representations of numbers, so use Series.astype to convert them to floats:

correlation = dfp.astype(float).corr(dfd.astype(float))
print (correlation)
0.8624789983270312
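
A minimal way to try this with the sample rows from the question is sketched below. It assumes the values in dfp are stored as strings while dfd already holds floats; the question does not show how the data were loaded, so that setup is a guess.

```python
import pandas as pd

dates = pd.to_datetime([
    "2018-12-31", "2018-12-30", "2018-12-27",
    "2018-12-26", "2018-12-25", "2018-12-24",
])

# dfd holds floats; dfp holds numbers stored as strings (assumed setup)
dfd = pd.Series([-0.86, 0.34, -0.94, -1.26, 3.30, -4.17],
                index=dates, name="change")
dfp = pd.Series(["0.55", "0.81", "-2.99", "0.50", "3.59", "-3.43"],
                index=dates, name="change")

# A non-numeric dtype here is the hint that the values are strings
print(dfp.dtype)

# Convert both sides to float before correlating
correlation = dfp.astype(float).corr(dfd.astype(float))
print(round(correlation, 4))  # 0.8625
```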

If some values are not numeric, the solution above fails. In that case use to_numeric with errors='coerce', which converts non-numeric values to missing values (NaN):

correlation=pd.to_numeric(dfp, errors='coerce').corr(dfd)
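
The sketch below shows what the coercion does in practice. The "n/a" entry is a made-up bad value, not something from the question. Series.corr leaves out any pair where either side is missing, so the result should match the correlation computed after dropping that date from both Series.

```python
import pandas as pd

dates = pd.to_datetime([
    "2018-12-31", "2018-12-30", "2018-12-27",
    "2018-12-26", "2018-12-25", "2018-12-24",
])

dfd = pd.Series([-0.86, 0.34, -0.94, -1.26, 3.30, -4.17], index=dates)
# "n/a" is a hypothetical non-numeric entry used for illustration
dfp = pd.Series(["0.55", "n/a", "-2.99", "0.50", "3.59", "-3.43"], index=dates)

# Values that cannot be parsed become NaN instead of raising an error
dfp_num = pd.to_numeric(dfp, errors="coerce")
print(dfp_num.isna().sum())  # 1

# corr skips pairs that contain a missing value
correlation = dfp_num.corr(dfd)

# Cross-check: drop the bad date from both Series explicitly
valid = dfp_num.notna()
expected = dfp_num[valid].corr(dfd[valid])
print(abs(correlation - expected) < 1e-12)  # True
```

It is worth checking how many values were coerced to NaN before trusting the result, since every coerced value shrinks the sample the correlation is computed on.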

Upvotes: 5
