Reputation: 51109
So this works as expected:
df1 = pd.DataFrame({'date':[123,456],'price1':[23,34]}).set_index('date')
df2 = pd.DataFrame({'date':[456,789],'price2':[22,32]}).set_index('date')
df1.join(df2, how='outer')
      price1  price2
date
123     23.0     NaN
456     34.0    22.0
789      NaN    32.0
But if I don't set the index, it causes an error:
df1 = pd.DataFrame({'date':[123,456],'price1':[23,34]})
df2 = pd.DataFrame({'date':[456,789],'price2':[22,32]})
df1.join(df2, on='date', how='outer')
ValueError: columns overlap but no suffix specified: Index(['date'], dtype='object')
Why is this, and am I wrong to expect both approaches to give the same result?
Upvotes: 1
Views: 758
Reputation: 6574
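join matches against the index of the other DataFrame. With on='date', it compares df1's 'date' column with df2's index (here the default 0, 1), and since both frames still contain a 'date' column, the names collide. If you just want to combine the two DataFrames by index rather than by a column, you need to add suffixes so that the result does not contain two columns with the same name, e.g.: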
df1.join(df2, how='outer', lsuffix='_left', rsuffix='_right')
If you want to join on the column, you should use merge, which by default uses the columns the two frames share (here 'date') as the key:
df1.merge(df2, how='outer')
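To make the difference concrete, here is a minimal sketch using the frames from the question. It assumes a reasonably recent pandas. The variable names `merged` and `joined` are only for illustration. It shows the error being raised, the merge on the shared column, and how join can be made to work with on='date' by moving 'date' into df2's index first.

```python
import pandas as pd

df1 = pd.DataFrame({'date': [123, 456], 'price1': [23, 34]})
df2 = pd.DataFrame({'date': [456, 789], 'price2': [22, 32]})

# join(on='date') looks up df1['date'] in df2's *index*, and both frames
# still carry a 'date' column, so pandas refuses to build the result.
try:
    df1.join(df2, on='date', how='outer')
    raised = False
except ValueError:
    raised = True

# merge matches on the 'date' column of both frames.
merged = df1.merge(df2, on='date', how='outer')

# join works with on='date' once 'date' is df2's index, because
# df1['date'] is then compared with df2's index values.
joined = df1.join(df2.set_index('date'), on='date', how='outer')

print(merged)
print(joined)
```

Note that merge keeps 'date' as an ordinary column; call set_index('date') on the result if you want the same layout as in the indexed example from the question.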
Upvotes: 2