Luigi Plinge

Reputation: 51109

Join on column in pandas

So this works as expected:

df1 = pd.DataFrame({'date':[123,456],'price1':[23,34]}).set_index('date')
df2 = pd.DataFrame({'date':[456,789],'price2':[22,32]}).set_index('date')
df1.join(df2, how='outer')

      price1  price2
date                
123     23.0     NaN
456     34.0    22.0
789      NaN    32.0

But if I don't set the index, it causes an error:

df1 = pd.DataFrame({'date':[123,456],'price1':[23,34]})
df2 = pd.DataFrame({'date':[456,789],'price2':[22,32]})
df1.join(df2, on='date', how='outer')

ValueError: columns overlap but no suffix specified: Index(['date'], dtype='object')

Why is this, and am I incorrect for supposing they should give the same result?

Upvotes: 1

Views: 758

Answers (1)

gtomer

Reputation: 6574

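The error occurs because join always matches against the index of the other dataframe: with on='date' it compares df1's date column with df2's default index (0, 1), and df2's own date column then collides with df1's. If you just want to put the two dataframes side by side by index, rather than join on a particular column, you need to add suffixes so that the result does not contain two columns with the same name, e.g.: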

df1.join(df2, how='outer', lsuffix='_left', rsuffix='_right')
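To make clear what the suffixed join does, here is a small sketch using the dataframes from the question. The variable name by_position is only illustrative. Note that the rows are matched on the default index, not on the date values:

```python
import pandas as pd

df1 = pd.DataFrame({'date': [123, 456], 'price1': [23, 34]})
df2 = pd.DataFrame({'date': [456, 789], 'price2': [22, 32]})

# No `on` argument: rows are matched on the default index (0, 1),
# so both 'date' columns are kept side by side under suffixed names.
by_position = df1.join(df2, how='outer', lsuffix='_left', rsuffix='_right')

print(by_position.columns.tolist())
print(by_position)
```

So row 0 pairs date 123 with date 456, which is probably not what you want if the goal is to line prices up by date.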

If you want to join on the column, you should use merge, which matches on the shared column name (here date) by default:

df1.merge(df2, how='outer')
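As a rough check that merge gives the result the question expected, the sketch below compares it with the set_index version from the question. Passing on='date' explicitly is optional here, and the names merged and joined are only for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'date': [123, 456], 'price1': [23, 34]})
df2 = pd.DataFrame({'date': [456, 789], 'price2': [22, 32]})

# Column-to-column join: 'date' is the key in both frames.
merged = df1.merge(df2, on='date', how='outer')

# Index-to-index join, as in the question, with 'date' moved back to a column.
joined = (
    df1.set_index('date')
       .join(df2.set_index('date'), how='outer')
       .reset_index()
)

print(merged)
print(joined)
```

Both frames should contain the three dates 123, 456 and 789, with NaN where a price is missing on one side.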

Upvotes: 2
