aquadzn

Reputation: 33

How to add two dataframes together based on a column using Pandas?

I've got two dataframes that I want to combine into another one.

df1 =

authorId,quest1, quest2, quest3, ...
1, xxx, xxx, xxx, ...
2, xxx, xxx, xxx, ...
3, xxx, xxx, xxx, ...
...

and

df2 =

authorId,answer1, answer2, answer3, ...
1, yyy, yyy, yyy, ...
2, yyy, yyy, yyy, ...
3, yyy, yyy, yyy, ...
...

I would like to have

df3 = 

authorId,quest1, quest2, quest3, answer1, answer2, answer3, ...
1, xxx, xxx, xxx, yyy, yyy, yyy, ...
2, xxx, xxx, xxx, yyy, yyy, yyy, ...
3, xxx, xxx, xxx, yyy, yyy, yyy, ...
...

I already tried merge and join (with inner, left, right, and outer), but neither works as expected.

df3 = df1.merge(df2, on='authorId', how='inner')

When I try to join, I get this error:

You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

Upvotes: 0

Views: 68

Answers (1)

kaihami

Reputation: 815

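The error means the authorId column is stored as strings (object) in one dataframe and as int64 in the other. You can follow @anky_91's suggestion and use concat, or convert the authorId column to int in both dataframes.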

df1['authorId'] = df1['authorId'].astype(int)
df2['authorId'] = df2['authorId'].astype(int)
df3 = df1.merge(df2, on='authorId', how='inner')
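
For the concat route mentioned above, here is a minimal sketch. It assumes authorId is unique within each dataframe, and the sample data is made up for illustration. The key dtype still has to match, because concat aligns rows on the index values.

```python
import pandas as pd

# Hypothetical sample data; the real dataframes have more columns.
df1 = pd.DataFrame({"authorId": ["1", "2", "3"], "quest1": ["a", "b", "c"]})
df2 = pd.DataFrame({"authorId": [1, 2, 3], "answer1": ["x", "y", "z"]})

# Align the key dtype first, otherwise "1" and 1 are treated as different labels.
df1["authorId"] = df1["authorId"].astype(int)

# Concatenate column-wise, keeping only the authorId values present in both.
df3 = pd.concat(
    [df1.set_index("authorId"), df2.set_index("authorId")],
    axis=1,
    join="inner",
).reset_index()
```

With unique keys this should give the same rows as the inner merge; with duplicated authorId values, merge is the safer choice.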

You can check the dtypes of both dataframes to confirm that the authorId column has the same type in each.

df1.dtypes
df2.dtypes
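
The check and the conversion can be combined into one step. This is only a sketch with made-up sample data. It uses pd.to_numeric instead of astype(int) (my substitution, not part of the original answer), and it assumes every authorId value is a valid integer; to_numeric raises an error otherwise.

```python
import pandas as pd

# Hypothetical sample data reproducing the mismatch: strings vs integers.
df1 = pd.DataFrame({"authorId": ["1", "2", "3"], "quest1": ["a", "b", "c"]})
df2 = pd.DataFrame({"authorId": [1, 2, 3], "answer1": ["x", "y", "z"]})

key = "authorId"

# Show the dtype of the key column in each dataframe.
print(df1[key].dtype, df2[key].dtype)

# Convert only when the key dtypes differ.
if df1[key].dtype != df2[key].dtype:
    df1[key] = pd.to_numeric(df1[key])
    df2[key] = pd.to_numeric(df2[key])

df3 = df1.merge(df2, on=key, how="inner")
```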

Upvotes: 1
