Reputation: 11
I'm hoping for some help. I am trying to concatenate three dataframes in pandas that share a MultiIndex. Two of them combine fine, but the third keeps getting appended as extra rows instead of being aligned on the index. As far as I can tell they all have the same MultiIndex (I tested this with df1.index.name == df2.index.name).
This is what I have tried:
df_final = pd.concat([df1, df2], axis = 1)
example:
df1
A B  X
0 1  3
  2  4
df2
A B   Y
0 1  20
  2  30
What I want to get is this:
df_final
A B  X   Y
0 1  3  20
  2  4  30
But what I keep getting is this:
df_final
A B    X    Y
0 1    3  NaN
  2    4  NaN
0 1  NaN   20
  2  NaN   30
Any ideas? I have also tried
df_final = pd.concat([df1, df2], axis = 1, keys = ['A', 'B'])
But then df2 doesn't appear at all.
Thanks!
Upvotes: 1
Views: 3761
Reputation: 1157
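Try
pd.merge(df1, df2)
This merges on the columns the two frames have in common, so it only works as written if A and B are regular columns. If they are index levels, pass left_index=True and right_index=True instead. join() may also be used for your problem: it aligns on the index by default, so no extra key column is needed when both frames are indexed by A and B.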
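To make the index-based options concrete, here is a minimal sketch using the sample data from the question. It assumes A and B are index levels (the question does not show how the frames were built), and the variable names `idx`, `merged`, and `joined` are only for illustration.

```python
import pandas as pd

# Sample frames from the question; A and B are assumed to be index levels.
idx = pd.MultiIndex.from_tuples([(0, 1), (0, 2)], names=["A", "B"])
df1 = pd.DataFrame({"X": [3, 4]}, index=idx)
df2 = pd.DataFrame({"Y": [20, 30]}, index=idx)

# Merge on the index rather than on shared columns.
merged = pd.merge(df1, df2, left_index=True, right_index=True)

# join() aligns on the index by default.
joined = df1.join(df2)

print(merged)
print(joined)
```

Both calls should give one row per (A, B) pair with X and Y side by side, provided the index levels really do match in value and dtype.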
Upvotes: 0
Reputation: 11
Thank you everyone for your help! With your suggestions, I tried merging, but I got a new error:
ValueError: You are trying to merge on int64 and object columns. If you wish to proceed you should use pd.concat
That led me to find that one of the index levels in the dataframe that kept appending had dtype object instead of int64. I've converted it to integers and now the concat works!
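For anyone hitting the same thing, this is roughly how the mismatch can be spotted and fixed. It is only a sketch: the frames below are a hypothetical reproduction in which level B of df2 holds strings, and converting through reset_index() is one way to do it, not necessarily the only one.

```python
import pandas as pd

# Hypothetical reproduction: level B of df2 holds strings instead of integers.
df1 = pd.DataFrame({"A": [0, 0], "B": [1, 2], "X": [3, 4]}).set_index(["A", "B"])
df2 = pd.DataFrame({"A": [0, 0], "B": ["1", "2"], "Y": [20, 30]}).set_index(["A", "B"])

# Comparing names passes even though the levels hold different types.
print(df1.index.names == df2.index.names)

# Comparing the dtype of each level exposes the mismatch.
for name in df1.index.names:
    print(name,
          df1.index.get_level_values(name).dtype,
          df2.index.get_level_values(name).dtype)

# Convert the offending level to integers, then rebuild the index.
df2_fixed = df2.reset_index()
df2_fixed["B"] = df2_fixed["B"].astype("int64")
df2_fixed = df2_fixed.set_index(["A", "B"])

df_final = pd.concat([df1, df2_fixed], axis=1)
print(df_final)
```

The point is that a name check alone says nothing about the values: 1 and "1" are different keys, so pandas treats them as different rows.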
This has taken me days to get through... So thank you again!
Upvotes: 0
Reputation: 1795
First way (and the better one in this case):
use merge:
pd.merge(left=df1, right=df2, on=['A','B'], how='inner')
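A small sketch of what how='inner' does here, compared with how='outer'. The frames are hypothetical: A and B are plain columns and the keys only partly overlap, which is not the case in the question but makes the difference visible.

```python
import pandas as pd

# Hypothetical frames: A and B are ordinary columns, keys overlap only on (0, 2).
df1 = pd.DataFrame({"A": [0, 0], "B": [1, 2], "X": [3, 4]})
df2 = pd.DataFrame({"A": [0, 0], "B": [2, 3], "Y": [30, 40]})

# inner keeps only the (A, B) pairs present in both frames.
inner = pd.merge(left=df1, right=df2, on=["A", "B"], how="inner")

# outer keeps every pair and fills the missing side with NaN.
outer = pd.merge(left=df1, right=df2, on=["A", "B"], how="outer")

print(inner)
print(outer)
```

With the question's data every key appears in both frames, so inner and outer would return the same two rows.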
Second way:
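If you prefer using concat, you can stack the frames and then use groupby to collapse the rows that share the same (A, B), taking the first non-null value in each column: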
df_final = pd.concat([df1, df2])
df_final = df_final.groupby(['A','B']).first()
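Here is the second way run on the sample data from the question, as a sketch. It assumes A and B are index levels (groupby accepts index level names as keys); `idx` and `stacked` are illustrative names.

```python
import pandas as pd

# Sample frames from the question; A and B are assumed to be index levels.
idx = pd.MultiIndex.from_tuples([(0, 1), (0, 2)], names=["A", "B"])
df1 = pd.DataFrame({"X": [3, 4]}, index=idx)
df2 = pd.DataFrame({"Y": [20, 30]}, index=idx)

# Stacking gives four rows, with NaN wherever a frame lacks the column.
stacked = pd.concat([df1, df2])

# first() keeps the first non-null value per column within each (A, B) group.
df_final = stacked.groupby(["A", "B"]).first()

print(df_final)
```

One side effect worth knowing: because the stacked frame contains NaN, X and Y come back as floats rather than integers.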
Upvotes: 2