Reputation: 1416
I have 2 dataframe listed as follow
df
Type Breed Common Color Other Color Behaviour
Golden Big Gold White Fun
Corgi Small Brown White Crazy
Bulldog Medium Black Grey Strong
df2
Type Breed Behaviour Bark Sound
Pug Small Sleepy Ak
German Shepard Big Cool Woof
Puddle Small Aggressive Ek
I wanted to merge 2 dataframe by columns Type
, Breed
and Behavior
.
Therefore, my desire output would be:
Type Breed Behavior
Golden Big Fun
Corgi Small Crazy
Bulldog Medium Strong
Pug Small Sleepy
German Shepard Big Cool
Puddle Small Aggressive
Upvotes: 3
Views: 101
Reputation: 863701
You need concat
:
print (pd.concat([df1[['Type','Breed','Behaviour']],
df2[['Type','Breed','Behaviour']]], ignore_index=True))
Type Breed Behaviour
0 Golden Big Fun
1 Corgi Small Crazy
2 Bulldog Medium Strong
3 Pug Small Sleepy
4 German Shepard Big Cool
5 Puddle Small Aggressive
More general is use intersection
for columns of both DataFrames
:
cols = df1.columns.intersection(df2.columns)
print (cols)
Index(['Type', 'Breed', 'Behaviour'], dtype='object')
print (pd.concat([df1[cols], df2[cols]], ignore_index=True))
Type Breed Behaviour
0 Golden Big Fun
1 Corgi Small Crazy
2 Bulldog Medium Strong
3 Pug Small Sleepy
4 German Shepard Big Cool
5 Puddle Small Aggressive
More general if df1
and df2
have no NaN
values use dropna
for removing columns with NaN
:
print (pd.concat([df1 ,df2], ignore_index=True))
Bark Sound Behaviour Breed Common Color Other Color Type
0 NaN Fun Big Gold White Golden
1 NaN Crazy Small Brown White Corgi
2 NaN Strong Medium Black Grey Bulldog
3 Ak Sleepy Small NaN NaN Pug
4 Woof Cool Big NaN NaN German Shepard
5 Ek Aggressive Small NaN NaN Puddle
print (pd.concat([df1 ,df2], ignore_index=True).dropna(1))
Behaviour Breed Type
0 Fun Big Golden
1 Crazy Small Corgi
2 Strong Medium Bulldog
3 Sleepy Small Pug
4 Cool Big German Shepard
5 Aggressive Small Puddle
Upvotes: 4
Reputation: 294536
using join
dropping columns that don't overlap
df1.T.join(df2.T, lsuffix='_').dropna().T.reset_index(drop=True)
Upvotes: 3