Reputation: 9
i have 2 dataframes df1, df2 and i want to compare these dataframes.
import pandas as pd
df1 = pd.DataFrame({'Column1': ['f','c','b','d','e','g','h'],
'Column2': ['1','2','3','4','5','7','8']})
df2 = pd.DataFrame({'Column1': ['f','b','d','e','a','g'],
'Column2': ['1','3','4','5','6','7']})
To compare dataframes, i use pandas merge. here's my code.
df = pd.merge(df1,df2, how="outer", on="Column1")
And Result :
Column1 Column2_x Column2_y
0 f 1 1
1 c 2 NaN
2 b 3 3
3 d 4 4
4 e 5 5
5 g 7 7
6 h 8 NaN
7 a NaN 6
But i don't want this result... the output what i want is below :
how can i get the output???
What i want :
Column1 Column2_x Column2_y
0 f 1 1
1 c 2 NaN
2 b 3 3
3 d 4 4
4 e 5 5
5 a NaN 6
6 g 7 7
7 h 8 NaN
Upvotes: 1
Views: 54
Reputation: 150815
It looks like you want to preserve the row order in df1
. Try:
(df1.assign(enum=np.arange(len(df1)))
.merge(df2, on='Column1', how='outer')
.sort_values('enum')
.drop(columns=['enum'])
)
Output:
Column1 Column2_x Column2_y
0 f 1 1
1 c 2 NaN
2 b 3 3
3 d 4 4
4 e 5 5
5 g 7 7
6 h 8 NaN
7 a NaN 6
Upvotes: 1