xiiais
xiiais

Reputation: 9

Comparing two dataframes with same columns and different rows

i have 2 dataframes df1, df2 and i want to compare these dataframes.

import pandas as pd

df1 = pd.DataFrame({'Column1': ['f','c','b','d','e','g','h'], 
                    'Column2': ['1','2','3','4','5','7','8']})

df2 = pd.DataFrame({'Column1': ['f','b','d','e','a','g'], 
                    'Column2': ['1','3','4','5','6','7']})

To compare dataframes, i use pandas merge. here's my code.

df = pd.merge(df1,df2, how="outer", on="Column1")

And Result :

  Column1 Column2_x Column2_y
  0       f         1         1
  1       c         2       NaN
  2       b         3         3
  3       d         4         4
  4       e         5         5
  5       g         7         7
  6       h         8       NaN
  7       a       NaN         6

But i don't want this result... the output what i want is below :

how can i get the output???

  What i want :

  Column1 Column2_x Column2_y
  0       f         1         1
  1       c         2       NaN
  2       b         3         3
  3       d         4         4
  4       e         5         5
  5       a       NaN         6
  6       g         7         7
  7       h         8       NaN

Upvotes: 1

Views: 54

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150815

It looks like you want to preserve the row order in df1. Try:

(df1.assign(enum=np.arange(len(df1)))
    .merge(df2, on='Column1', how='outer')
    .sort_values('enum')
    .drop(columns=['enum'])
)

Output:

  Column1 Column2_x Column2_y
0       f         1         1
1       c         2       NaN
2       b         3         3
3       d         4         4
4       e         5         5
5       g         7         7
6       h         8       NaN
7       a       NaN         6

Upvotes: 1

Related Questions