sakthivinoth
sakthivinoth

Reputation: 41

get the matching rows for multiple columns pandas dataframe

Consider my first data frame df1

      col1 col2  col3
0    hello    q     1
1    world    q     2
2  welcome    r     3
3    hello    t     4

And second data frame df2

    col1 col2  col3
0  hello    q     2

Need output like

      col1 col2  col3
0    hello    q     2
1    world    q     2
2  welcome    r     3
3    hello    t     4

'col1' and 'col2' should be equal and if 'col3' differs get the output and replace the value in dataframe first

I tried to use merge

df1.merge(df2, on=['col1', 'col2'])

    col1 col2  col3_x  col3_y
0  hello    q       1       2

But I don't know what to do next.

Upvotes: 0

Views: 93

Answers (1)

piRSquared
piRSquared

Reputation: 294218

Option 1

merge, append, drop_duplicates

df1.drop('col3', 1).merge(df2).append(df1).drop_duplicates(['col1', 'col2'])

      col1 col2  col3
0    hello    q     2
1    world    q     2
2  welcome    r     3
3    hello    t     4

Option 2

set_index and combine_first

cols = ['col1', 'col2']
df2.set_index(cols).combine_first(
    df1.set_index(cols)
).reset_index().astype(df1.dtypes)

      col1 col2  col3
0    hello    q     2
1    hello    t     4
2  welcome    r     3
3    world    q     2

Upvotes: 1

Related Questions