Kendall

Reputation: 58

How do you merge two dataframes and conditionally merge one column

I have two dataframes that are identical except for one column. I would like to merge them and choose the value of that column conditionally. In this case I am looking for the max of the two, but ideally the approach would work with any condition.

import pandas as pd

df1 = pd.DataFrame([['Tom', 30], ['Jane', 40], ['Barry', 22], ['Kelly', 15]])
df2 = pd.DataFrame([['Tom', 10], ['Jane', 50], ['Barry', 22]])

df1:

       0   1
0    Tom  30
1   Jane  40
2  Barry  22
3  Kelly  15

df2:

       0   1
0    Tom  10
1   Jane  50
2  Barry  22

I am looking to end up with a data frame that merges the two and takes the max of column 1.

Example:

       0   1
0    Tom  30
1   Jane  50
2  Barry  22
3  Kelly  15

Upvotes: 0

Views: 52

Answers (2)

wwnde

Reputation: 26676

Another way: concat, sort_values and drop_duplicates. Code below:

pd.concat([df2, df1]).sort_values(by=[0, 1], ascending=[False, True]).drop_duplicates(subset=[0], keep='last')

       0   1
0    Tom  30
3  Kelly  15
1   Jane  50
2  Barry  22
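
Note that the rows above come back in sorted order with the index shuffled (0, 3, 1, 2). If the original row order matters, a minimal sketch like the one below restores it with sort_index. It assumes the default integer index from the question and a pandas version where pd.concat is available in place of DataFrame.append; the names deduped and result are just for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([['Tom', 30], ['Jane', 40], ['Barry', 22], ['Kelly', 15]])
df2 = pd.DataFrame([['Tom', 10], ['Jane', 50], ['Barry', 22]])

# Stack both frames, sort so the largest value per name comes last,
# then keep only that last row for each name.
deduped = (
    pd.concat([df2, df1])
    .sort_values(by=[0, 1], ascending=[False, True])
    .drop_duplicates(subset=[0], keep='last')
)

# Each surviving row still carries its original index label,
# so sorting on the index puts the rows back in df1's order.
result = deduped.sort_index()
print(result)
```

This only works because both frames list the shared names in the same positions; if the indexes do not line up, resetting the index or sorting by name is probably the safer choice.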

Upvotes: 1

sammywemmy

Reputation: 28649

Merge the data, setting how as outer, before grouping to get the max:

df1.merge(df2, how='outer').groupby(0, as_index = False, sort=False).max()

       0   1
0    Tom  30
1   Jane  50
2  Barry  22
3  Kelly  15
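
For the "any conditional" part of the question, one possible approach is to merge on the name only, so both values sit side by side, and then pick between them with a boolean mask. This is only a sketch: the column names name, value, value_1 and value_2 are made up for the example, and the condition shown (take the larger value) can be swapped for any other row-wise test.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([['Tom', 30], ['Jane', 40], ['Barry', 22], ['Kelly', 15]])
df2 = pd.DataFrame([['Tom', 10], ['Jane', 50], ['Barry', 22]])

# Give the columns string names (hypothetical) so the merge suffixes are readable.
left = df1.rename(columns={0: 'name', 1: 'value'})
right = df2.rename(columns={0: 'name', 1: 'value'})

# Outer merge on the name only: each row now has value_1 and value_2,
# with NaN where a name is missing from one of the frames.
merged = left.merge(right, on='name', how='outer', suffixes=('_1', '_2'))

# The condition: keep the left value when the right one is missing
# or when the left value is at least as large. Replace this as needed.
keep_left = merged['value_2'].isna() | (merged['value_1'] >= merged['value_2'])

# np.where picks value_1 where the mask is True, value_2 otherwise.
merged['value'] = np.where(keep_left, merged['value_1'], merged['value_2']).astype('int64')

result = merged[['name', 'value']]
print(result)
```

The row order of an outer merge can depend on the pandas version, so sort the result explicitly if the order matters.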

Upvotes: 1
