user53526356
user53526356

Reputation: 968

How to Conditionally Merge Dataframe Columns

I have two dfs:

df1

  Person   Dept      Date    Company   ID  Value
0   Faye  Sales  12/31/16  FB Co Inc  123     27
1   Faye  Sales   3/31/17    Unknown  123     34
2    Ray    Eng   3/31/18     xyz co  345     59
3    Ray     HR   6/30/18        XyZ  345     54
4    Jay     HR   9/30/18          A  678     53
5    Jim    Ops   9/30/16        New  999      8

and df2

    Company     Symbol   ID
0    FB Inc         FB  123
1  XYZ Corp  No Symbol  345
2     A LLC         AA  678
3  EFG Corp        EFG  555

I want to:

  1. merge on ID
  2. replace Company in df1, with the neatly-formatted one in df2(but also preserve Company from df1 if there's no match in df2)
  3. add the Symbol from df2 to df1

so that the output looks like:

  Person   Dept      Date    Company    Symbol   ID  Value
0   Faye  Sales  12/31/16     FB Inc        FB  123     27
1   Faye  Sales   3/31/17     FB Inc        FB  123     34
2    Ray    Eng   3/31/18   XYZ Corp  No Symbol 345     59
3    Ray     HR   6/30/18   XYZ Corp  No Symbol 345     54
4    Jay     HR   9/30/18      A LLC         AA 678     53
5    Jim    Ops   9/30/16        New        NaN 999      8

Doing df3 = pd.merge(df1, df2, on='ID', how='left').drop('Company_x', axis=1) gets me close, but doesn't preserve Company from df1 when a match isn't found in df2 (Nan vs New):

  Person   Dept      Date   ID  Value Company_y     Symbol
0   Faye  Sales  12/31/16  123     27    FB Inc         FB
1   Faye  Sales   3/31/17  123     34    FB Inc         FB
2    Ray    Eng   3/31/18  345     59  XYZ Corp  No Symbol
3    Ray     HR   6/30/18  345     54  XYZ Corp  No Symbol
4    Jay     HR   9/30/18  678     53     A LLC         AA
5    Jim    Ops   9/30/16  999      8       NaN        NaN

How do I fix this?

Upvotes: 0

Views: 36

Answers (2)

David Erickson
David Erickson

Reputation: 16683

did this work as well with np.where and .notnull() logic? Your input wasn't copy/pasting over well.

df3 = pd.merge(df1, df2, on='ID', how='left')
df3['Company_x'] = np.where(df3['Company_y'].notnull(),
                            df3['Company_y'], df3['Company_x'])
df3 = df3.drop('Company_x', axis=1)

Upvotes: 0

BENY
BENY

Reputation: 323376

Just change you code with fillna

df3 = pd.merge(df1, df2, on='ID', how='left')
df3.Company_y.fillna(df3.Company_x, inplace=True) # here is fillna with two value 
df3=df3.drop('Company_x', 1)

Upvotes: 1

Related Questions