Reputation: 83
I have 2 dataframes which I have to join. I am trying to use merge on columns ID and STATUS. However the problem is if for a row status is NULL in df2 I want it to still match it based on just ID and bring the name. If STATUS has a value match it else match just the ID and bring the name.
mer_col_list = ['ID','STATUS']
df_out = pd.merge(df1,df2, on=mer_col_list, how='left')
df1 |ID | STATUS | NAME | |11 | ACTIVE | John | |22 | DORMANT| NICK | |33 | NOT_ACTIVE| HARRY|
df2 |ID | STATUS | BRANCH | |11 | DORMANT| USA | |11 | | USA | |22 | | UK | |33 | NOT_ACTIVE | AUS|
df_out: |ID| NAME | BRANCH| |11| JOHN | USA | |22| NICK | USA | |33|HARRY | AUS |
Upvotes: 0
Views: 486
Reputation: 862761
You can create another left join by ID
only if STATUS
is missing and then combine both DataFrames by DataFrame.fillna
:
df_out1 = df1.merge(df2, on=['ID','STATUS'], how='left')
df_out2 = df1.merge(df2[df2['STATUS'].isna()].drop('STATUS',axis=1), on=['ID'],how='left')
df_out = df_out1.fillna(df_out2)
print (df_out)
ID STATUS NAME BRANCH
0 11 ACTIVE John USA
1 22 DORMANT NICK UK
2 33 NOT_ACTIVE HARRY AUS
You can also remove missing values first and duplicates if exist duplicated ['ID','STATUS']
for df2
and ID
if remove missing valeus per STATUS
:
df21 = df2.dropna(subset=['STATUS']).drop_duplicates(['ID','STATUS'])
df_out1 = df1.merge(df21, on=['ID','STATUS'], how='left')
df22 = df2[df2['STATUS'].isna()].drop('STATUS',axis=1).drop_duplicates(['ID'])
df_out2 = df1.merge(df22, on='ID',how='left')
df_out = df_out1.fillna(df_out2)
print (df_out)
Dynamic solution - if always ID
is in list mer_col_list
:
mer_col_list = ['ID','STATUS']
df21 = df2.dropna(subset=mer_col_list).drop_duplicates(mer_col_list)
df_out1 = df1.merge(df21, on=mer_col_list, how='left')
no_id_cols = np.setdiff1d(mer_col_list, ['ID'])
print (no_id_cols)
['STATUS']
df22=df2[df2[no_id_cols].isna().any(axis=1)].drop(no_id_cols,axis=1).drop_duplicates(['ID'])
df_out2 = df1.merge(df22, on='ID',how='left')
df_out = df_out1.fillna(df_out2)
print (df_out)
ID STATUS NAME BRANCH
0 11 ACTIVE John USA
1 22 DORMANT NICK UK
2 33 NOT_ACTIVE HARRY AUS
Upvotes: 1