Barry
Barry

Reputation: 256

How to merge DataFrames only if one column meets a certain condition

Say I have a dataframe 'df_main':

ID   Type
1    Car
1    Truck
2    Truck

And another dataframe 'truck_data':

Truck_ID         Data
1          TruckData1
2          TruckData2

Merging these two gives the dataframe:

   ID   Type  Truck_ID        Data
0   1    Car         1  TruckData1
1   1  Truck         1  TruckData1
2   2  Truck         2  TruckData2

How can I modify the merge statement to only merge in the rows from truck_data that match on ID if and only if the Type == 'Truck'?

Desired output:

   ID   Type  Truck_ID        Data
0   1    Car         1  
1   1  Truck         1  TruckData1
2   2  Truck         2  TruckData2

Current code:

df_main = pd.DataFrame(columns=['ID', 'Type'], data=[[1, 'Car'], [1, 'Truck'], [2, 'Truck']])
truck_df = pd.DataFrame(columns=['Truck_ID', 'Data'], data=[[1, 'TruckData1'], [2, 'TruckData2']])
df_main = df_main.merge(truck_df, left_on='ID', right_on='Truck_ID', how='left')

Upvotes: 4

Views: 1742

Answers (2)

Nk03
Nk03

Reputation: 14949

Filter the required rows from df1 / perform merge /concat the unfiltered rows.

m = df_main.Type.eq('Truck')
merged_df = pd.concat([df_main.loc[~m], df_main.loc[m].merge(truck_df, left_on='ID', right_on='Truck_ID')]

OUTPUT:

   ID   Type        Data
0   1    Car         NaN
0   1  Truck  TruckData1
1   2  Truck  TruckData2

Upvotes: 4

Scott Boston
Scott Boston

Reputation: 153560

I'd do it like this:

df_main.merge(truck_df.assign(Type='Truck'), on=['ID', 'Type'], how='left')

Output:

   ID   Type        Data
0   1    Car         NaN
1   1  Truck  TruckData1
2   2  Truck  TruckData2

Details:

assign a temporary column to truck_df named 'Type' and merge on both 'ID' an 'Type' with how='left'

Upvotes: 3

Related Questions