Reputation: 256
Say I have a dataframe 'df_main':
ID Type
1 Car
1 Truck
2 Truck
And another dataframe 'truck_data':
Truck_ID Data
1 TruckData1
2 TruckData2
Merging these two gives the dataframe:
ID Type Truck_ID Data
0 1 Car 1 TruckData1
1 1 Truck 1 TruckData1
2 2 Truck 2 TruckData2
How can I modify the merge statement to only merge in the rows from truck_data that match on ID if and only if the Type == 'Truck'?
Desired output:
ID Type Truck_ID Data
0 1 Car 1
1 1 Truck 1 TruckData1
2 2 Truck 2 TruckData2
Current code:
df_main = pd.DataFrame(columns=['ID', 'Type'], data=[[1, 'Car'], [1, 'Truck'], [2, 'Truck']])
truck_df = pd.DataFrame(columns=['Truck_ID', 'Data'], data=[[1, 'TruckData1'], [2, 'TruckData2']])
df_main = df_main.merge(truck_df, left_on='ID', right_on='Truck_ID', how='left')
Upvotes: 4
Views: 1742
Reputation: 14949
Filter the required rows
from df1 / perform merge
/concat
the unfiltered rows.
m = df_main.Type.eq('Truck')
merged_df = pd.concat([df_main.loc[~m], df_main.loc[m].merge(truck_df, left_on='ID', right_on='Truck_ID')]
OUTPUT:
ID Type Data
0 1 Car NaN
0 1 Truck TruckData1
1 2 Truck TruckData2
Upvotes: 4
Reputation: 153560
I'd do it like this:
df_main.merge(truck_df.assign(Type='Truck'), on=['ID', 'Type'], how='left')
Output:
ID Type Data
0 1 Car NaN
1 1 Truck TruckData1
2 2 Truck TruckData2
Details:
assign
a temporary column to truck_df named 'Type' and merge on both 'ID' an 'Type' with how='left'
Upvotes: 3