Reputation:
I have two dataframes with different shapes, I want to apply an conditional If statement to df1 and input values from df2. Df1 will have duplicate rows but I need them filled, which means replacing -9 values with the values from df2 in a particular column
df1:
Code 1 Name
2 Sam
5 James
7 Mark
6 Steven
-9 Michael
-9 Sarah
-9 Sam
5 James
-9 Mark
6 Steven
7 Michael
-9 Sarah
-9 Chris
df2:
Code 1 Name
20 Sam
30 James
40 Mark
50 Steven
70 Michael
45 Sarah
df1 Intended output:
Code 1 Name
2 Sam
5 James
7 Mark
6 Steven
70 Michael
45 Sarah
20 Sam
5 James
40 Mark
6 Steven
7 Michael
45 Sarah
-9 Chris
Upvotes: 1
Views: 298
Reputation: 107632
With an SQL mentality, consider a set-based approach with merge
, assign
, and np.where
conditional logic (counterparts to SQL's JOIN
, SELECT
, and CASE
):
df1 = (df1.merge(df2, on="Name", how='left', suffixes=['','_'])
.assign(Code1 = lambda x: (np.where((x['Code1'] == -9) & (pd.notna(x['Code1_'])),
x['Code1_'],
x['Code1'])).astype(int))
.drop(columns = ['Code1_'])
)
# Code1 Name
# 0 2 Sam
# 1 5 James
# 2 7 Mark
# 3 6 Steven
# 4 70 Michael
# 5 45 Sarah
# 6 20 Sam
# 7 5 James
# 8 40 Mark
# 9 6 Steven
# 10 7 Michael
# 11 45 Sarah
# 12 -9 Chris
Online Demo (click Run at top)
Upvotes: 0
Reputation: 862661
Use Series.map
for new Series and replace only matched rows by condition, last for non matched rows replace missing values to -9
:
m = df1['Code 1'] == -9
df1.loc[m, 'Code 1'] = df1.loc[m, 'Name'].map(df2.set_index('Name')['Code 1'])
df1['Code 1'] = df1['Code 1'].fillna(-9).astype(int)
print (df1)
Code 1 Name
0 2 Sam
1 5 James
2 7 Mark
3 6 Steven
4 70 Michael
5 45 Sarah
6 20 Sam
7 5 James
8 40 Mark
9 6 Steven
10 7 Michael
11 45 Sarah
12 -9 Chris
Upvotes: 1