Reputation: 307
new to python and can't seem to find the exact answer I am looking for though I believe there is an easier way to fill this info
I have df1 and df2
df1: FirstName LastName PhNo uniqueid
df2: uniqueid PhNo
I want to fill values missing in df1['PhNo'], with matching values in df2 based on matching uniqueid == uniqueid
Codes I used are as follows
dff = pd.merge(df1,df2,on = 'uniqueid', how = 'Left')
dff['PhNo'] = 0
dff['PhNo'][df1['PhNo_x'] >= 1] = df1['PhNo_x']
df1['PhNo'][df2['PhNo_y'] >= 1] = df1['PhNo_y']
this seems to do the work but does not seem like an efficient way of doing this. I am looking for a less number of lines and better technique than merge
df1
FirstName LastName PhNo uniqueid
Sam R 123x 1
John S 345x 2
Paul K np.Nan 3
Laney P no.NaN 4
df2
uniqueid PhNo
1 213x
3 675x
4 987x
desired output: df1
FirstName LastName PhNo uniqueid
Sam R 123x 1
John S 345x 2
Paul K **675x** 3
Laney P **987x** 4
Upvotes: 1
Views: 334
Reputation: 862396
I believe you need Series.map
with Series.fillna
:
df1 = pd.DataFrame({
'FirstName':list('abcdef'),
'LastName':list('aaabbb'),
'PhNo':[7,np.nan,9,4,np.nan,np.nan],
'uniqueid':[5,3,6,9,2,4],
})
print (df1)
FirstName LastName PhNo uniqueid
0 a a 7.0 5
1 b a NaN 3
2 c a 9.0 6
3 d b 4.0 9
4 e b NaN 2
5 f b NaN 4
df2 = pd.DataFrame({
'PhNo':[10,90,30,20],
'uniqueid':[3,6,9,4],
})
print (df2)
PhNo uniqueid
0 10 3
1 90 6
2 30 9
3 20 4
s = df2.set_index('uniqueid')['PhNo']
df1['PhNo'] = df1['PhNo'].fillna(df1['uniqueid'].map(s))
print (df1)
FirstName LastName PhNo uniqueid
0 a a 7.0 5
1 b a 10.0 3
2 c a 9.0 6
3 d b 4.0 9
4 e b NaN 2
5 f b 20.0 4
Upvotes: 4