Shri
Shri

Reputation: 307

filling out missing values in pandas dataframe

new to python and can't seem to find the exact answer I am looking for though I believe there is an easier way to fill this info

I have df1 and df2

df1: FirstName  LastName  PhNo  uniqueid

df2: uniqueid PhNo

I want to fill values missing in df1['PhNo'], with matching values in df2 based on matching uniqueid == uniqueid

Codes I used are as follows

dff = pd.merge(df1,df2,on = 'uniqueid', how = 'Left')
dff['PhNo'] = 0
dff['PhNo'][df1['PhNo_x'] >= 1] = df1['PhNo_x']
df1['PhNo'][df2['PhNo_y'] >= 1] = df1['PhNo_y']

this seems to do the work but does not seem like an efficient way of doing this. I am looking for a less number of lines and better technique than merge

df1

FirstName  LastName  PhNo    uniqueid
Sam        R         123x    1
John       S         345x    2
Paul       K         np.Nan  3
Laney      P         no.NaN  4

df2

uniqueid  PhNo
1         213x
3         675x
4         987x

desired output: df1

FirstName  LastName  PhNo    uniqueid
Sam        R         123x    1
John       S         345x    2
Paul       K         **675x**    3
Laney      P         **987x**    4

Upvotes: 1

Views: 334

Answers (2)

 younus
younus

Reputation: 474

DataFrame.fillna(value= &n)

Upvotes: 0

jezrael
jezrael

Reputation: 862396

I believe you need Series.map with Series.fillna:

df1 = pd.DataFrame({
        'FirstName':list('abcdef'),
        'LastName':list('aaabbb'),
         'PhNo':[7,np.nan,9,4,np.nan,np.nan],
         'uniqueid':[5,3,6,9,2,4],

})

print (df1)
  FirstName LastName  PhNo  uniqueid
0         a        a   7.0         5
1         b        a   NaN         3
2         c        a   9.0         6
3         d        b   4.0         9
4         e        b   NaN         2
5         f        b   NaN         4

df2 = pd.DataFrame({
         'PhNo':[10,90,30,20],
         'uniqueid':[3,6,9,4],

})
print (df2)
   PhNo  uniqueid
0    10         3
1    90         6
2    30         9
3    20         4

s = df2.set_index('uniqueid')['PhNo']
df1['PhNo'] = df1['PhNo'].fillna(df1['uniqueid'].map(s))
print (df1)
  FirstName LastName  PhNo  uniqueid
0         a        a   7.0         5
1         b        a  10.0         3
2         c        a   9.0         6
3         d        b   4.0         9
4         e        b   NaN         2
5         f        b  20.0         4

Upvotes: 4

Related Questions