Reputation: 3379
I have a dataframe where for one column I want to fill null values with the index value. What is the best way of doing this?
Say my dataframe looks like this:
>>> import numpy as np
>>> import pandas as pd
>>> d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
>>> print(d)
   Num    Name
A    1  Andrew
B    2     NaN
C    3   Chris
I can use the following line of code to get what I'm looking for:
d['Name'][d['Name'].isnull()]=d.index
However, I get the following warning: "A value is trying to be set on a copy of a slice from a DataFrame"
I imagine it'd be better to do this either using fillna or loc, but I can't figure out how to do this with either. I have tried the following:
>>> d['Name']=d['Name'].fillna(d.index)
>>> d.loc[d['Name'].isnull()]=d.index
Any suggestions on which is the best option?
Upvotes: 10
Views: 10419
Reputation: 394389
IMO you should use fillna. An Index is not an acceptable data type for the fill value, so you need to pass a Series instead. Index has a to_series method for this:
In [13]:
d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
d['Name']=d['Name'].fillna(d.index.to_series())
d
Out[13]:
   Num    Name
A    1  Andrew
B    2       B
C    3   Chris
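For reference, here is the same approach as a self-contained script, written as a minimal sketch for Python 3. The variable name fill_values is only illustrative. The fill works because fillna aligns the replacement Series with the column by index label, so each missing entry receives the label of its own row.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame(
    index=['A', 'B', 'C'],
    columns=['Num', 'Name'],
    data=[[1, 'Andrew'], [2, np.nan], [3, 'Chris']],
)

# Index.to_series() returns a Series whose values and index are both the labels.
fill_values = d.index.to_series()

# fillna aligns fill_values on the index, so only the missing entry
# at label 'B' is replaced, and it is replaced with the value 'B'.
d['Name'] = d['Name'].fillna(fill_values)

print(d)
```

Existing non-null names are left untouched; only the rows where Name was missing are changed.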
Upvotes: 13
Reputation: 227
I would use .loc in this situation, like this:
d.loc[d['Name'].isnull(), 'Name'] = d.loc[d['Name'].isnull()].index
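A variant of the same idea that computes the null mask only once, as a sketch for Python 3. The name mask is illustrative, and converting it with to_numpy() before indexing the Index is just a cautious choice on my part. Because the row selection and the column label go through a single .loc call, this avoids the chained assignment that triggers the warning in the question.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame(
    index=['A', 'B', 'C'],
    columns=['Num', 'Name'],
    data=[[1, 'Andrew'], [2, np.nan], [3, 'Chris']],
)

# Boolean Series marking the rows where 'Name' is missing.
mask = d['Name'].isnull()

# Select those rows and the 'Name' column in one .loc call, and assign
# the index labels of the same rows.
d.loc[mask, 'Name'] = d.index[mask.to_numpy()]

print(d)
```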
Upvotes: 5