Reputation: 29
I'm trying to clean a data set, fill missing values, etc. I noticed that some values for "sex" are missing. Instead of filling them all with whatever is most frequent, I want to fill them according to a ratio of males to females.
The following does not work, but is as close as I've gotten so far.
CleanedDF[CleanedDF['sex'] == 'NA'][:1000].replace('NA', 'Female', inplace=True)
Unfortunately, when I run len(test[CleanedDF['sex'] == 'NA'])
, I can see that my original dataset is unchanged. I learned that this is because it's creating a copy of the DF and updating that instead.
I've also tried, to no avail: CleanedDF.loc[CleanedDF['sex'] == 'NA', 'sex'][:1035] = 'Female'
Upvotes: 1
Views: 38
Reputation: 71610
Try using np.nan and fillna with limit:
CleanedDF['sex'] = CleanedDF['sex'].replace('NA', np.nan).fillna('Female', limit=1000)
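To see what limit does here, a minimal sketch on a made-up frame (the data and the name df are assumptions for illustration, not the asker's data). Only the first two missing entries get filled, and the remaining one stays NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; 'NA' is a literal string placeholder, as in the question.
df = pd.DataFrame({"sex": ["Male", "NA", "Female", "NA", "NA"]})

# Turn the placeholder into real NaN, then fill at most 2 of the gaps.
# The result is assigned back, so the original column is actually updated.
df["sex"] = df["sex"].replace("NA", np.nan).fillna("Female", limit=2)

n_still_missing = int(df["sex"].isna().sum())
n_female = int((df["sex"] == "Female").sum())
```

Assigning the result back to the column is what avoids the "updating a copy" problem from the question.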
In case you want to change the remaining NaN back to NA, try:
CleanedDF['sex'] = CleanedDF['sex'].replace('NA', np.nan).fillna('Female', limit=1000).fillna('NA')
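If the goal is to split the missing values between both categories by a ratio, one possible sketch is to collect the index labels of the missing rows and assign each slice through .loc. The frame, the name female_share, and the 50/50 ratio are assumptions for illustration, and this relies on the index labels being unique:

```python
import pandas as pd

# Hypothetical toy data; 'NA' is a literal string placeholder.
df = pd.DataFrame({"sex": ["Male", "NA", "Female", "NA", "NA", "NA"]})

female_share = 0.5  # assumed ratio; replace with the ratio you want

# Index labels of the rows that are missing (assumes a unique index).
missing_idx = df.index[df["sex"] == "NA"]
n_female = int(round(len(missing_idx) * female_share))

# A single .loc call with row labels and a column name writes to df itself,
# unlike chained indexing, which may write to a temporary copy.
df.loc[missing_idx[:n_female], "sex"] = "Female"
df.loc[missing_idx[n_female:], "sex"] = "Male"
```

This fills the missing rows in their existing order; shuffling missing_idx first would be one way to avoid all the 'Female' fills landing at the top.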
Upvotes: 1