SimpleMind

Reputation: 29

Replace a selection of a dataframe

Dataset value_counts()

I'm trying to clean a data set, fill missing values, etc. I noticed that for "sex", some values are missing, and instead of just filling them with whatever is most frequent, I want to fill them according to a ratio of males to females.

The following does not work, but is as close as I've gotten so far.
CleanedDF[CleanedDF['sex'] == 'NA'][:1000].replace('NA', 'Female', inplace=True)

Unfortunately, when I run len(test[CleanedDF['sex'] == 'NA']), I can see that my original dataset is unchanged. I learned that this is because it's creating a copy of the DF and updating that.

I've also tried, to no avail: CleanedDF.loc[CleanedDF['sex'] == 'NA', 'sex'][:1035] = 'Female'

Upvotes: 1

Views: 38

Answers (1)

U13-Forward

Reputation: 71610

Try using np.nan and fillna with limit:

CleanedDF['sex'] = CleanedDF['sex'].replace('NA', np.nan).fillna('Female', limit=1000)
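To see what limit does, here is a minimal sketch on a made-up Series (the toy values and the names s and filled are only for illustration, not from the question's data). Only the first limit missing entries get filled; the rest stay NaN:

```python
import numpy as np
import pandas as pd

# Toy data, purely illustrative: three 'NA' placeholders among known values
s = pd.Series(['Male', 'NA', 'Female', 'NA', 'NA'], dtype=object)

# Turn the 'NA' strings into real missing values, then fill only the first two
filled = s.replace('NA', np.nan).fillna('Female', limit=2)

print(filled)
```

The last 'NA' is left as NaN here, which is why a second fillna is needed if you want the placeholder string back.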

In case you want to change the remaining NaN back to NA, try:

CleanedDF['sex'] = CleanedDF['sex'].replace('NA', np.nan).fillna('Female', limit=1000).fillna('NA')
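If the goal is to fill by the male-to-female ratio rather than a fixed count, one possible approach (a different technique from the fillna one above) is to sample the replacements from the proportions observed in the known rows. This is only a sketch: fill_by_observed_ratio, its parameters, and the demo frame are hypothetical names made up for illustration, and it assumes the missing entries are stored as the string 'NA':

```python
import numpy as np
import pandas as pd

def fill_by_observed_ratio(df, column='sex', missing='NA', seed=0):
    """Hypothetical helper: replace placeholder values by random draws
    weighted by the proportions of the known values in the column."""
    out = df.copy()
    mask = out[column] == missing
    # Relative frequency of each known category, e.g. Male 0.75, Female 0.25
    ratios = out.loc[~mask, column].value_counts(normalize=True)
    rng = np.random.default_rng(seed)
    # Assign through a single .loc call so the frame itself is updated, not a copy
    out.loc[mask, column] = rng.choice(
        ratios.index.to_numpy(), size=int(mask.sum()), p=ratios.to_numpy()
    )
    return out

# Toy frame, purely illustrative
demo = pd.DataFrame({'sex': ['Male', 'Female', 'Male', 'NA', 'NA', 'Male']})
result = fill_by_observed_ratio(demo)
print(result['sex'].value_counts())
```

Because the draws are random, the filled values only approximate the ratio on small samples; for an exact split you could compute the counts yourself and use fillna with limit as above.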

Upvotes: 1
