opt135
Reputation: 141

fillna not picking up missing values?

I have a DataFrame, med562. A categorical variable, IND, has the distribution below:

I    6119923
O     764905
      166666
Name: IND, dtype: int64

I want to impute the 166666 missing values with I, the most frequent category (6119923 rows). I wrote this:

med562['IND'] = med562['IND'].fillna(value='I')
Catcounts = med562.IND.value_counts(dropna=False)
Catcounts

The distribution did not change. This is running on Python 3.7.3, so it should not be a software issue. Any thoughts? Thanks.

Upvotes: 1

Views: 42

Answers (1)

BENY
Reputation: 323226

Those values are not NaN; they are empty (or whitespace) strings, so fillna leaves them untouched. If they were NaN, they would not appear in the default value_counts() output at all, because dropna defaults to True. Replace the empty string instead:

med562['IND'] = med562['IND'].replace({'': 'I'})
Catcounts = med562.IND.value_counts(dropna=False)
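Note that replace({'': 'I'}) only matches an exactly empty string. If the blanks might contain spaces, a regex replace is a safer bet. Here is a minimal sketch on a small made-up frame (the df below is a hypothetical stand-in for med562, not the real data) showing that fillna skips blank strings and how a whitespace pattern catches them:

```python
import pandas as pd

# Hypothetical stand-in for med562: "" and " " are strings, None is a real missing value.
df = pd.DataFrame({"IND": ["I", "O", "", " ", "I", None]})

# How many values are truly missing vs. blank strings?
n_missing = df["IND"].isna().sum()
n_blank = df["IND"].str.strip().eq("").sum()

# fillna only fills the real missing value; "" and " " survive.
filled = df["IND"].fillna("I")

# Fill real NaN first, then replace empty or whitespace-only strings via regex.
cleaned = df["IND"].fillna("I").replace(r"^\s*$", "I", regex=True)

counts = cleaned.value_counts(dropna=False)
print(counts)
```

On the real data, running repr on med562['IND'].unique() should show whether the blank is '', ' ', or something else, which tells you whether the plain dict replace is enough.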

Upvotes: 1
