Anubhav Sinha

Reputation: 160

What happens when I change dropna between True and False in value_counts?

If I write this code:

train['id_03'].value_counts(dropna=False, normalize =True).head()

I get:

NaN    0.887689233582822
0.0    0.108211128797372
1.0    0.001461374335354
3.0    0.001131168083449
2.0    0.000712906831036
Name: id_03, dtype: float64

If I change it to dropna=True, I get:

0.0    0.963497
1.0    0.013012
3.0    0.010072
2.0    0.006348
5.0    0.001643
Name: id_03, dtype: float64

Upvotes: 0

Views: 99

Answers (2)

B Man

Reputation: 99

You are normalising the result, so each value is a fraction of the total count. With dropna=False, NaN is counted too, and it makes up about 89% of the rows. That leaves only about 11% to be shared among the other values, which is why they come out so small.

If you look at the ratio between the entries for 1.0 and 2.0, you'll see that it is the same in both results.
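As a quick check of that ratio claim, here is a minimal sketch in plain Python that uses only the numbers printed in the question. The variable names are made up for illustration.

```python
# Relative frequencies copied from the question's two outputs.
with_nan = {1.0: 0.001461374335354, 2.0: 0.000712906831036}   # dropna=False
without_nan = {1.0: 0.013012, 2.0: 0.006348}                  # dropna=True

# Ratio of the entry for 1.0 to the entry for 2.0 in each output.
ratio_with_nan = with_nan[1.0] / with_nan[2.0]
ratio_without_nan = without_nan[1.0] / without_nan[2.0]

# Both ratios come out at roughly 2.05; the small difference is
# only due to the second output being rounded to six decimals.
print(ratio_with_nan, ratio_without_nan)
```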

Upvotes: 1

snowneji

Reputation: 1136

I think the key is that you specified normalize=True. According to the documentation: "If True then the object returned will contain the relative frequencies of the unique values."

With dropna=False, the NaN count is included in the total used to calculate the relative frequencies. With dropna=True, the NaNs are excluded, so the denominator becomes smaller and every remaining frequency becomes larger.
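The change in denominator can be seen on a small example. This is only a sketch on toy data (not the asker's train DataFrame), assuming pandas and NumPy are installed:

```python
import numpy as np
import pandas as pd

# Toy data: 3 zeros, 1 one, and 4 missing values (8 rows in total).
s = pd.Series([0.0, 0.0, 0.0, 1.0, np.nan, np.nan, np.nan, np.nan], name="id_03")

# Denominator is all 8 rows, so 0.0 gets 3/8 = 0.375.
with_nan = s.value_counts(dropna=False, normalize=True)

# Denominator is the 4 non-NaN rows, so 0.0 gets 3/4 = 0.75.
without_nan = s.value_counts(dropna=True, normalize=True)

# Dividing a dropna=False frequency by the non-NaN share
# reproduces the corresponding dropna=True frequency.
non_nan_share = s.notna().mean()                  # 4/8 = 0.5
rescaled = with_nan.loc[0.0] / non_nan_share      # 0.375 / 0.5 = 0.75

print(with_nan)
print(without_nan)
print(rescaled)
```

The same rescaling applies to the numbers in the question: dividing 0.108211 by (1 - 0.887689) gives approximately 0.963497.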

Upvotes: 2
