Reputation: 160
If I write this code:
train['id_03'].value_counts(dropna=False, normalize=True).head()
I get:
NaN 0.887689233582822
0.0 0.108211128797372
1.0 0.001461374335354
3.0 0.001131168083449
2.0 0.000712906831036
Name: id_03, dtype: float64
If I change to dropna=True,
I get:
0.0 0.963497
1.0 0.013012
3.0 0.010072
2.0 0.006348
5.0 0.001643
Name: id_03, dtype: float64
Upvotes: 0
Views: 99
Reputation: 99
You are normalising the result. The count for NaN is very large relative to the others, so when it is included in the denominator the other values end up with very small proportions.
If you look at the ratio between the proportions for any two non-NaN values (say 1.0 and 2.0), you'll see it is the same in both results.
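You can verify the ratio claim on a small stand-in Series (hypothetical data, since `train['id_03']` isn't available here): the individual proportions change between the two calls, but the ratio between any two non-NaN values does not.

```python
import numpy as np
import pandas as pd

# Hypothetical small series standing in for train['id_03']:
# 8 NaNs, 6 zeros, 2 ones, 1 two
s = pd.Series([np.nan] * 8 + [0.0] * 6 + [1.0] * 2 + [2.0] * 1)

with_nan = s.value_counts(dropna=False, normalize=True)
without_nan = s.value_counts(dropna=True, normalize=True)

# The proportions themselves differ (2/17 vs 2/9 for the value 1.0),
# but the ratio between the 1.0 and 2.0 rows is 2.0 in both cases.
ratio_with = with_nan[1.0] / with_nan[2.0]
ratio_without = without_nan[1.0] / without_nan[2.0]
print(ratio_with, ratio_without)
```

Only the common denominator changes between the two calls, so all pairwise ratios among the non-NaN values are preserved.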
Upvotes: 1
Reputation: 1136
I think the key is that you specified normalize=True.
According to the documentation: "If True then the object returned will contain the relative frequencies of the unique values."
Before you removed the NaNs, their count was included in the total used to calculate the relative frequencies; after you removed them, the denominator of the relative frequencies changed, hence the values changed.
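The denominator change can be reproduced by hand on a small hypothetical Series (stand-in for `train['id_03']`): dividing the non-NaN counts by the full length matches the dropna=False proportions, while dividing by the non-NaN count matches the dropna=True proportions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train['id_03']: 3 NaNs, 2 zeros, 1 one
s = pd.Series([np.nan, np.nan, np.nan, 0.0, 0.0, 1.0])

# dropna=False: NaN gets its own row, denominator is the full length (6)
print(s.value_counts(dropna=False, normalize=True))

# dropna=True (the default): NaNs are excluded from the denominator too (3)
print(s.value_counts(normalize=True))

# Equivalent manual computation of both denominators
counts = s.value_counts(dropna=True)
manual_with_nan = counts / len(s)            # divide by 6
manual_without_nan = counts / s.notna().sum()  # divide by 3
print(manual_with_nan)
print(manual_without_nan)
```

Same counts in the numerator both times; only the total being divided by changes, which is exactly why every proportion shifted when dropna was toggled.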
Upvotes: 2