Reputation: 1664
The pandas count aggregate ignores NaNs, but I need a count that includes them. NumPy provides NaN-aware variants for some aggregates but not all of them. Do I have to write a custom aggregate, or is there a built-in way of doing this that I can't find?
This is for a groupby: I want the normal NaN-skipping behavior for mean, but a count that includes NaNs. In code:
In [1]: import numpy
In [2]: import pandas as pd
In [3]: df = pd.DataFrame([[0,float('nan')],[0,float('nan')],[0,float('nan')]])
In [4]: df.groupby(0).agg(['count', 'mean'])
Out[4]:
      1
  count mean
0
0     0  NaN
I want the output to be 3 NaN instead of 0 NaN.
Upvotes: 6
Views: 5158
Reputation: 687
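Just use len(), which counts every row in a group, NaN included. Passing the string 'size' to agg gives the same result through the built-in groupby size, so no custom lambda is needed:
df.groupby(0).agg(['count', 'mean', 'size'])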
output:
      1
  count mean size
0
0     0  NaN    3
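For comparison, here is a minimal self-contained sketch of the same idea on a single selected column. The column names `key` and `val` and the extra second group are made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 0 holds only NaN, group 1 holds one real value.
df = pd.DataFrame({
    "key": [0, 0, 0, 1],
    "val": [np.nan, np.nan, np.nan, 2.0],
})

# 'count' skips NaN, 'size' counts every row, 'mean' keeps its usual NaN handling.
result = df.groupby("key")["val"].agg(["count", "size", "mean"])
print(result)
```

Here `size` should report 3 for group 0 while `count` reports 0 and `mean` stays NaN.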
Upvotes: 4
Reputation: 1645
If your only problem is the count, you can replace the NaN values with a marker before counting:
In [17]: df = pd.DataFrame([0, float('nan'), 3])
         print(df.count())
0    2
dtype: int64
In [18]: marker = -1
         df = df.fillna(marker)
         print(df.count())
0    3
dtype: int64
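One thing to watch with this approach: the marker also enters every other aggregate computed on the filled frame. A small sketch of the effect on a grouped mean, using made-up column names `key` and `val`:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one group with a single missing value.
df = pd.DataFrame({"key": [0, 0, 0], "val": [1.0, np.nan, 3.0]})

filled = df.fillna(-1)  # -1 is the marker, as in the answer above

# Mean on the original data skips the NaN: (1 + 3) / 2
mean_original = df.groupby("key")["val"].mean()

# Mean on the filled data includes the marker: (1 - 1 + 3) / 3
mean_filled = filled.groupby("key")["val"].mean()

# Count on the filled data now includes the former NaN row.
count_filled = filled.groupby("key")["val"].count()

print(mean_original, mean_filled, count_filled, sep="\n")
```

So if you also need the usual NaN-skipping mean, take the count from the filled frame and the mean from the original one.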
Upvotes: 1