Snark

Reputation: 1664

Pandas aggregate for counting with nans

The pandas count aggregate ignores NaNs, but I need a count that includes them. NumPy has NaN-aware versions of some aggregates but not all of them. Do I have to write a custom aggregate, or is there a built-in way to do this that I can't find?

This is for groupbys: I want the normal NaN handling for mean, but a count that includes the NaN rows. In code:

In [1]: import numpy

In [2]: import pandas as pd

In [3]: df = pd.DataFrame([[0,float('nan')],[0,float('nan')],[0,float('nan')]])

In [4]: df.groupby(0).agg(['count', 'mean'])
Out[4]:
      1
  count mean
0
0     0  NaN

I want the output to be 3 NaN instead of 0 NaN.

Upvotes: 6

Views: 5158

Answers (2)

cncggvg

Reputation: 687

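Just use len(), which counts every row in the group, NaN included:

size = lambda x: len(x)
# Note: the string 'size' below is looked up by name as pandas' built-in group size,
# which gives the same row count; pass the callable itself to use the lambda.
df.groupby(0).agg(['count', 'mean', 'size'])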

output:

      1          
  count mean size
0                
0     0  NaN    3
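
For reference, here is a minimal self-contained sketch of the same idea using only the built-in aggregation names. It assumes a reasonably recent pandas in which agg accepts the string 'size'; the column labels "key" and "val" are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the data mirrors the question (one group, all NaN)
df = pd.DataFrame({"key": [0, 0, 0], "val": [np.nan, np.nan, np.nan]})

# 'count' skips NaN, while 'size' counts every row in the group
result = df.groupby("key")["val"].agg(["count", "mean", "size"])
print(result)
```

Here count stays at 0 and mean stays NaN, while size reports the 3 rows in the group.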

Upvotes: 4

dooms

Reputation: 1645

If your only problem is the count, you can replace the NaN values with a marker before counting, like this:

In [17] : df = pd.DataFrame([0, float('nan'), 3])
          print(df.count())

Out [17]: 0    2
          dtype: int64


In [18] : marker = -1
          df = df.fillna(marker)
          print(df.count())

Out [18]: 0    3
          dtype: int64
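
Keep in mind that filling with a marker also changes any mean computed afterwards. If the mean matters too, one possible alternative (a sketch, not part of the approach above) is to leave the data alone and compute the row count separately. The column labels "key" and "val" and the n_missing column are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one group with a single missing value
df = pd.DataFrame({"key": [0, 0, 0], "val": [1.0, np.nan, 3.0]})

grouped = df.groupby("key")["val"]
summary = pd.DataFrame({
    "count": grouped.count(),  # non-NaN values only
    "size": grouped.size(),    # every row, NaN included
    "mean": grouped.mean(),    # NaN skipped, original data untouched
})
# Number of NaN rows per group, derived from the two counts
summary["n_missing"] = summary["size"] - summary["count"]
print(summary)
```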

Upvotes: 1
