mkmostafa

Reputation: 3171

pandas median of count of a groupby

g | val
1 | a
1 | ''
2 | b
2 | ''
2 | c
3 | ''

I have a df.groupby('g') and I want the median of the per-group counts of non-empty vals. How can I do that in pandas?

Upvotes: 0

Views: 689

Answers (4)

piRSquared

Reputation: 294508

Empty strings evaluate to False in a boolean context, and False counts as 0 when summed. We can use this to count the non-empty values per group and take the median:

df.val.astype(bool).groupby(df.g).sum().median()

1.0
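To make the intermediate step visible, here is a minimal sketch that rebuilds the sample frame from the question and prints the per-group counts before taking the median. The way df is constructed is an assumption (empties stored as '' rather than NaN). Note that NaN is truthy, so this trick only works while the empties really are empty strings.

```python
import pandas as pd

# Sample frame from the question; empties are assumed to be '' (not NaN).
df = pd.DataFrame({
    'g': [1, 1, 2, 2, 2, 3],
    'val': ['a', '', 'b', '', 'c', ''],
})

# True for non-empty strings, False for ''.
non_empty = df['val'].astype(bool)

# Summing booleans per group counts the non-empty values.
counts = non_empty.groupby(df['g']).sum()
print(counts.tolist())   # [1, 2, 0]
print(counts.median())   # 1.0
```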

Upvotes: 0

BENY

Reputation: 323356

Is this what you need? (count does not count NaN, which is why we replace '' with np.nan.)

df.val=df.val.replace('',np.nan)
df
Out[243]: 
   g  val
0  1    a
1  1  NaN
2  2    b
3  2  NaN
4  2    c
5  3  NaN

df.groupby('g').val.count().median()
Out[245]: 1.0
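For reference, a self-contained sketch of the same replace-then-count idea, assuming the sample data from the question. The name val_clean is introduced here only for illustration, so the original column is left untouched.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; empties are assumed to be ''.
df = pd.DataFrame({
    'g': [1, 1, 2, 2, 2, 3],
    'val': ['a', '', 'b', '', 'c', ''],
})

# Turn '' into NaN so that count() skips those entries.
val_clean = df['val'].replace('', np.nan)

# count() ignores NaN, so each group reports its non-empty values only.
counts = val_clean.groupby(df['g']).count()
print(counts.tolist())   # [1, 2, 0]
print(counts.median())   # 1.0
```

Because no rows are removed, group 3 stays in the result with a count of 0.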

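Or filter out the unwanted values before the groupby (on the original frame, before the replace above). Note the ~ to negate isin, and that a group whose values are all filtered out, such as g == 3, no longer appears in the counts:

df[~df.val.isin(['', 'somethingelse'])].groupby('g').val.count().median()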

Upvotes: 6

giograno

Reputation: 1809

You could filter out the rows with an empty val, then group, count, and take the median.

df[df['val']!=''].groupby('g').val.count().median()
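One thing to watch with the filter-first approach: a group that contains only empty values is dropped before the groupby, so it never contributes a 0 to the median. The sketch below, assuming the sample data from the question, shows the effect and one possible way to restore the missing groups with reindex. The names filtered_counts and full_counts are illustrative.

```python
import pandas as pd

# Sample frame from the question; empties are assumed to be ''.
df = pd.DataFrame({
    'g': [1, 1, 2, 2, 2, 3],
    'val': ['a', '', 'b', '', 'c', ''],
})

# Filtering first removes every row of group 3.
filtered_counts = df[df['val'] != ''].groupby('g')['val'].count()
print(filtered_counts.tolist())   # [1, 2]
print(filtered_counts.median())   # 1.5

# Re-add the groups that vanished, with a count of 0.
full_counts = filtered_counts.reindex(df['g'].unique(), fill_value=0)
print(full_counts.tolist())       # [1, 2, 0]
print(full_counts.median())       # 1.0
```

Whether 1.5 or 1.0 is the right answer depends on whether all-empty groups should count as 0 or be ignored.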

Upvotes: 1

YOLO

Reputation: 21749

Another way is to use apply:

# inside apply, filter out the empty values, then count what is left
df.groupby('g')['val'].apply(lambda x: x[x != ''].count()).median()
Out[2]: 1.0

Upvotes: 1
