ElaR

Reputation: 13

Pandas take mean only if a certain number of numerical values is present

I have a dataset with triplicate values, like this:

ID  x   y   z
A   1   NA  NA
A   1   1   0.6
A   1   NA  0.4
B   NA  NA  NA
B   NA  0.5 1
B   NA  0.5 NA
...

I would like to take means of the triplicates for A and B, but only if there are two or more numerical values for each column and group. So the result should look like this:

ID  x   y   z
A   1   NA  0.5
B   NA  0.5 NA

Using mean with groupby averages whatever values are available in each column, no matter how few there are. How can I add a condition so that the mean is calculated only if a certain number of numerical values is present?

Upvotes: 1

Views: 1190

Answers (2)

Aminul

Reputation: 1647

Here is another solution that might help:

[two images; contents not recoverable]

Upvotes: 0

BENY

Reputation: 323356

We can use sum with min_count, then divide by count. PS: interestingly, mean does not have a min_count parameter.

s = df.groupby('ID').sum(min_count=2) / df.groupby('ID').count()
s
Out[178]: 
      x    y    z
ID               
A   1.0  NaN  0.5
B   NaN  0.5  NaN
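
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (assuming the NA entries are ordinary NaN values) and applies the sum/count approach above. The names `min_values`, `means_sum_count` and `means_masked` are made up for illustration, and the second variant (masking the plain mean with `where`) is an alternative I added, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NA is assumed to mean NaN.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B", "B"],
    "x": [1, 1, 1, np.nan, np.nan, np.nan],
    "y": [np.nan, 1, np.nan, np.nan, 0.5, 0.5],
    "z": [np.nan, 0.6, 0.4, np.nan, 1, np.nan],
})

min_values = 2  # hypothetical name: minimum non-NaN values required per group and column
grouped = df.groupby("ID")

# Approach from the answer: sum() yields NaN when a group has fewer than
# min_values non-NaN entries, and NaN divided by the count stays NaN.
means_sum_count = grouped.sum(min_count=min_values) / grouped.count()

# Alternative (my assumption, not from the answer): take the plain mean and
# keep it only where enough non-NaN values were present.
means_masked = grouped.mean().where(grouped.count() >= min_values)

print(means_sum_count)
print(means_masked)
```

Both variants should give the same frame on this data. The second one may read more directly as "mean, but only where the count is high enough", at the cost of computing the mean for cells that end up masked.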

Upvotes: 5
