Reputation: 13
I have a dataset with triplicate values, like this:
ID x y z
A 1 NA NA
A 1 1 0.6
A 1 NA 0.4
B NA NA NA
B NA 0.5 1
B NA 0.5 NA
...
I would like to take means of the triplicates for A and B, but only if there are two or more numerical values for each column and group. So the result should look like this:
ID x y z
A 1 NA 0.5
B NA 0.5 NA
Using mean with groupby averages whatever values are present in each column, even when there is only one. How can I add a condition so that the mean is calculated only if a certain number of numerical values is present?
Upvotes: 1
Views: 1190
Reputation: 323356
We can use sum with min_count, then divide by count. PS: interestingly, mean does not have a min_count parameter.
# sum is NaN for groups with fewer than 2 non-NA values, so the quotient is NaN there too
s = df.groupby('ID').sum(min_count=2) / df.groupby('ID').count()
s
Out[178]:
x y z
ID
A 1.0 NaN 0.5
B NaN 0.5 NaN
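For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's sample data and applies the same rule in a slightly different way: take the ordinary group mean, then mask it wherever the group has fewer than two non-NA values. The variable names (`grouped`, `result`, `min_values`) are my own, and the frame is typed in by hand from the question, so treat it as an illustration rather than the only way to do it.

```python
import numpy as np
import pandas as pd

# Sample data copied by hand from the question
df = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B", "B"],
    "x": [1, 1, 1, np.nan, np.nan, np.nan],
    "y": [np.nan, 1, np.nan, np.nan, 0.5, 0.5],
    "z": [np.nan, 0.6, 0.4, np.nan, 1, np.nan],
})

min_values = 2  # assumed threshold: at least two numeric values per group and column

grouped = df.groupby("ID")

# Keep the group mean only where the non-NA count reaches the threshold;
# everything else becomes NaN
result = grouped.mean().where(grouped.count() >= min_values)

# The sum / count approach from above, for comparison
s = grouped.sum(min_count=min_values) / grouped.count()

print(result)
```

Both `result` and `s` should agree with the table above. The `where` version may read more directly as "mean, but only if there are enough values", while the `sum(min_count=...)` version avoids computing the mean separately.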
Upvotes: 5