Reputation: 897
I have a dataframe:
x = pd.DataFrame({'1':[1,2,3,2,5,6,7,8,9], '2':[2,5,6,8,10,np.nan,6,np.nan,np.nan],
'3':[10,10,10,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan]})
I am trying to generate an average of a row but only on values greater than 5. For instance - if a row had values of 3, 6, 10. The average would be 8 ((6+10)/2). The 3 would be ignored as it is below 5.
The equivalent in excel would be =AVERAGEIF(B2:DX2,">=5")
Upvotes: 1
Views: 174
Reputation: 9277
You can create a small custom function which, inside each row, filters out values smaller or equal than a certain value
and apply
it to each row of your dataframe
def average_if(s, value=5):
s = s.loc[s > value]
return s.mean()
x.apply(average_if, axis=1)
0 10.0
1 10.0
2 8.0
3 8.0
4 10.0
5 6.0
6 6.5
7 8.0
8 9.0
dtype: float64
Upvotes: 2
Reputation: 75130
You can mask the values greater than 5 then take mean:
x.where(x>5).mean(1)
Or:
x.mask(x<=5).mean(1)
Upvotes: 3