Reputation: 239
I am trying to understand how np.mean is working in the following example:
import numpy as np
import pandas as pd
n = np.array([1,7,5,4])
m = pd.Series(data = [1,2,3,4],index = [1,2,3,4])
print(np.mean(n!=m))
## returns 0.5
print(np.mean(n[n!=m]))
## returns 4.0
Could someone explain how the first output works out to 0.5, and how np.mean treats the boolean Series n != m? I understand what the second part is doing.
Upvotes: 1
Views: 4134
Reputation: 1342
n != m returns a boolean Series. np.mean treats True and False as 1 and 0 respectively, so in this case (0+1+1+0)/4 yields 0.5.
>>> n != m
1 False
2 True
3 True
4 False
dtype: bool
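To make the arithmetic explicit, here is a small sketch (the names `diff`, `as_ints` and `manual_mean` are mine, not from the question) that casts the boolean result to integers and averages it by hand:

```python
import numpy as np
import pandas as pd

n = np.array([1, 7, 5, 4])
m = pd.Series(data=[1, 2, 3, 4], index=[1, 2, 3, 4])

# Element-wise comparison; the result is a boolean Series carrying m's index.
diff = n != m

# Casting to int exposes the 0/1 values that np.mean averages.
as_ints = diff.astype(int).tolist()        # [0, 1, 1, 0]
manual_mean = sum(as_ints) / len(as_ints)  # (0 + 1 + 1 + 0) / 4

print(as_ints)
print(manual_mean)    # 0.5
print(np.mean(diff))  # 0.5, the fraction of positions where n and m differ
```

Read this way, the mean of a boolean comparison is simply the proportion of True values.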
>>> n[n != m]
array([1, 7, 7, 1])
This indexes n with the comparison result. In the output shown, False and True act as positions 0 and 1, so it returns [1, 7, 7, 1], and (1+7+7+1)/4 yields 4.0.
>>> condition = [n !=m]
>>> condition
[1 False
2 True
3 True
4 False
dtype: bool]
>>> n[condition]
array([1, 7, 7, 1])
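The output above is what you get when False and True are used as integer positions 0 and 1. As a hedged sketch (the names `positions`, `picked`, `mask` and `kept` are mine, and how a bare boolean Series is interpreted as an index may depend on your NumPy and pandas versions), the two possible readings can be spelled out explicitly:

```python
import numpy as np
import pandas as pd

n = np.array([1, 7, 5, 4])
m = pd.Series(data=[1, 2, 3, 4], index=[1, 2, 3, 4])

# Reading 1: False/True used as integer positions 0/1 (integer array indexing).
positions = (n != m).to_numpy().astype(int)  # array([0, 1, 1, 0])
picked = n[positions]                        # array([1, 7, 7, 1])
print(picked, picked.mean())                 # mean is 16 / 4 = 4.0

# Reading 2: an explicit boolean mask keeps only the positions that are True.
mask = (n != m).to_numpy()                   # array([False, True, True, False])
kept = n[mask]                               # array([7, 5])
print(kept, kept.mean())                     # mean is 12 / 2 = 6.0
```

Converting with `.to_numpy()` first removes any ambiguity about which of the two behaviours you get.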
Upvotes: 4