Reputation: 305
I have a triangular similarity matrix like this.
[[3, 1, 2, 0],
[1, 3, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 0]]
How do I calculate a weighted average for each row while discarding the zero elements?
Upvotes: 0
Views: 141
Reputation: 6748
You can use NumPy's masked arrays to calculate the weighted average.
import numpy as np
a = np.array([
[3, 1, 2, 0],
[1, 3, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 0]
])
weights = np.array([1, 2, 3, 4])
# mask every element that is equal to 0
ma = np.ma.masked_equal(a, 0)
# weighted average along each row, ignoring the masked elements
ans = np.ma.average(ma, weights=weights, axis=1)
# rows with no non-zero elements come out masked; fill them with 0
ans.filled(0)
Output:
array([1.83333333, 2.33333333, 1.        , 0.        ])
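For reference, the quantity computed per row can be written out as below. The symbols a_ij (matrix entries) and w_j (the weights) are notation introduced here for illustration, not taken from the answer; the worked part uses the first row and the weights [1, 2, 3, 4] from the snippet above.

```latex
\bar{a}_i = \frac{\sum_{j \,:\, a_{ij} \neq 0} w_j \, a_{ij}}{\sum_{j \,:\, a_{ij} \neq 0} w_j},
\qquad
\bar{a}_1 = \frac{1 \cdot 3 + 2 \cdot 1 + 3 \cdot 2}{1 + 2 + 3} = \frac{11}{6} \approx 1.8333
```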
Just Python:
ar = [[3, 1, 2, 0],
      [1, 3, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 0]]
weight = [1, 2, 3, 4]
ans = []
for li in ar:
    wa = 0  # weighted sum of the non-zero elements
    we = 0  # sum of the weights actually used
    for index, ele in enumerate(li):
        if ele != 0:
            wa += weight[index] * ele
            we += weight[index]
    if we != 0:
        ans.append(wa / we)
    else:
        ans.append(0)
ans
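If this is needed in more than one place, the loop can be wrapped in a function. This is only a sketch: the name weighted_row_averages and the default parameter are hypothetical, not part of the answer above.

```python
def weighted_row_averages(rows, weights, default=0):
    """Weighted average of each row, ignoring elements equal to zero.

    Hypothetical helper: the name and the `default` parameter are illustrative.
    """
    result = []
    for row in rows:
        # Keep only the (weight, element) pairs whose element is non-zero
        pairs = [(w, x) for w, x in zip(weights, row) if x != 0]
        total_weight = sum(w for w, _ in pairs)
        if total_weight == 0:
            # No usable elements in this row: fall back to the default value
            result.append(default)
        else:
            result.append(sum(w * x for w, x in pairs) / total_weight)
    return result


ar = [[3, 1, 2, 0],
      [1, 3, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 0]]
print(weighted_row_averages(ar, [1, 2, 3, 4]))
```

For the example matrix this should give roughly 1.83, 2.33, 1.0 and 0, matching the NumPy result.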
Upvotes: 0
Reputation: 88236
You could sum along the second axis and divide by the number of non-zero values per row. The where argument of np.divide restricts the division to the positions where a condition holds, so passing a mask of the rows that contain non-zero values avoids a division by zero. Pass out as well, because positions skipped by where are otherwise left uninitialized:
import numpy as np
a = np.array([[3, 1, 2, 0],
              [1, 3, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])
m = (a != 0).sum(1)  # number of non-zero values per row
# out= supplies the result (0) for rows where m == 0
np.divide(a.sum(1), m, out=np.zeros(len(m)), where=m != 0)
# array([2., 2., 1., 0.])
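The snippet above gives the plain (unweighted) mean of the non-zero values. A weighted variant of the same idea might look like the sketch below; the weights [1, 2, 3, 4] are an assumption borrowed from the other answer's example, since the question does not state them.

```python
import numpy as np

a = np.array([[3, 1, 2, 0],
              [1, 3, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])
weights = np.array([1, 2, 3, 4])  # assumed example weights, one per column

nonzero = a != 0
# Weighted sum per row; zero elements contribute nothing to the numerator
num = (a * weights).sum(axis=1)
# Sum of the weights that belong to non-zero elements in each row
den = (nonzero * weights).sum(axis=1)

# out= supplies the value (0) for rows where den == 0, which `where` skips
result = np.divide(num, den, out=np.zeros(len(den)), where=den != 0)
print(result)
```

Here the first row works out to 11/6 and the second to 7/3, while the all-zero row stays at 0.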
Upvotes: 2
Reputation: 657
Loop over each row, then loop over each element. When looping over the elements, don't include zeros. If you find only elements which are zero, just add zero (or whatever you want the default value to be) to your list.
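matrix = [[3, 1, 2, 0],
          [1, 3, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 0]]
weighted_averages = []
for row in matrix:
    total_weight = 0       # sum of the non-zero elements in this row
    number_of_weights = 0  # count of the non-zero elements in this row
    for element in row:
        if element != 0:
            total_weight += element
            number_of_weights += 1
    if number_of_weights == 0:
        weighted_averages.append(0)
    else:
        weighted_averages.append(total_weight / number_of_weights)
weighted_averages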
For your matrix, this returns:
[2.0, 2.0, 1.0, 0]
Upvotes: 0