Reputation: 1571
I want to compute the mean of each row of a NumPy matrix. For the input:
array([[ 1, 1, -1],
[ 2, 0, 0],
[ 3, 1, 1],
[ 4, 0, -1]])
my output will be:
array([[ 0.33333333],
[ 0.66666667],
[ 1.66666667],
[ 1. ]])
I came up with the solution result = array([[x] for x in np.mean(my_matrix, axis=1)]), but this function will be called many times on matrices of 40 rows x 10-300 columns, so I would like to make it faster; this implementation seems slow.
Upvotes: 2
Views: 1064
Reputation: 114579
If the matrices are fresh and independent there isn't much you can save because the only way to compute the mean is to actually sum the numbers.
If, however, the matrices are obtained from partial views of a single fixed dataset (e.g. you're computing a moving average), then you can use a sum table. For example, after:
st = data.cumsum(0)
you can compute the average of the elements at indices x0+1 through x1 (inclusive) with
avg = (st[x1] - st[x0]) / (x1 - x0)
in O(1) (i.e. the computing time doesn't depend on how many elements you are averaging).
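As a minimal sketch of the sum-table lookup, assuming a small made-up 1-D dataset (the values of data, x0 and x1 here are illustrative only), the result can be checked against a direct mean of the same slice:

```python
import numpy as np

# Hypothetical 1-D dataset; the values are made up for illustration.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Sum table: st[i] == data[0] + ... + data[i]
st = data.cumsum(0)

x0, x1 = 1, 4
# st[x1] - st[x0] is the sum of data[x0+1 .. x1], which has x1 - x0 elements.
avg = (st[x1] - st[x0]) / (x1 - x0)

# Direct computation over the same elements, for comparison.
direct = data[x0 + 1 : x1 + 1].mean()

print(avg, direct)
```

Note that with a plain cumsum the element at index x0 itself is not part of the averaged range.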
You can even use numpy to compute an array with the moving averages directly with:
res = (st[n:] - st[:-n]) / n
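A small sketch of this moving average, assuming a made-up 1-D dataset and window length n = 3. As written, the expression skips the very first window (the one starting at index 0); prepending a zero to the sum table is one possible way to include it. The comparison against np.convolve is only there as a sanity check.

```python
import numpy as np

# Hypothetical data and window length, chosen for illustration.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
n = 3

st = data.cumsum(0)

# Windows data[1:4], data[2:5], data[3:6]; the window data[0:3] is not covered.
res = (st[n:] - st[:-n]) / n

# Prepending a zero makes st0[i] the sum of the first i elements,
# so every full window, including the first one, is covered.
st0 = np.concatenate(([0.0], st))
res_all = (st0[n:] - st0[:-n]) / n

# Reference result using a direct convolution with a uniform kernel.
reference = np.convolve(data, np.ones(n) / n, mode="valid")

print(res)
print(res_all)
```

For long float arrays the cumulative sum can accumulate rounding error, so results may differ slightly from a direct computation.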
This approach can even be extended to higher dimensions, like computing the sum of the values in a rectangle in O(1) (divide by the area (y1 - y0) * (x1 - x0) to get the average) with
st = data.cumsum(0).cumsum(1)
rectsum = (st[y1][x1] + st[y0][x0] - st[y0][x1] - st[y1][x0])
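The rectangle lookup can be sketched as follows, assuming a small made-up 2-D array and illustrative corner indices. With a plain double cumsum, the formula covers rows y0+1 through y1 and columns x0+1 through x1; the direct slice below checks that.

```python
import numpy as np

# Hypothetical 4x5 dataset; values are made up for illustration.
data = np.arange(20, dtype=float).reshape(4, 5)

# 2-D sum table: st[y, x] is the sum of data[:y+1, :x+1].
st = data.cumsum(0).cumsum(1)

y0, y1, x0, x1 = 0, 2, 1, 4

# Inclusion-exclusion over the four corners of the rectangle.
rectsum = st[y1, x1] + st[y0, x0] - st[y0, x1] - st[y1, x0]
rectavg = rectsum / ((y1 - y0) * (x1 - x0))

# Direct computation over the same rectangle, for comparison.
direct = data[y0 + 1 : y1 + 1, x0 + 1 : x1 + 1]

print(rectsum, direct.sum())
print(rectavg, direct.mean())
```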
Upvotes: 0
Reputation: 251116
You can do something like this:
>>> my_matrix.mean(axis=1)[:,np.newaxis]
array([[ 0.33333333],
[ 0.66666667],
[ 1.66666667],
[ 1. ]])
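As a possible alternative, and assuming my_matrix is a regular 2-D ndarray as in the question, the keepdims argument of mean keeps the reduced axis so the result is already a column. This sketch compares both forms:

```python
import numpy as np

# The input from the question.
my_matrix = np.array([[1, 1, -1],
                      [2, 0, 0],
                      [3, 1, 1],
                      [4, 0, -1]])

# Row means reshaped into a column with np.newaxis.
col_a = my_matrix.mean(axis=1)[:, np.newaxis]

# Same result; keepdims=True leaves the reduced axis in place with length 1.
col_b = my_matrix.mean(axis=1, keepdims=True)

print(col_a.shape, col_b.shape)
print(np.allclose(col_a, col_b))
```

Both forms avoid the Python-level list comprehension, which is likely where most of the time goes in the original solution.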
Upvotes: 2