Reputation: 145
For the sake of illustration, imagine I have the following ndarray:
x = [[0.5, 0.3, 0.1, 0.1],
[0.4, 0.1, 0.3, 0.2],
[0.4, 0.3, 0.2, 0.1],
[0.6, 0.1, 0.1, 0.2]]
I want to sum the two vectors at columns 1 and 2 (starting the count from 0) so that the new ndarray would be:
y = [[0.5, 0.4, 0.1],
[0.4, 0.4, 0.2],
[0.4, 0.5, 0.1],
[0.6, 0.2, 0.2]]
And then, I want to average the vectors at rows 1 and 2 so that the final result would be:
z = [[0.5, 0.4, 0.1 ],
[0.4, 0.45, 0.15],
[0.6, 0.2, 0.2 ]]
Is there an efficient way to do that in numpy in one command? I really need efficiency as this operation is going to be applied in a nested loop.
Thanks in advance
Upvotes: 0
Views: 888
Reputation: 3648
@hpaulj's solution is very good; be sure to read it.
You can sum columns quite easily:
a_summed = np.sum(a[:,1:3], axis=1)
You can also take the mean of multiple rows:
a_mean = np.mean(a[1:3], axis=0)
All you have to do is write each result into one of the merged columns (or rows) and delete the other, so it becomes:
import numpy as np
a = np.array(x)  # x is the nested list from the question
a_summed = np.sum(a[:,1:3], axis=1)
a[:, 1] = a_summed
a = np.delete(a, 2, 1)
a_mean = np.mean(a[1:3], axis=0)
a[1] = a_mean
a = np.delete(a, 2, 0)
print(a)
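The steps above overwrite `a` before deleting from it. Here is a minimal sketch of the same slice, reduce, and delete sequence wrapped in a function that works on a copy. The name `merge_cols_then_rows` and the float conversion are my own assumptions, not part of the answer:

```python
import numpy as np

def merge_cols_then_rows(x):
    # Hypothetical helper: sum columns 1 and 2, then average rows 1 and 2.
    a = np.array(x, dtype=float)       # np.array copies, so the input is untouched
    a[:, 1] = a[:, 1:3].sum(axis=1)    # column 1 now holds column 1 + column 2
    a = np.delete(a, 2, axis=1)        # drop the old column 2
    a[1] = a[1:3].mean(axis=0)         # row 1 now holds the mean of rows 1 and 2
    return np.delete(a, 2, axis=0)     # drop the old row 2

x = [[0.5, 0.3, 0.1, 0.1],
     [0.4, 0.1, 0.3, 0.2],
     [0.4, 0.3, 0.2, 0.1],
     [0.6, 0.1, 0.1, 0.2]]
z = merge_cols_then_rows(x)
print(z)
```

Note that `np.delete` returns a new array each time, so inside a tight loop the two deletions add allocation cost.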
Upvotes: 2
Reputation: 231530
In [68]: x = [[0.5, 0.3, 0.1, 0.1],
...: [0.4, 0.1, 0.3, 0.2],
...: [0.4, 0.3, 0.2, 0.1],
...: [0.6, 0.1, 0.1, 0.2]]
In [69]: x=np.array(x)
Ufuncs like np.add have a reduceat method that lets us perform the action over groups of rows or columns. With that, the first reduction is easy (but it takes a little playing to understand the parameters):
In [70]: np.add.reduceat(x,[0,1,3], axis=1)
Out[70]:
array([[0.5, 0.4, 0.1],
[0.4, 0.4, 0.2],
[0.4, 0.5, 0.1],
[0.6, 0.2, 0.2]])
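To make the index list less mysterious: for increasing indices, each entry starts a group that runs up to the next entry, and the last group runs to the end of the axis. Here is a small sketch comparing `reduceat` with the equivalent explicit slices (the variable names `by_reduceat` and `by_slices` are just for illustration):

```python
import numpy as np

x = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.4, 0.1, 0.3, 0.2],
              [0.4, 0.3, 0.2, 0.1],
              [0.6, 0.1, 0.1, 0.2]])

# [0, 1, 3] along axis=1 means the column groups 0:1, 1:3 and 3: (to the end).
by_reduceat = np.add.reduceat(x, [0, 1, 3], axis=1)
by_slices = np.column_stack((x[:, 0:1].sum(axis=1),
                             x[:, 1:3].sum(axis=1),
                             x[:, 3:].sum(axis=1)))
print(np.allclose(by_reduceat, by_slices))
```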
Apparently mean is not a ufunc, so I had to settle for add to reduce the rows:
In [71]: np.add.reduceat(Out[70],[0,1,3],axis=0)
Out[71]:
array([[0.5, 0.4, 0.1],
[0.8, 0.9, 0.3],
[0.6, 0.2, 0.2]])
and then divide by the row count to get the mean. I could generalize that to use the same [0,1,3] used in the reduceat, but for now just use a column array:
In [72]: np.add.reduceat(Out[70],[0,1,3],axis=0)/np.array([1,2,1])[:,None]
Out[72]:
array([[0.5 , 0.4 , 0.1 ],
[0.4 , 0.45, 0.15],
[0.6 , 0.2 , 0.2 ]])
and the whole thing in one expression:
In [73]: np.add.reduceat(np.add.reduceat(x,[0,1,3], axis=1),[0,1,3],axis=0)/ np.array([1,2,1])[:,None]
Out[73]:
array([[0.5 , 0.4 , 0.1 ],
[0.4 , 0.45, 0.15],
[0.6 , 0.2 , 0.2 ]])
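The generalization mentioned above, deriving the divisor from the same index list, might look like the sketch below. The helper name `group_mean_rows` is hypothetical, and it assumes the start indices are strictly increasing:

```python
import numpy as np

def group_mean_rows(a, starts):
    # Hypothetical helper: average consecutive row groups beginning at `starts`.
    starts = np.asarray(starts)
    sums = np.add.reduceat(a, starts, axis=0)
    # Group sizes are the gaps between starts; the last group runs to the end.
    counts = np.diff(np.append(starts, a.shape[0]))
    return sums / counts[:, None]

x = np.array([[0.5, 0.3, 0.1, 0.1],
              [0.4, 0.1, 0.3, 0.2],
              [0.4, 0.3, 0.2, 0.1],
              [0.6, 0.1, 0.1, 0.2]])
y = np.add.reduceat(x, [0, 1, 3], axis=1)   # sum columns 1 and 2
z = group_mean_rows(y, [0, 1, 3])           # average rows 1 and 2
print(z)
```

For `starts = [0, 1, 3]` and four rows, `counts` works out to `[1, 2, 1]`, the same column array used by hand above.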
Upvotes: 1
Reputation: 51
Since you are changing the size of the original matrix, it would be better to do it in two steps, as mentioned in the previous answers. But if you want to do it in one command, you could do it as follows, which makes for a nice generalized solution:
import numpy as np
x = np.array(([0.5, 0.3, 0.1, 0.1, 1],
[0.4, 0.1, 0.3, 0.2, 1],
[0.4, 0.3, 0.2, 0.1, 1],
[0.6, 0.1, 0.1, 0.2, 1]))
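def sum_columns(matrix, col_start, col_end):
    # Replace columns col_start..col_end (inclusive) with their sum.
    return np.column_stack((matrix[:, 0:col_start],
                            np.sum(matrix[:, col_start:col_end + 1], axis=1),
                            matrix[:, col_end + 1:]))

def avgRows_summedColumns(matrix, row_start, row_end):
    # Replace rows row_start..row_end (inclusive) with their mean.
    # np.vstack replaces np.row_stack, which is a deprecated alias for it.
    return np.vstack((matrix[0:row_start, :],
                      np.mean(matrix[row_start:row_end + 1, :], axis=0),
                      matrix[row_end + 1:, :]))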
# call the entire operation in one command
print(avgRows_summedColumns(sum_columns(x, 1, 2), 1, 2))
This way it doesn't matter how big your matrix is.
Upvotes: 1