Reputation: 1891
I am looking for a way to apply a function n items at the time along an axis. E.g.
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8]])
If I apply sum
across the rows 2 items at a time I get:
array([[ 4, 6],
[ 12, 14]])
Which is the sum
of 1st 2 rows and the last 2 rows.
NB: I am dealing with much larger array and I have to apply the function to n items which I can be decided at runtime.
The data extends along different axis. E.g.
array([[... [ 1, 2, ...],
[ 3, 4, ...],
[ 5, 6, ...],
[ 7, 8, ...],
...], ...])
Upvotes: 8
Views: 276
Reputation: 221574
Reshape splitting the first axis into two axes, such that the second split axis is of length n
to have a 3D array and then sum along that split axis, like so -
a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
It should be pretty efficient as reshaping just creates a view into input array.
Sample run -
In [55]: a
Out[55]:
array([[2, 8, 0, 0],
[1, 5, 3, 3],
[6, 1, 4, 7],
[0, 4, 0, 7],
[8, 0, 8, 1],
[8, 3, 3, 8]])
In [56]: n = 2 # Sum every two rows
In [57]: a.reshape(a.shape[0]//n,n,a.shape[1]).sum(1)
Out[57]:
array([[ 3, 13, 3, 3],
[ 6, 5, 4, 14],
[16, 3, 11, 9]])
Upvotes: 2
Reputation: 11860
This is a reduction:
numpy.add.reduceat(a, [0,2])
>>> array([[ 4, 6],
[12, 14]], dtype=int32)
As long as by "larger" you mean longer in the "y" axis, you can extend:
a = numpy.array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
numpy.add.reduceat(a, [0,2,4])
>>> array([[ 4, 6],
[12, 14],
[20, 22]], dtype=int32)
EDIT: actually, this works fine for "larger in both dimensions", too:
a = numpy.arange(24).reshape(6,4)
numpy.add.reduceat(a, [0,2,4])
>>> array([[ 4, 6, 8, 10],
[20, 22, 24, 26],
[36, 38, 40, 42]], dtype=int32)
I will leave it up to you to adapt the indices to your specific case.
Upvotes: 3
Reputation: 214957
How about something like this?
n = 2
# calculate the cumsum along axis 0 and take one row from every n rows
cumarr = arr.cumsum(axis = 0)[(n-1)::n]
# calculate the difference of the resulting numpy array along axis 0
np.vstack((cumarr[0][None, :], np.diff(cumarr, axis=0)))
# array([[ 4, 6],
# [12, 14]])
Upvotes: 1