Reputation: 742
I have a 2D array A, and the corresponding elements of the array bins contain each row's bin assignment. I want to construct an array S such that each row of S is the sum of the rows of A that fall in one bin, for example:
S[0, :] = (A[(bins == 0), :]).sum(axis=0)
This is rather easy to do with np.stack and list comprehensions, but it seems overly complicated and not terribly readable. Is there a more general way to sum (or even apply some general function to) slices of arrays with bin assignments? scipy.stats.binned_statistic is along the right lines, but it requires that the bin assignments and the values the function is computed on have the same shape (since I am using slices, this is not the case).
For example, if
A = np.array([[1., 2., 3., 4.],
[2., 3., 4., 5.],
[9., 8., 7., 6.],
[8., 7., 6., 5.]])
and
bins = np.array([0, 1, 0, 2])
then it should result in
S = np.array([[10., 10., 10., 10.],
[2., 3., 4., 5.],
[8., 7., 6., 5.]])
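For reference, the np.stack and list-comprehension baseline mentioned above might look roughly like this. It is only a sketch: the helper name binned_apply and its func parameter are made up for illustration, and for reductions such as np.max it assumes every bin from 0 to bins.max() contains at least one row.

```python
import numpy as np

def binned_apply(A, bins, func=np.sum):
    """Apply func along axis 0 to the rows of A in each bin (hypothetical helper)."""
    n_bins = bins.max() + 1
    # One boolean row selection per bin, stacked into a (n_bins, n_cols) array.
    return np.stack([func(A[bins == b], axis=0) for b in range(n_bins)])

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

S = binned_apply(A, bins)                   # per-bin column sums
S_max = binned_apply(A, bins, func=np.max)  # per-bin column maxima
```

Passing a different reduction as func is what makes this form general, at the cost of one pass over bins per bin.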
Upvotes: 2
Views: 297
Reputation: 221624
Here's an approach with matrix multiplication using np.dot -
(bins == np.arange(bins.max()+1)[:,None]).dot(A)
Sample run -
In [40]: A = np.array([[1., 2., 3., 4.],
...: [2., 3., 4., 5.],
...: [9., 8., 7., 6.],
...: [8., 7., 6., 5.]])
In [41]: bins = np.array([0, 1, 0, 2])
In [42]: (bins == np.arange(bins.max()+1)[:,None]).dot(A)
Out[42]:
array([[ 10., 10., 10., 10.],
[ 2., 3., 4., 5.],
[ 8., 7., 6., 5.]])
Performance boost
A more efficient way to create the mask (bins == np.arange(bins.max()+1)[:,None]) would be like so -
mask = np.zeros((bins.max()+1, len(bins)), dtype=bool)
mask[bins, np.arange(len(bins))] = 1
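To check that the two ways of building the mask agree, here is a small self-contained sketch using the example data from the question. The variable names mask_broadcast and mask_indexed are just illustrative.

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])
bins = np.array([0, 1, 0, 2])

n_bins = bins.max() + 1

# Mask via broadcasting: row b is True wherever bins == b.
mask_broadcast = bins == np.arange(n_bins)[:, None]

# Mask via indexed assignment: exactly one True per column,
# placed in the row given by that column's bin.
mask_indexed = np.zeros((n_bins, len(bins)), dtype=bool)
mask_indexed[bins, np.arange(len(bins))] = True

# Each mask row selects the rows of A in one bin, so the product sums them.
S = mask_indexed.dot(A)
```

Both masks have shape (number of bins, number of rows of A), so a bin with no rows simply yields a row of zeros in S.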
Upvotes: 2
Reputation: 215047
You can use np.add.reduceat:
import numpy as np
# index to sort the bins
sort_index = bins.argsort()
# indices where the array needs to be split at
indices = np.concatenate(([0], np.where(np.diff(bins[sort_index]))[0] + 1))
# sum values where the bins are the same
np.add.reduceat(A[sort_index], indices, axis=0)
# array([[ 10., 10., 10., 10.],
# [ 2., 3., 4., 5.],
# [ 8., 7., 6., 5.]])
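One thing to keep in mind with this approach: the output has one row per distinct value in bins, in sorted order, so bin numbers that never occur are skipped rather than filled with zeros. A sketch that wraps the steps above in a function and also returns the bin labels (binned_sum_reduceat is a hypothetical name, and it assumes bins is non-empty):

```python
import numpy as np

def binned_sum_reduceat(A, bins):
    """Sum rows of A per distinct bin value; returns (labels, sums)."""
    order = bins.argsort()                 # group equal bins together
    sorted_bins = bins[order]
    # Start index of each run of equal bin values in the sorted order.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(sorted_bins)) + 1))
    sums = np.add.reduceat(A[order], starts, axis=0)
    return sorted_bins[starts], sums

A = np.array([[1., 2., 3., 4.],
              [2., 3., 4., 5.],
              [9., 8., 7., 6.],
              [8., 7., 6., 5.]])

# Bins 1 and 2 are unused here, so only two rows come back.
labels, sums = binned_sum_reduceat(A, np.array([0, 3, 0, 3]))
```

Returning the labels alongside the sums makes it possible to place each row back at its bin index if a dense result is needed.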
Upvotes: 2