Reputation: 1808
So I have the following numpy arrays:
c = array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
X = array([[10, 15, 20, 5],
[ 1, 2, 6, 23]])
y = array([1, 1])
I am trying to add each 1x4 row in the X array to one of the columns in c. The y array specifies which column. The above example means that we are adding both rows in the X array to column 1 of c. That is, we should expect the result of:
c = array([[ 1, 2+10+1, 3], = array([[ 1, 13, 3],
[ 4, 5+15+2, 6], [ 4, 22, 6],
[ 7, 8+20+6, 9], [ 7, 34, 9],
[10, 11+5+23, 12]]) [10, 39, 12]])
Does anyone know how I can do this without any loops? I tried c[:,y] += X, but it seems like this only adds the second row of X to column 1 of c once. That said, y does not necessarily have to be [1,1]; it can also be [0,1]. In this case, we would add the first row of X to column 0 of c and the second row of X to column 1 of c.
Upvotes: 2
Views: 1278
Reputation: 13733
This is the solution I came up with:
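import numpy as np

def my_func(c, X, y):
    # One zero-filled layer per row of X, each shaped like c.
    cc = np.zeros((len(y), c.shape[0], c.shape[1]))
    # Layer i receives row i of X in column y[i].
    cc[range(len(y)), :, y] = X
    return c + np.sum(cc, 0)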
The following interactive session demonstrates how it works:
>>> my_func(c, X, y)
array([[ 1., 13., 3.],
[ 4., 22., 6.],
[ 7., 34., 9.],
[ 10., 39., 12.]])
>>> y2 = np.array([0, 2])
>>> my_func(c, X, y2)
array([[ 11., 2., 4.],
[ 19., 5., 8.],
[ 27., 8., 15.],
[ 15., 11., 35.]])
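To see why this works even with duplicates in y, it can help to look at the intermediate array. The following is only a sketch that rebuilds it step by step for the question's data; the names layers, total and result are illustrative and not part of the answer above.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
X = np.array([[10, 15, 20, 5], [1, 2, 6, 23]])
y = np.array([1, 1])

# One zero layer per row of X, each shaped like c: shape (2, 4, 3).
layers = np.zeros((len(y), c.shape[0], c.shape[1]))
# Layer i receives row i of X in column y[i].
layers[np.arange(len(y)), :, y] = X
# Duplicate entries in y land in different layers, so nothing is overwritten.
total = layers.sum(axis=0)
result = c + total

print(layers[0])
print(layers[1])
print(result)
```

Note that the temporary array holds len(y) * c.size floats, so this approach may become memory-hungry when X has many rows.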
Upvotes: 0
Reputation: 231385
My first thought when I saw your desired calculation was to just sum the 2 rows of X and add that to the 2nd column of c:
In [636]: c = array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
In [637]: c[:,1]+=X.sum(axis=0)
In [638]: c
Out[638]:
array([[ 1, 13, 3],
[ 4, 22, 6],
[ 7, 34, 9],
[10, 39, 12]])
But if we want to work from a general index like y, we need a special unbuffered operation, at least when there are duplicates in y:
In [639]: c = array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
In [641]: np.add.at(c,(slice(None),y),X.T)
In [642]: c
Out[642]:
array([[ 1, 13, 3],
[ 4, 22, 6],
[ 7, 34, 9],
[10, 39, 12]])
You need to look up .at in the numpy docs. In IPython, add.at? shows me the doc, which includes:
Performs unbuffered in place operation on operand 'a' for elements specified by 'indices'. For addition ufunc, this method is equivalent to a[indices] += b, except that results are accumulated for elements that are indexed more than once. For example, a[[0,0]] += 1 will only increment the first element once because of buffering, whereas add.at(a, [0,0], 1) will increment the first element twice.
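The example from that docstring can be checked directly. This is a minimal sketch; the array names a, b, buffered and unbuffered are only for illustration.

```python
import numpy as np

# Buffered fancy-index update: index 0 appears twice but is incremented once.
a = np.zeros(3, dtype=int)
a[[0, 0]] += 1
buffered = a.copy()

# Unbuffered update via ufunc.at: every occurrence of index 0 is applied.
b = np.zeros(3, dtype=int)
np.add.at(b, [0, 0], 1)
unbuffered = b.copy()

print(buffered)    # first element incremented once
print(unbuffered)  # first element incremented twice
```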
With a different y it still works (starting again from the original c):
In [645]: np.add.at(c,(slice(None),[0,2]),X.T)
In [646]: c
Out[646]:
array([[11, 2, 4],
[19, 5, 8],
[27, 8, 15],
[15, 11, 35]])
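If you would rather not modify c in place, the same call can be wrapped in a small helper. This is just a sketch: the function name add_rows_to_columns and its shape check are my own additions, and it assumes c and X have compatible dtypes (for example, both integer).

```python
import numpy as np

def add_rows_to_columns(c, X, y):
    """Return a copy of c with row i of X added to column y[i]."""
    c = np.array(c)      # np.array copies, so the caller's array is untouched
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape != (len(y), c.shape[0]):
        raise ValueError("X must have shape (len(y), c.shape[0])")
    # Unbuffered add: repeated columns in y accumulate instead of overwriting.
    np.add.at(c, (slice(None), y), X.T)
    return c

c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
X = np.array([[10, 15, 20, 5], [1, 2, 6, 23]])

out_dup = add_rows_to_columns(c, X, [1, 1])
out_distinct = add_rows_to_columns(c, X, [0, 2])
print(out_dup)
print(out_distinct)
```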
Upvotes: 3
Reputation: 632
Firstly, your code seems to work in general if you transpose X. For example:
c = array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
X = array([[10, 15, 20, 5],
[ 1, 2, 6, 23]]).transpose()
y = array([1, 2])
c[:,y] += X
print(c)
#OUTPUT:
#[[ 1 12 4]
# [ 4 20 8]
# [ 7 28 15]
# [10 16 35]]
However, it doesn't work when there are any duplicate columns in y, as in your specific example. I believe this is because c[:, [1,1]] generates an array with two columns, each holding the slice c[:, 1]. Both of these correspond to the same part of c, so when the addition happens they are both read, then the corresponding part of X is added to each, and then they are written back, meaning the last one to be written back determines the final value. I don't believe numpy's indexed += can vectorize an operation like this, because it requires editing one column, saving back its value, and then editing it again later.
You might have to settle for no duplicates, or otherwise implement something like an accumulator.
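As a rough sketch of what such an accumulator could look like without a Python loop, one option is to turn y into a one-hot matrix and use a matrix product, so that rows of X sharing a column are summed before being added. This is my own illustration, not something from the question; the names lossy, one_hot and accumulated are made up for the example.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
X = np.array([[10, 15, 20, 5], [1, 2, 6, 23]])
y = np.array([1, 1])

# Plain fancy-index += with a duplicated column: one of the updates is lost.
lossy = c.copy()
lossy[:, y] += X.T

# Accumulator: row i of one_hot has a single 1 in column y[i], shape (2, 3).
one_hot = np.eye(c.shape[1], dtype=c.dtype)[y]
# (4, 2) @ (2, 3) sums all rows of X that target the same column of c.
accumulated = c + X.T @ one_hot

print(lossy)
print(accumulated)
```

The matrix product does extra multiplications by zero, so for large arrays it may be slower than a dedicated unbuffered update.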
Upvotes: 0