a_guest

Reputation: 36249

Element-wise addition of 1D and 2D numpy arrays

Situation

I have objects that have attributes which are represented by numpy arrays:

>>> obj = numpy.array([1, 2, 3])

where 1, 2, 3 are the attributes' values.

I'm about to write a few methods that should work equally on both a single object and a group of objects. A group of objects is represented by a 2D numpy array:

>>> group = numpy.array([[11, 21, 31],
...                      [12, 22, 32],
...                      [13, 23, 33]])

where the first digit indicates the object and the second digit indicates the attribute. That is, 12 is attribute 2 of object 1, and 21 is attribute 1 of object 2.

Why this way and not transposed? Because I want the array indices to correspond to the attributes. That is, object_or_group[0] should yield the first attribute, either as a single number or as a numpy array, so it can be used for further computations.

Alright, so when I want to compute the dot product, for example, this works out of the box:

>>> obj = numpy.array([1, 2, 3])
>>> obj.dot(object_or_group)

What doesn't work is element-wise addition.

Input:

>>> group
array([[1, 2, 3],
       [4, 5, 6]])
>>> obj
array([10, 20])

The resulting array should be the sum of the first element of group and obj and similar for the second element:

>>> result = numpy.array([group[0] + obj[0],
...                       group[1] + obj[1]])
>>> result
array([[11, 12, 13],
       [24, 25, 26]])

However:

>>> group + obj
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (2,3) (2,)

Which makes sense considering numpy's broadcasting rules.
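
For reference, a minimal sketch of why the shapes clash, using the `group` and `obj` arrays from above (the name `broadcast_ok` is only illustrative). Broadcasting compares shapes starting from the last axis, so (2, 3) against (2,) compares 3 with 2 and fails, while (2, 3) against (2, 1) is compatible:

```python
import numpy as np

group = np.array([[1, 2, 3],
                  [4, 5, 6]])   # shape (2, 3)
obj = np.array([10, 20])        # shape (2,)

# Shapes are aligned from the right: the trailing 3 of group meets the 2 of obj.
try:
    group + obj
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# With an explicit trailing axis of length 1, obj has shape (2, 1);
# the length-1 axis is stretched to length 3 during the addition.
result = group + obj.reshape(2, 1)
print(broadcast_ok)
print(result)
```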

It seems that there is no numpy function which performs an addition (or equivalently the broadcasting) along a specified axis. While I could use

>>> (group.T + obj).T
array([[11, 12, 13],
       [24, 25, 26]])

this feels very cumbersome (and if I consider a single object instead of a group, it feels outright wrong). Especially since numpy seems to cover nearly every corner case of its usage, I have the feeling that I might have gotten something conceptually wrong here.

To sum it up

Similarly to

>>> obj1
array([1, 2])
>>> obj2
array([10, 20])
>>> obj1 + obj2
array([11, 22])

(which performs an element-wise - or attribute-wise - addition) I want to do the same for groups of objects:

>>> group
array([[1, 2, 3],
       [4, 5, 6]])

while the layout of such a 2D group array must be such that the single objects are listed along the 2nd axis (axis=1) in order to be able to request a certain attribute (or many) via normal indexing: obj[0] and group[0] should both yield the first attribute(s).

Upvotes: 3

Views: 20621

Answers (2)

Dyslexian coder

Reputation: 51

What you want to do seems to work with this simple code:

>>> m
array([[1, 2, 3],
       [4, 5, 6]])
>>> g = np.array([10,20])
>>> m + g[ : , None]
array([[11, 12, 13],
       [24, 25, 26]])
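
The index `g[:, None]` gives `g` a trailing axis of length 1 (shape (2, 1)), which is what lets it broadcast against the (2, 3) array. If the same call should also work when the first operand is a single 1D object, one option is a small wrapper. This is only a sketch: `add_along_first_axis` is a hypothetical helper name, not a NumPy function, and it assumes the second operand should line up with the leading axes of the first.

```python
import numpy as np

def add_along_first_axis(a, b):
    """Add b to a, matching b's axes to a's leading axes (hypothetical helper)."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Append length-1 axes to b until it has as many dimensions as a,
    # e.g. shape (2,) becomes (2, 1) when a has shape (2, 3).
    extra = a.ndim - b.ndim
    if extra > 0:
        b = b.reshape(b.shape + (1,) * extra)
    return a + b

m = np.array([[1, 2, 3],
              [4, 5, 6]])
g = np.array([10, 20])

group_sum = add_along_first_axis(m, g)                  # same as m + g[:, None]
single_sum = add_along_first_axis(np.array([1, 2]), g)  # plain element-wise addition
print(group_sum)
print(single_sum)
```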

Upvotes: 4

Mad Physicist

Reputation: 114230

You appear to be confused about which dimension of the matrix is an object and which is an attribute, as evidenced by the changing object size in your examples. In fact, swapping dimensions to match that changing size is what is throwing you off. You are also using the unfortunate example of a 3x3 group for your dot product, which further muddles your explanation.

In the examples below, objects will be three-element vectors, i.e., they will have three attributes each. The example group will have consistently two rows, meaning two objects in it, and three columns, because objects have three attributes.

The first row of the group, group[0], a.k.a. group[0, :], will be the first object in the group. The first column, group[:, 0] will be the first attribute.

Here are a couple of sample objects and groups to illustrate the points that follow:

>>> obj1 = np.array([1, 2, 3])
>>> obj2 = np.array([4, 5, 6])
>>> group1 = np.array([[7, 8, 9],
...                    [0, 1, 2]])
>>> group2 = np.array([[3, 4, 5]])

Addition will work out of the box because of broadcasting now:

>>> obj1 + obj2
array([5, 7, 9])
>>> group1 + obj1
array([[ 8, 10, 12],
       [ 1,  3,  5]])

As you can see, corresponding attributes are getting added just fine. You can even add together groups, but only if they are the same size or if one of them only contains a single object:

>>> group1 + group2
array([[10, 12, 14],
       [ 3,  5,  7]])
>>> group1 + group1
array([[14, 16, 18],
       [ 0,  2,  4]])

The same will be true for all the binary element-wise operators and functions: *, -, /, np.bitwise_and, etc.

The only remaining question is how to make dot products not care if they are operating on a matrix or a vector. It just so happens that dot products don't care. Your common dimension is always the number of attributes, so the second operand (the multiplier) needs to be transposed so that the number of columns becomes the number of rows (transposing a 1D array leaves it unchanged). np.dot(x1, x2.T), or equivalently x1.dot(x2.T), will work correctly whether x1 and x2 are groups or objects:

>>> obj1.dot(obj2.T)
32
>>> obj1.dot(group1.T)
array([50,  8])
>>> group1.dot(obj1.T)
array([50,  8])

You can use either np.atleast_1d or np.atleast_2d to always coerce the result into a particular shape so you don't end up with a scalar like the obj1.dot(obj2.T) case. I would recommend the latter, so you always have a consistent number of dimensions regardless of the inputs:

>>> np.atleast_2d(obj1.dot(obj2.T))
array([[32]])
>>> np.atleast_2d(obj1.dot(group1.T))
array([[50,  8]])

Just keep in mind that the dimensions of the dot product will be the number of objects in the first operand by the number of objects in the second operand (everything will be treated as a group). The attributes will get multiplied and summed together. Whether or not that has a valid interpretation for your purposes is entirely for you to decide.
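
To make the "objects by objects" shape explicit, here is a minimal sketch that coerces both inputs to 2D before taking the dot product, rather than coercing the result afterwards. The helper name `pairwise_dot` is hypothetical; the arrays are the samples defined above.

```python
import numpy as np

def pairwise_dot(x1, x2):
    """Dot every object in x1 with every object in x2 (hypothetical helper)."""
    a = np.atleast_2d(x1)   # shape (n1, n_attributes)
    b = np.atleast_2d(x2)   # shape (n2, n_attributes)
    return a.dot(b.T)       # shape (n1, n2)

obj1 = np.array([1, 2, 3])
obj2 = np.array([4, 5, 6])
group1 = np.array([[7, 8, 9],
                   [0, 1, 2]])

print(pairwise_dot(obj1, obj2).shape)      # one object by one object
print(pairwise_dot(obj1, group1).shape)    # one object by two objects
print(pairwise_dot(group1, obj1).shape)    # two objects by one object
print(pairwise_dot(group1, group1).shape)  # two objects by two objects
```

With the inputs coerced this way, a single object always behaves like a group containing one row, so the orientation of the result depends only on the order of the operands.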

UPDATE

The only remaining problem at this point is attribute access. As stated above, obj1[0] and group1[0] mean very different things. There are three ways to reconcile this difference, listed in the order that I personally prefer them, with 1 being the most preferable:

  1. Use the Ellipsis indexing object to get the last index instead of the first

    >>> obj1[..., 0]
    array(1)
    >>> group1[..., 0]
    array([7, 0])
    

    This is the most efficient way since it does not make any copies, just does a normal index on the original arrays. The same expression works on a single object (1D array) and on a group (2D array). Note that because the index contains an Ellipsis, the single object yields a zero-dimensional array rather than a plain scalar.

  2. Make all your objects 2D. As you pointed out yourself, this can be done with a decorator, and/or using np.atleast_2d. Personally, I would prefer having the convenience of using 1D arrays as single objects without having to wrap them in 2D.

  3. Always access attributes via a transpose:

    >>> obj1.T[0]
    1
    >>> group1.T[0]
    array([7, 0])
    

    While this is functionally equivalent to #1, it is clunky and unsightly by comparison, in addition to doing something very different under the hood. This approach at the very least creates a new view of the underlying array, and may run the risk of making unnecessary copies in certain cases if the group arrays are not laid out just right. I would not recommend this approach even if it does solve the problem of uniform access.
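
As a rough illustration of option 1, attribute access can be wrapped in a small function so that calling code never needs to know whether it holds an object or a group. The name `attribute` is a hypothetical helper, not part of NumPy; the arrays are the samples from above.

```python
import numpy as np

def attribute(x, i):
    """Return attribute i of an object, or of every object in a group (hypothetical helper)."""
    # The Ellipsis stands for "all leading axes", so i always indexes the last axis.
    return np.asarray(x)[..., i]

obj1 = np.array([1, 2, 3])
group1 = np.array([[7, 8, 9],
                   [0, 1, 2]])

first_of_obj = attribute(obj1, 0)
first_of_group = attribute(group1, 0)

# Basic indexing returns a view, so no data is copied.
shares = np.shares_memory(first_of_group, group1)
print(first_of_obj, first_of_group, shares)
```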

Upvotes: 3
