farhawa

Reputation: 10398

NumPy broadcasting doesn't work

I am trying to broadcast the difference between two vectors. This works for a simple case like this:

In[1] : data = np.array([1,2])
In[2] : centers = np.array([[2,2],[3,3]])

In[3] : data - centers

Out[3] : array([[-1,  0],
               [-2, -1]])

But when I try the same thing with a higher-dimensional data array, it doesn't work:

In [4]: data = np.array([[1,2],[3,4],[6,7]])
In [5]: data
Out [5]: array([[1,2],
                [3,4],
                [6,7]])

In [6]: centers = np.array([[2,2],[3,3]])
In [7]: centers
Out [7]: array([[2,2],
                [3,3]])

I want to compute data - centers and get this output:

array([[[-1,0],
        [-2,-1]],
       [[1,2],
        [0,1]],
       [[4,5],
        [3,4]]])

Upvotes: 4

Views: 2157

Answers (1)

Alex Riley

Reputation: 176730

In this case you need to insert an extra axis into data:

>>> data[:, None] - centers
array([[[-1,  0],
        [-2, -1]],

       [[ 1,  2],
        [ 0,  1]],

       [[ 4,  5],
        [ 3,  4]]])

Originally data.shape is (3, 2) and centers.shape is (2, 2). NumPy can't broadcast arrays with these shapes together because the lengths of the first axes (3 and 2) are not compatible: two axes are compatible only if they have the same length or one of them has length 1.
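To see the incompatibility directly, here is a minimal sketch using the arrays from the question. The flag name broadcast_failed is only an illustrative helper, not anything from NumPy:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [6, 7]])   # shape (3, 2)
centers = np.array([[2, 2], [3, 3]])        # shape (2, 2)

# The trailing axes match (2 and 2), but the first axes (3 and 2) do not,
# so NumPy refuses to broadcast and raises ValueError.
try:
    data - centers
    broadcast_failed = False   # illustrative flag, not a NumPy name
except ValueError as exc:
    broadcast_failed = True
    print("broadcast error:", exc)
```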

Inserting the extra dimension, data[:, None] has shape (3, 1, 2) and then the lengths of the axes align correctly:

(3, 1, 2) 
   (2, 2) 
    #  #
    #  # lengths are equal for this axis
    #
    # 1 is compatible with any length
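As a quick check of the shapes described above, the following sketch reuses the question's arrays. The variable names expanded and diff are just illustrative:

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [6, 7]])   # shape (3, 2)
centers = np.array([[2, 2], [3, 3]])        # shape (2, 2)

# None is an alias for np.newaxis, so both spellings insert the same axis.
expanded = data[:, None]
assert expanded.shape == data[:, np.newaxis].shape

diff = expanded - centers

print(expanded.shape)   # (3, 1, 2)
print(diff.shape)       # (3, 2, 2)

# diff[i, j] holds data[i] - centers[j]
assert np.array_equal(diff[2, 1], data[2] - centers[1])
```

Each diff[i, j] is the difference between row i of data and row j of centers, so the result has one (2, 2) block per row of data.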

Upvotes: 6
