Numpy: Error in broadcasting without using np.reshape?

Question

I just found out that even when Python reports two arrays as having shapes (5000, 1) and (5000, 3072), which should broadcast to (5000, 3072), the operation sometimes fails unless I explicitly reshape them to those shapes.

I have a data set that I simply extract and run. Initially I accepted whatever shape it came in, but it keeps raising a broadcasting error. Is the behavior I described real, or is there some function or step I missed that causes the broadcasting error?

I also tested with random matrices from np.random.rand(5000, 3072) and np.random.rand(5000, 1). Subtracting one from the other works, so broadcasting works in this case.

unutbu · Accepted Answer

The arrays being broadcast have shapes (5000, 3072) and (5000,). The first array is 2-dimensional, but the second is 1-dimensional. NumPy broadcasting always adds new axes (dimensions) on the left. So while (5000,) can broadcast to (N, 5000) for any N, it cannot broadcast (without explicit help) to (5000, 3072).
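A minimal sketch of the failure, using small stand-in shapes (5, 3) and (5,) in place of (5000, 3072) and (5000,). The names data and other are made up for illustration, and dist_vector here is just np.arange:

```python
import numpy as np

data = np.zeros((5, 3))        # stand-in for the (5000, 3072) array
dist_vector = np.arange(5)     # 1-dimensional, shape (5,)

# Shapes are aligned from the right: 3 is compared with 5, which fails.
try:
    data - dist_vector
    failed = False
except ValueError as exc:
    failed = True
    print("broadcast failed:", exc)

# A new axis on the left is fine: (5,) is treated as (1, 5) against (7, 5).
other = np.zeros((7, 5))
result = other - dist_vector
print(result.shape)  # (7, 5)
```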

To fix the problem, a new axis must be added to the second array on the right.

dist_vector = dist_vector[:, np.newaxis]

For example,

In [16]: dist_vector = np.arange(5000)

In [17]: dist_vector.shape
Out[17]: (5000,)

In [18]: dist_vector = dist_vector[:, np.newaxis]

In [19]: dist_vector.shape
Out[19]: (5000, 1)
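As a rough sketch of the fix applied to a subtraction, again with small stand-in shapes (the array data is hypothetical). Indexing with None and calling reshape(-1, 1) should give the same (5, 1) column as np.newaxis:

```python
import numpy as np

data = np.ones((5, 3))         # stand-in for the (5000, 3072) array
dist_vector = np.arange(5)     # shape (5,)

# Three equivalent ways to add the axis on the right.
col_a = dist_vector[:, np.newaxis]
col_b = dist_vector[:, None]
col_c = dist_vector.reshape(-1, 1)

# (5, 3) minus (5, 1): the length-1 axis is stretched across the 3 columns.
diff = data - col_a
print(col_a.shape)  # (5, 1)
print(diff.shape)   # (5, 3)
```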

Note that the shape (5000,) is not the same shape as (5000, 1). The first shape indicates the array is 1-dimensional since the tuple (5000,) contains only 1 value. (5000, 1) indicates the array is 2-dimensional.

In the examples in the documentation, notice that the shapes of the arrays are right-justified. This is what allows short shapes (shapes of arrays with fewer dimensions) to get new axes added on the left.
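The right-justified rule can be sketched in plain Python. The helper broadcast_shape below is hypothetical, written only to illustrate the rule for two shapes; it is not part of NumPy:

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Illustrative helper (not a NumPy function): apply the broadcasting
    rule to two shape tuples and return the resulting shape."""
    result = []
    # Walk both shapes from the right; the shorter one is padded with 1s,
    # which is the same as adding new axes on its left.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcastable"
            )
    return tuple(reversed(result))

print(broadcast_shape((5000, 3072), (5000, 1)))  # (5000, 3072)
print(broadcast_shape((7, 5000), (5000,)))       # (7, 5000)

try:
    broadcast_shape((5000, 3072), (5000,))
except ValueError as exc:
    print(exc)
```

In the failing case the trailing 3072 is lined up against 5000, which is why the 1-dimensional array needs its new axis on the right instead.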
