Puzzled by odd NumPy error when creating array

Question

With

foo_ok = [(30, 784), (10, 30)]
foo_bad = [(10, 784), (10, 10)]

why does

np.array([np.zeros(foo_ok[0]),np.zeros(foo_ok[1])])

work while

np.array([np.zeros(foo_bad[0]),np.zeros(foo_bad[1])])

results in

ValueError: could not broadcast input array from shape (10,784) into shape (10)

Basically I need things that work with the form foo = [(X, Z), (Y, X)] where it might be the case that Y==X; but having Y==X causes things to fail.

Imanol Luengo · Accepted Answer

Edited the answer according to the edited question.

Basically, the problem arises when the first axis of the 2 arrays matches but the remaining axes do not. Below is a reproducible example:

foo_ok = [(30, 784), (10, 30)]
foo_ok2 = [(30, 784), (30, 784)]
foo_bad = [(10, 784), (10, 10)]

If we construct the first 2 arrays:

a = np.array([np.zeros(foo_ok[0]),np.zeros(foo_ok[1])])
b = np.array([np.zeros(foo_ok2[0]),np.zeros(foo_ok2[1])])

c = np.array([np.zeros(foo_bad[0]),np.zeros(foo_bad[1])]) # ERROR

we can see that the resulting arrays are not the same:

>>> print(a.shape, a.dtype, a[0].shape, a[1].shape)
(2,) object (30, 784) (10, 30)

>>> print(b.shape, b.dtype, b[0].shape, b[1].shape)
(2, 30, 784) float64 (30, 784) (30, 784)

Here foo_ok2[0] and foo_ok2[1] have the same values, so the 2 arrays built from them have the same shape. NumPy handles this case by stacking the arrays along a new leading axis, and the resulting b array has shape (2, 30, 784) and dtype float64. The resulting array a, however, is just an array of type object with 2 elements. Each element is a separate array (as if a were a plain Python list). Note that NumPy 1.24 and later no longer build such a ragged object array implicitly: constructing a as written raises a ValueError unless dtype=object is passed.
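To make the same-shape case concrete, here is a minimal sketch. The names x, y and stacked are illustrative and not part of the question. It shows that np.array on a list of equally shaped arrays gives the same result as an explicit np.stack along a new first axis.

```python
import numpy as np

# Two arrays with identical shapes (same as foo_ok2 in the answer).
x = np.zeros((30, 784))
y = np.zeros((30, 784))

# np.array sees matching shapes and builds one regular 3-D float array.
b = np.array([x, y])

# np.stack states the same intent explicitly: join along a new leading axis.
stacked = np.stack([x, y])

print(b.shape, b.dtype)               # (2, 30, 784) float64
print(np.array_equal(b, stacked))     # True
```

When the shapes are known to match, np.stack may be the clearer choice, since it fails loudly on a mismatch instead of falling back to some other behaviour.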

NumPy is not optimized to deal with object arrays, so whenever possible it tries to build a regular array with a numerical data type instead.

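That is what happens when the first dimension of the 2 arrays matches, as in c. Because the leading dimensions agree, NumPy tries to build a single numeric array, expects all the remaining dimensions to match as well, and raises the "could not broadcast" exception when they do not.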
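The failing case can be reproduced in isolation with a short sketch like the one below. The variable caught is only there for illustration. The exact wording of the message is not asserted here because it differs between NumPy versions; the assumption is only that a ValueError is raised.

```python
import numpy as np

foo_bad = [(10, 784), (10, 10)]

caught = None
try:
    # The first axes match (10 and 10) but the second axes differ (784 vs 10).
    np.array([np.zeros(foo_bad[0]), np.zeros(foo_bad[1])])
except ValueError as exc:
    caught = exc

# Show which exception was raised and its (version-dependent) message.
print(type(caught).__name__)
print(caught)
```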

Although I would still discourage using NumPy arrays of object type, there is a dirty way to create one even when the first axis matches while the arrays have different shapes:

>>> c = np.array([np.zeros(foo_bad[0]), None], dtype=object)
>>> c[1] = np.zeros(foo_bad[1])

>>> print(c.shape, c.dtype, c[0].shape, c[1].shape)
(2,) object (10, 784) (10, 10)

And another version of it (closely related to your syntax):

>>> c = np.empty((2,), dtype=object)
>>> c[:] = [np.zeros(foo_bad[0]), np.zeros(foo_bad[1])]

>>> print(c.shape, c.dtype, c[0].shape, c[1].shape)
(2,) object (10, 784) (10, 10)
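For the general form foo = [(X, Z), (Y, X)] from the question, the same idea can be wrapped in a small helper. This is only a sketch: to_object_array is a hypothetical name, not a NumPy function. It preallocates a 1-D object array and assigns each element by integer index, so NumPy never tries to infer a combined shape, whether or not Y == X.

```python
import numpy as np

def to_object_array(arrays):
    """Pack a sequence of arrays into a 1-D object array (hypothetical helper)."""
    out = np.empty(len(arrays), dtype=object)
    for i, arr in enumerate(arrays):
        # Integer indexing on an object array stores the array itself
        # as a single element; no broadcasting takes place.
        out[i] = arr
    return out

foo_ok = [(30, 784), (10, 30)]
foo_bad = [(10, 784), (10, 10)]

a = to_object_array([np.zeros(s) for s in foo_ok])
c = to_object_array([np.zeros(s) for s in foo_bad])

print(a.shape, a.dtype, a[0].shape, a[1].shape)  # (2,) object (30, 784) (10, 30)
print(c.shape, c.dtype, c[0].shape, c[1].shape)  # (2,) object (10, 784) (10, 10)
```

If nothing downstream needs the outer container to be an array, a plain list such as [np.zeros(s) for s in foo_bad] avoids the problem entirely.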
