Eduardo

Reputation: 1275

Is it a bug that generators don't work when passed to "numpy.array"?

If a generator is passed to numpy.array, numpy does not iterate over the object, nor does it complain, even if copy=True. Any attempt to index into the array fails, frequently much later and in far away code.

I understand that numpy wants to know the size of the array from the beginning, but this behavior is not good. It should either copy to an intermediate list or raise an exception.
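A minimal reproduction of the problem, as a sketch assuming a reasonably recent NumPy (the names gen, arr and raised are only illustrative): the generator ends up wrapped in a zero-dimensional object array, and the error only appears once something indexes into it.

```python
import numpy as np

gen = (i * i for i in range(5))
arr = np.array(gen)   # no error here; the generator is stored as one object

print(arr.shape)      # shape of a 0-d array
print(arr.dtype)      # object dtype, not a numeric one

# Indexing is where the failure finally shows up.
raised = False
try:
    arr[0]
except IndexError:
    raised = True
print(raised)
```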

Upvotes: 2

Views: 276

Answers (2)

MSeifert

Reputation: 152765

No, it's a design decision. If you want to pass in a generator, you need to use np.fromiter:

>>> np.fromiter((i for i in range(10)), float)
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])

or convert it to a list before calling np.array:

>>> np.array(list(your_iterator))

One reason for this is that numpy needs to iterate over the object several times: once to determine the length/dtype of the resulting array and once to insert the items. That doesn't play well with generators and iterators, which can be iterated only once. Also, generators can be infinite in length (e.g. itertools.count) and/or use "too much memory".

The rationale probably was: if someone wants to use a generator to create an array, it's going to use a lot of memory and be slow, so it should be done intentionally: either by converting it to a list or by using np.fromiter.
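For the infinite case, one possible approach (a sketch, not the only way) is to bound the generator with itertools.islice and pass the optional count argument of np.fromiter so the result can be allocated up front. The names squares and first_five are made up for this example.

```python
import itertools
import numpy as np

# An infinite generator: it can never be materialized as a whole.
squares = (i * i for i in itertools.count())

# Take only the first 5 items; count lets fromiter preallocate the output.
first_five = np.fromiter(itertools.islice(squares, 5), dtype=float, count=5)
print(first_five)
```

Only the five requested items are consumed from the generator, so later items remain available to subsequent calls.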

Upvotes: 1

Daniel

Reputation: 42778

This is not a bug, but normal behaviour. If you want to create an array from an iterator, use fromiter:

>>> import numpy
>>> a = (i*i for i in range(7))
>>> numpy.array(a)
array(<generator object <genexpr> at 0x10dbc1b40>, dtype=object)
>>> numpy.fromiter(a, dtype=float)
array([  0.,   1.,   4.,   9.,  16.,  25.,  36.])

Upvotes: 3
