Reputation: 1275
If a generator is passed to numpy.array, numpy does not iterate over the object, nor does it complain, even if copy=True is passed. It silently wraps the generator in a zero-dimensional object array, so any attempt to index into the array fails, often much later and in far-away code.
I understand that numpy wants to know the size of the array up front, but this behavior is error-prone. It should either copy the items into an intermediate list or raise an exception.
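A minimal reproduction of the behavior described above (the exact shape/dtype shown assumes current NumPy, which stores the generator object itself in a 0-d object array):

```python
import numpy as np

gen = (i * i for i in range(5))
arr = np.array(gen, copy=True)  # no error, no iteration

# The generator was wrapped, not consumed
print(arr.shape)   # ()
print(arr.dtype)   # object

# Indexing only blows up here, possibly far from the np.array call
try:
    arr[0]
except IndexError as e:
    print(e)
```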
Upvotes: 2
Views: 276
Reputation: 152765
No, it's a design decision, not a bug. If you want to pass in a generator, you need to use np.fromiter:
>>> np.fromiter((i for i in range(10)), float)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
or convert it to a list before calling np.array:
>>> np.array(list(your_iterator))
One reason for this is that numpy needs to iterate over the object several times: once to determine the length/dtype of the resulting array and once to insert the items. That doesn't play well with generators and iterators, which can be iterated only once. Also, generators can be infinite in length (e.g. itertools.count) and/or use "too much memory".
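For the infinite-generator case, np.fromiter accepts a count argument that caps how many items are read (count is a documented parameter of np.fromiter, and passing it also lets NumPy preallocate the output):

```python
import itertools
import numpy as np

# Read only the first 5 items from an otherwise infinite iterator
a = np.fromiter(itertools.count(), dtype=float, count=5)
print(a)  # [0. 1. 2. 3. 4.]
```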
The rationale probably was: if someone wants to use a generator to create an array, it's going to use a lot of memory and be slow, so it should be done intentionally, either by casting to a list or by using np.fromiter.
Upvotes: 1
Reputation: 42778
This is not a bug, but normal behaviour. If you want to create an array from an iterator, use fromiter:
>>> import numpy
>>> a = (i*i for i in range(7))
>>> numpy.array(a)
array(<generator object <genexpr> at 0x10dbc1b40>, dtype=object)
>>> numpy.fromiter(a, dtype=float)
array([ 0., 1., 4., 9., 16., 25., 36.])
Upvotes: 3