Nope

Reputation: 35990

Numpy: Is there an array size limit?

I'm learning to use NumPy and wanted to compare how fast it sums a sequence of numbers against plain Python, so I wrote this code:

import time
import numpy

# Sum 0..999999 with NumPy, using the default integer dtype
np_array = numpy.arange(1000000)
start = time.time()
sum_ = np_array.sum()
print time.time() - start, sum_

>>> 0.0 1783293664

python_list = range(1000000)
start = time.time()
sum_ = sum(python_list)
print time.time() - start, sum_

>>> 0.390000104904 499999500000

The python_list sum is correct; the NumPy sum is not.

If I run the same code summing only up to 1000, both print the right answer. Is there an upper limit on the length of a NumPy array, or is the problem in the NumPy sum function?

Thanks for your help

Upvotes: 3

Views: 5743

Answers (3)

Alex Martelli

Reputation: 881595

Notice that 499999500000 % 2**32 equals exactly 1783293664, i.e., numpy is doing its arithmetic modulo 2**32, because the array has a 32-bit integer type (the default that numpy.arange picked on your platform, since you didn't specify a dtype).

Make np_array = numpy.arange(1000000, dtype=numpy.uint64), for example, and your sum will come out OK (although of course there are still limits, with any finite-size number type).
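
As a rough sketch of the wraparound and the fix (Python 3 syntax; the explicit `dtype=np.int32` accumulator is an assumption added here so that the overflow reproduces even on platforms where NumPy's default integer is already 64-bit):

```python
import numpy as np

n = 1000000
exact = n * (n - 1) // 2  # closed-form sum of 0..n-1 as an arbitrary-precision Python int

# Force a 32-bit accumulator: the running total silently wraps around.
wrapped = np.arange(n, dtype=np.int32).sum(dtype=np.int32)

# A 64-bit array and accumulator are wide enough for this particular sum.
wide = np.arange(n, dtype=np.int64).sum()

print(exact, int(wrapped), int(wide))
```

The wrapped value should match `exact % 2**32`, which is the number the question reported.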

You can use dtype=numpy.object (or simply dtype=object; recent NumPy releases removed the numpy.object alias) to tell numpy that the array holds generic Python objects; of course, performance will decay as generality increases.
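
A minimal illustration of the object-dtype route, assuming Python 3 and using the built-in `object` as the dtype (the two values are chosen here only because their sum no longer fits in 64 bits):

```python
import numpy as np

# Each element is an ordinary Python int, so additions use Python's
# arbitrary-precision arithmetic instead of fixed-width machine integers.
big = np.array([2**63, 2**63], dtype=object)
total = big.sum()

print(total, type(total))
```

The trade-off is the one described above: every addition goes through Python object machinery, so this is expected to be much slower than a native integer dtype.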

Upvotes: 6

Joe Koberg

Reputation: 26699

Numpy is creating an array of 32-bit ints (the default integer type on your platform). When it sums them, it accumulates the total in a 32-bit value, which overflows.

if 499999500000L % (2**32) == 1783293664L:
    print "Overflowed a 32-bit integer"

You can explicitly choose the data type at array creation time:

a = numpy.arange(1000000, dtype=numpy.uint64)
a.sum()  # -> 499999500000

Upvotes: 10

S.Lott

Reputation: 391846

Summing the standard list switched over to arithmetic with the long type once the running total got larger than a 32-bit int could hold.

The numpy array did not switch to long, and suffered from integer overflow. The price of speed is a smaller range of allowed values.

>>> 499999500000 % 2**32
1783293664L
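
To check up front whether a given fixed-width type has enough range, one option is `numpy.iinfo`. This is only a sketch in Python 3 syntax; the `fits` dictionary is a name made up for this example:

```python
import numpy as np

n = 1000000
exact = n * (n - 1) // 2  # the sum we want to be able to store

# For each candidate dtype, compare the exact sum against the largest
# value that dtype can represent.
fits = {}
for dtype in (np.int32, np.uint32, np.int64, np.uint64):
    limit = np.iinfo(dtype).max
    fits[np.dtype(dtype).name] = exact <= limit
    print(np.dtype(dtype).name, limit, exact <= limit)
```

Neither 32-bit type can hold this sum, while both 64-bit types can.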

Upvotes: 10
