ujjwal anand

Reputation: 23

Is there any maximum size defined for a numpy array?

If I try to execute:

np.empty(shape= (108698,200,1000))

in my Jupyter notebook, it throws an error:

MemoryError                               Traceback (most recent call last)
<ipython-input-35-0aedb09803e9> in <module>()
      1 import numpy as np
      2 #np.empty(shape=(108698-0,200,1000))
----> 3 np.empty(shape= (108698,200,1000))
      4 #np.empty(shape=(end-start,n_words,embedding_size))

But when I try to execute

np.empty(shape=(84323,200,1000))

it executes without any errors.

So is there any way to run

np.empty(shape= (108698,200,1000)) 

without increasing the RAM of my machine?

Upvotes: 0

Views: 4628

Answers (4)

V K Shopov

Reputation: 101

Well, there is no upper limit, but we can (roughly) estimate the amount of memory an ndarray needs.
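The hr_size used below is not part of NumPy; it is just a small helper that formats a byte count in binary (1024-based) units. A minimal sketch of such a helper might be:

def hr_size(nbytes):
    # Format a byte count with binary (1024-based) units, e.g. 1000000 -> '976.6K'.
    for unit in ('B', 'K', 'M', 'G', 'T'):
        if nbytes < 1024:
            return '%.1f%s' % (nbytes, unit)
        nbytes /= 1024.0
    return '%.1f%s' % (nbytes, 'P')

With it we can check a small array first: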

>>> arr = np.empty(shape=(100,10,1000), dtype='uint8')
>>> hr_size(arr.nbytes)
'976.6K'

For an ndarray with 1 million elements (every 'uint8' element requires one byte) we need '976.6K' of memory.

For an ndarray with shape=(84323,200,1000) and dtype='uint8':

>>> hr_size(84323*200*1000)
'15.7G'

we need more than 15G.

And finally, for an ndarray with shape=(108698,200,1000) and dtype='uint8':

>>> hr_size(108698*200*1000)
'20.2G'

we need more than 20G.

If dtype is 'int64', the estimated amount of memory should be increased eight times.
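As a rough check with the same hr_size helper (this computation is not from the original answer): np.empty defaults to float64, also 8 bytes per element, which is why the call in the question needs about eight times the uint8 estimate:

>>> hr_size(108698*200*1000*8)   # default float64: 8 bytes per element
'162.0G'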

Upvotes: 0

Metaphox

Reputation: 2094

There is no upper limit defined for the shape, but the total size of the array is limited by numpy.intp, which is int32 on 32-bit builds and int64 on 64-bit builds.

You can either use a sparse matrix from SciPy or limit the dtype of your large (108698,200,1000) array to int8, which should work.
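A quick sketch (not part of the original answer) of how to check that index limit on your own build:

import numpy as np

# numpy.intp is the platform's pointer-sized integer: int32 on 32-bit Python,
# int64 on 64-bit Python. The total element count must fit into it.
print(np.iinfo(np.intp).max)   # 2147483647 on 32-bit, 9223372036854775807 on 64-bit
print(108698 * 200 * 1000)     # 21739600000 elements -- too many for a 32-bit intp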

Upvotes: 0

MB-F

Reputation: 23637

You can work with arrays that do not fit into memory by using memory-mapped files. NumPy has facilities for this: numpy.memmap.

E.g.:

import numpy as np
x = np.memmap('test.bin', mode='w+', shape=(108698,200,1000))

However, on 32-bit Python the files are still limited to 2 GB.
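A rough usage sketch continuing from the snippet above (the written values are just placeholders): only the slices you touch are paged into RAM, while the full data set lives in test.bin on disk.

x[0, :, :] = 1      # write one 200x1000 slab; dtype defaults to uint8
x.flush()           # push the dirty pages back to the file
print(x[0, 0, :5])  # -> [1 1 1 1 1]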

Upvotes: 1

Lucas Hendren

Reputation: 2826

No. While it depends on what you're running, if you have reached the maximum available memory you can't just create more. For example, with 64-bit NumPy at 8 bytes per entry (the default float64), that array would take about 174 GB in all, which is far too much. If you know the data entries and are willing to use something besides plain NumPy, you could look into sparse arrays. Sparse arrays store only the nonzero elements and their position indices, which could save you a lot of space, as in the sketch below.
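For instance, a mostly-zero 2D slice stored with scipy.sparse takes a small fraction of its dense footprint (note that scipy.sparse matrices are two-dimensional, so a 3D array would have to be handled slice by slice); the numbers below are just an illustration:

import numpy as np
from scipy import sparse

dense = np.zeros((200, 1000), dtype=np.float64)
dense[0, :10] = 1.0                 # only ten nonzero entries
s = sparse.csr_matrix(dense)

print(dense.nbytes)                                         # 1600000 bytes
print(s.data.nbytes + s.indices.nbytes + s.indptr.nbytes)   # roughly 900 bytes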

Upvotes: 2
