Reputation: 724
Please refer to the execution below:
import sys
import numpy as np
_list = [2,55,87]
print(f'1 - Memory used by Python List - {sys.getsizeof(_list)}')
narray = np.array([2,55,87])
size = narray.size * narray.itemsize
print(f'2 - Memory usage of np array using itemsize - {size}')
print(f'3 - Memory usage of np array using getsizeof - {sys.getsizeof(narray)}')
Here is what I get as a result:
1 - Memory used by Python List - 80
2 - Memory usage of np array using itemsize - 12
3 - Memory usage of np array using getsizeof - 116
One way of calculating suggests the numpy array consumes far less memory, but the other says it consumes more than a regular Python list. Shouldn't I be using sys.getsizeof with a numpy array? What am I doing wrong here?
Edit - I just checked: an empty Python list consumes 56 bytes, whereas an empty np array consumes 104. Is this space being used to point to the associated built-in methods and attributes?
Upvotes: 8
Views: 4901
Reputation: 356
Because numpy arrays have shapes, strides, and other member variables that define the data layout, it is reasonable that they (might) require some extra memory for this!
A list, on the other hand, has no specific type, shape, etc.
However, if you start appending elements to a list instead of simply writing them as an array, and also go to a larger number of elements, e.g. 1e7, you will see different behaviour!
Example case:
import numpy as np
import sys
N = int(1e7)
narray = np.zeros(N)
mylist = []
# Append one element at a time so the list grows dynamically
for i in range(N):
    mylist.append(narray[i])
print("size of np.array:", sys.getsizeof(narray))
print("size of list :", sys.getsizeof(mylist))
On my (ASUS) Ubuntu 20.04 PC I get:
size of np.array: 80000104
size of list : 81528048
Note that the memory footprint is not the only thing that matters for an application's efficiency! The data layout is sometimes far more important.
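The list figure above covers only the list object and its pointer array, and that array grows in steps as elements are appended. A minimal sketch of this, assuming CPython (the exact byte counts depend on the interpreter version and platform, so none are quoted here; the names sizes and distinct_sizes are only for illustration):

```python
import sys

# Record the container size reported after each append.
# Only the list object and its pointer array are measured, not the elements.
mylist = []
sizes = []
for i in range(20):
    mylist.append(i)
    sizes.append(sys.getsizeof(mylist))

# CPython over-allocates, so the size changes in steps rather than on every append.
distinct_sizes = sorted(set(sizes))
print(distinct_sizes)
```

Seeing fewer distinct sizes than appends is the sign of over-allocation, which is why a list built by appending can report a slightly larger container than one written out as a literal.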
Upvotes: 1
Reputation: 231385
Search on [numpy]getsizeof produces many potential duplicates.
The basic points are:
A list is a container, and the getsizeof docs warn us that it returns only the size of the container, not the size of the elements that it references. So by itself it is an unreliable measure of the total size of a list (or tuple or dict).
getsizeof is a fairly good measure for arrays, if you take into account the roughly 100 bytes of "overhead". That overhead will be a big part of a small array, and a minor thing when looking at a large one. nbytes is the simpler way of judging array memory use.
But for views, the data buffer is shared with the base, and doesn't count when using getsizeof.
Object dtype arrays contain references like lists, so the same getsizeof caution applies.
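A rough sketch of these points, using the values from the question. The names container_only and with_elements are made up for illustration, and the element sum is only an estimate (CPython shares small integer objects, so it overstates the unique cost). Exact byte counts vary by Python and numpy version, so none are quoted:

```python
import sys
import numpy as np

values = [2, 55, 87]
arr = np.array(values)

# getsizeof on a list counts the container (header + pointers) only.
container_only = sys.getsizeof(values)
# Adding the referenced int objects gives a rough total for this flat list.
with_elements = container_only + sum(sys.getsizeof(v) for v in values)

# A slice is a view: it shares arr's data buffer and does not own it.
view = arr[:2]

print("list, container only :", container_only)
print("list, with elements  :", with_elements)
print("array nbytes         :", arr.nbytes)
print("array getsizeof      :", sys.getsizeof(arr))
print("view getsizeof       :", sys.getsizeof(view))
print("view owns its data   :", view.flags.owndata)
```

The view should report less than the array it was sliced from, because its buffer belongs to the base and is left out of the count.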
Overall I think understanding how arrays and lists are stored is a more useful way of judging their respective memory use. Focus more on computational efficiency than on memory use. For small stuff, and iterative uses, lists are better. Arrays are best when they are large, and you use array methods to do the calculations.
Upvotes: 4
Reputation: 61910
The calculation using:
size = narray.size * narray.itemsize
does not include the memory consumed by non-element attributes of the array object. This can be verified against the documentation of ndarray.nbytes:
>>> x = np.zeros((3,5,2), dtype=np.complex128)
>>> x.nbytes
480
>>> np.prod(x.shape) * x.itemsize
480
In the above link, it can be read that ndarray.nbytes:
Does not include memory consumed by non-element attributes of the array object.
Note that from the code above you can conclude that your calculation excludes non-element attributes, given that the value is equal to the one from ndarray.nbytes.
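To see how much sys.getsizeof reports beyond the element buffer, one can subtract nbytes from it. This is only a sketch: array_overhead is a hypothetical helper, not a numpy function, and it assumes 1-D arrays that own their data (as np.zeros returns). The overhead value itself depends on the numpy version and platform.

```python
import sys
import numpy as np

def array_overhead(a):
    """Bytes reported by sys.getsizeof beyond the element buffer (a.nbytes)."""
    return sys.getsizeof(a) - a.nbytes

overheads = {}
for n in (0, 3, 1_000, 1_000_000):
    a = np.zeros(n)  # 1-D float64 array that owns its data
    # nbytes matches the manual size * itemsize calculation.
    assert a.nbytes == a.size * a.itemsize
    overheads[n] = array_overhead(a)
    print(n, a.nbytes, overheads[n])
```

If the overhead comes out the same for every length, the gap between the two measurements in the question is a fixed per-array cost rather than something that scales with the data.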
A list of the non-element attributes can be found in the section Array Attributes, included here for completeness:
ndarray.flags Information about the memory layout of the array.
ndarray.shape Tuple of array dimensions.
ndarray.strides Tuple of bytes to step in each dimension when traversing an array.
ndarray.ndim Number of array dimensions.
ndarray.data Python buffer object pointing to the start of the array’s data.
ndarray.size Number of elements in the array.
ndarray.itemsize Length of one array element in bytes.
ndarray.nbytes Total bytes consumed by the elements of the array.
ndarray.base Base object if memory is from some other object.
With regard to sys.getsizeof, it can be read in the documentation (emphasis mine) that:
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
Upvotes: 7