Reputation: 4523
I am trying to determine the sizes of different data types in Python, and wrote the following code. Each line's output is shown as a comment beside it.
import numpy as np
import sys
print(sys.version) # 3.6.7
print(np.version.version) # 1.15.4
i = 2
print(type(i)) # <class 'int'>
print(sys.getsizeof(i)) # 28
j = np.int64(3)
print(type(j)) # <class 'numpy.int64'>
print(sys.getsizeof(j)) # 32
a = np.array([[1,2],[3,4],[5,6]])
print(type(a)) # <class 'numpy.ndarray'>
print(sys.getsizeof(a)) # 160
print(type(a[0][0])) # <class 'numpy.int64'>
print(sys.getsizeof(a[0][0])) # 32
a = np.array([[1,2],[3,4],[5,6],[7,8]])
print(sys.getsizeof(a)) # 176
From the above outputs, the size of the 6-element array is 160 bytes and the size of the 8-element array is 176 bytes, so can I conclude that each element in the array takes 8 bytes and the array's (constant) header takes 112 bytes? Is the size of each element constant, or does it depend on its value (large or small)?
Also, when I print the size of a[0][0], why do I get 32 and not 8? What exactly is the math behind Python and NumPy integers and arrays?
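If my guess about a fixed header plus 8 bytes per element is right, then subtracting the raw data bytes (nbytes) from getsizeof should leave the same constant for both arrays. A quick check of that arithmetic (assuming getsizeof of an array that owns its data counts a fixed header plus the data buffer):

```python
import sys
import numpy as np

a6 = np.array([[1, 2], [3, 4], [5, 6]])          # 6 elements
a8 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])  # 8 elements

# If each element costs itemsize bytes plus a constant per-array header,
# the two differences below should be equal (112 in my outputs above).
print(sys.getsizeof(a6) - a6.nbytes)
print(sys.getsizeof(a8) - a8.nbytes)
```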
Upvotes: 2
Views: 1793
Reputation: 4392
Actually, sys.getsizeof is not a suitable function for determining the size of a NumPy array; it is only reliable for built-in Python objects. From the Python documentation:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
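A small illustration of that last point: plain ints are not tracked by the garbage collector, so for them getsizeof returns exactly __sizeof__(), while container types such as lists are tracked and pick up the extra collector overhead.

```python
import sys

i = 2
# ints are not GC-tracked, so no collector overhead is added
print(sys.getsizeof(i) == i.__sizeof__())

lst = [1, 2, 3]
# lists are GC-tracked, so getsizeof adds the collector header on top
print(sys.getsizeof(lst) - lst.__sizeof__())  # the GC overhead in bytes
```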
For a numpy.ndarray, use nbytes, which is size * itemsize:
a = np.array([[3,4],[8,0],[9,8],[7,0]])
a.size # 8
a.itemsize # 8
a.nbytes # 64
a = np.array([[3,4],[8,0],[9,8],[7,0]],dtype=np.int32)
a.size # 8
a.itemsize # 4
a.nbytes # 32
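The two measures can also be reconciled: for an array that owns its data, getsizeof here reports a fixed per-array overhead plus nbytes, which is where the 112-byte constant in the question comes from (the exact overhead is implementation-specific, so don't rely on the number itself):

```python
import sys
import numpy as np

a64 = np.array([[3, 4], [8, 0], [9, 8], [7, 0]], dtype=np.int64)
a32 = np.array([[3, 4], [8, 0], [9, 8], [7, 0]], dtype=np.int32)

# Subtracting the data bytes leaves the per-array overhead,
# which is the same regardless of dtype.
print(sys.getsizeof(a64) - a64.nbytes)
print(sys.getsizeof(a32) - a32.nbytes)
```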
Upvotes: 2