Reputation: 35
I have a NumPy array that I flatten with np.ravel(), and I am confused by the sizes reported for the two arrays:
>>> import sys
>>> import numpy as np
>>> array = np.arange(15).reshape(3, 5)
>>> sys.getsizeof(array)
112
>>> sys.getsizeof(array.ravel())
96
>>> array.size
15
>>> array.ravel().size
15
>>> array = np.arange(30).reshape(5, 6)
>>> sys.getsizeof(array)
112
>>> sys.getsizeof(array.ravel())
96
>>> array.size
30
As seen above, two arrays with different numbers of elements report the same memory size. Why does this happen?
Upvotes: 2
Views: 784
Reputation: 1003
(1) ravel() usually (see @user2357112's comments below) returns an object that references the items of the input rather than making a new copy:
>>> a = np.arange(15)
>>> b = a.ravel()
>>> b
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> a[0] = 5
>>> b
array([ 5,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
sys.getsizeof(array.ravel()) doesn't change no matter how big array is, because such a view does not own the data buffer and only the fixed-size array header is counted.
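Whether ravel() handed back a view or a copy can be checked directly. This is a minimal sketch, assuming a NumPy version that provides np.shares_memory; the variable names are illustrative and not taken from the question:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# C-contiguous input: ravel() can return a view onto the same buffer.
flat_view = a.ravel()
view_shares = np.shares_memory(a, flat_view)

# The transpose is not C-contiguous, so ravel() has to make a copy.
flat_copy = a.T.ravel()
copy_shares = np.shares_memory(a, flat_copy)

print(view_shares, flat_view.flags.owndata)  # view: shared buffer, does not own data
print(copy_shares, flat_copy.flags.owndata)  # copy: separate buffer, owns data
```

The transposed case is one situation where "usually" matters: the result owns a fresh buffer, so sys.getsizeof() on it would include the data.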
(2) Similarly, reshape() doesn't copy the items:
>>> b = a.reshape(3,5)
>>> b
array([[ 5,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> a[0] = 7
>>> b
array([[ 7,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
Again, the size reported by sys.getsizeof() for the reshaped array stays constant when you increase the size of the input array (and adjust the shape accordingly).
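The question's two arrays can be compared side by side. This is a sketch, assuming that np.arange(n).reshape(...) yields a view whose owning array is reachable through .base; the names small and large are made up for illustration:

```python
import sys
import numpy as np

small = np.arange(15).reshape(3, 5)  # view onto the buffer created by arange
large = np.arange(30).reshape(5, 6)  # view onto a buffer twice as large

# Neither view owns its data, so getsizeof() counts only the array header.
print(small.flags.owndata, large.flags.owndata)
same_header = sys.getsizeof(small) == sys.getsizeof(large)
print(same_header)

# The owning 1-D arrays do include their data in the reported size.
owner_growth = sys.getsizeof(large.base) - sys.getsizeof(small.base)
print(owner_growth, 15 * small.itemsize)
```

The exact byte counts depend on the platform and NumPy version, so the sketch compares sizes rather than hard-coding them.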
(3) Yes, the other answers are right in that sys.getsizeof() may be missing some of the memory needed by third-party objects. However, a simple test suggests that the main part of numpy arrays is accounted for:
>>> sys.getsizeof(np.arange(20))
256
>>> sys.getsizeof(np.arange(40))
416
>>> sys.getsizeof(np.arange(400))
3296
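The same test can be extended to separate the data from the fixed overhead. This is a rough sketch, assuming the arrays own their data (as freshly created np.arange results do); overheads is a name introduced here for illustration:

```python
import sys
import numpy as np

overheads = []
for n in (20, 40, 400):
    arr = np.arange(n)
    # Whatever getsizeof() reports beyond the raw data is per-array overhead.
    overhead = sys.getsizeof(arr) - arr.nbytes
    overheads.append(overhead)
    print(n, arr.nbytes, overhead)
```

If the overhead comes out the same for every n, the growth in sys.getsizeof() is entirely due to the element data.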
Upvotes: 2
Reputation: 2706
sys.getsizeof is not guaranteed to return the memory size of a NumPy array. As per the documentation at https://docs.python.org/2/library/sys.html#sys.getsizeof:
All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
To find the memory size of a NumPy array's data, use nbytes, or compute it by multiplying size by itemsize:
array.nbytes
array.size * array.itemsize
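Both expressions give the same number, and they work for views as well. A small sketch using the 5x6 array from the question (flat is an illustrative name):

```python
import numpy as np

array = np.arange(30).reshape(5, 6)
flat = array.ravel()

data_bytes = array.nbytes                # bytes used by the elements
computed = array.size * array.itemsize   # same value, computed by hand
print(data_bytes, computed, flat.nbytes)
```

Unlike sys.getsizeof(), nbytes reports the element data regardless of whether the array owns its buffer or is a view.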
As Joachim Wagner pointed out, there is a slight overhead for each array, which can be an issue if you have a lot of NumPy arrays. sys.getsizeof(array) might give you the size of this overhead, but I am not sure.
Upvotes: 0
Reputation: 510
As Derte mentioned, sys.getsizeof doesn't tell you the size of the array's data. The 96 bytes you got hold information about the array if it is 1-dimensional, and the 112 bytes do the same if it is multi-dimensional. For an array that owns its data, each additional element increases the size by 8 bytes, assuming you are using dtype=int64.
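These numbers can be probed without hard-coding them. This is a sketch, assuming int64 elements and using views to isolate the header; header_1d, header_2d and per_element are illustrative names, and the header sizes themselves vary between NumPy versions and platforms:

```python
import sys
import numpy as np

base = np.arange(30, dtype=np.int64)

# Views carry no data of their own, so these sizes are header only.
view_1d = base[:]
view_2d = base.reshape(5, 6)
header_1d = sys.getsizeof(view_1d)
header_2d = sys.getsizeof(view_2d)
print(header_1d, header_2d)

# For arrays that own their data, one more int64 element adds its itemsize.
per_element = sys.getsizeof(np.arange(31, dtype=np.int64)) - sys.getsizeof(base)
print(per_element)
```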
Upvotes: 0