Timur

Reputation: 35

Why does a NumPy array have a size of 112 bytes, while the flattened array has 96 bytes?

I have a NumPy array that I flatten with np.ravel(), and I am confused by the sizes reported for the two arrays:

>>> import sys
>>> import numpy as np
>>> array = np.arange(15).reshape(3, 5)
>>> sys.getsizeof(array)
112
>>> sys.getsizeof(array.ravel())
96
>>> array.size
15
>>> array.ravel().size
15
>>> array = np.arange(30).reshape(5, 6)
>>> sys.getsizeof(array)
112
>>> sys.getsizeof(array.ravel())
96
>>> array.size
30

As shown above, the two arrays report the same memory size even though they hold different numbers of elements. Why does this happen?

Upvotes: 2

Views: 784

Answers (3)

Joachim Wagner

Reputation: 1003

(1) ravel() usually (see @user2357112's comments below) returns a view that references the items of the input rather than making a new copy:

>>> a = np.arange(15)
>>> b = a.ravel()
>>> b
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> a[0]=5
>>> b
array([ 5,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]) 

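Because the result is a view that does not own its data, sys.getsizeof(array.ravel()) counts only the array object itself, so it doesn't change no matter how big array is.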

(2) Similarly, reshape() doesn't copy the items:

>>> b = a.reshape(3,5)
>>> b
array([[ 5,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> a[0] = 7
>>> b
array([[ 7,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]]) 

Again, the size reported by sys.getsizeof() stays constant when you increase the size of the input array (and adjust the shape accordingly).
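Points (1) and (2) can be checked with a short sketch. It assumes a reasonably recent NumPy; the variable names are only illustrative, and the exact byte counts depend on the NumPy version and platform, so only the relationships between them are shown.

```python
import sys
import numpy as np

a = np.arange(15)      # freshly allocated, owns its data buffer
b = a.reshape(3, 5)    # view onto a's buffer
c = b.ravel()          # still a view, because b is C-contiguous
d = b.T.ravel()        # the transpose is not C-contiguous, so ravel() must copy

# Only arrays that own their data include the buffer in sys.getsizeof().
print(a.flags.owndata, b.flags.owndata, c.flags.owndata)  # True False False
print(d.flags.owndata)                                    # True
print(np.shares_memory(a, c))                             # True

# c and d are both 1-D, so they differ only by the data buffer that d owns.
print(sys.getsizeof(d) - sys.getsizeof(c) == d.nbytes)
```

This would explain the numbers in the question: np.arange(15).reshape(3, 5) is itself a view of the temporary arange result, so sys.getsizeof() sees only the array object, not the elements.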

(3) Yes, the other answers are right in that sys.getsizeof() may miss some of the memory needed by third-party objects. However, a simple test suggests that the data buffer of a NumPy array is accounted for when the array owns its data:

>>> sys.getsizeof(np.arange(20))
256
>>> sys.getsizeof(np.arange(40))
416
>>> sys.getsizeof(np.arange(400))
3296
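To make the pattern in those numbers explicit, here is a small sketch that subtracts nbytes from the reported size. The dtype is pinned to int64 as an assumption (the default integer type can differ by platform), and the helper list `overheads` is just for illustration.

```python
import sys
import numpy as np

overheads = []
for n in (20, 40, 400):
    arr = np.arange(n, dtype=np.int64)   # owns its data
    total = sys.getsizeof(arr)
    overheads.append(total - arr.nbytes)
    print(n, total, arr.nbytes, total - arr.nbytes)

# The leftover is the fixed per-array overhead; it should be the same
# for every length (96 bytes would match the figures quoted above).
print(len(set(overheads)) == 1)
```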

Upvotes: 2

Derte Trdelnik

Reputation: 2706

sys.getsizeof does not necessarily return the memory size of a NumPy array. As per the documentation (https://docs.python.org/2/library/sys.html#sys.getsizeof):

All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

To find the memory size of a NumPy array's data, use nbytes, or compute it by multiplying size by itemsize:

array.nbytes

array.size * array.itemsize

As Joachim Wagner pointed out, there is a slight overhead for each array, which can be an issue if you have a lot of NumPy arrays. sys.getsizeof(array) might give you the size of this overhead, but I am not sure.
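As a quick illustration, applied to the arrays from the question (with dtype pinned to int64 as an assumption, so that itemsize is 8):

```python
import numpy as np

array = np.arange(15, dtype=np.int64).reshape(3, 5)
print(array.nbytes)                  # 120
print(array.size * array.itemsize)   # 120

# nbytes describes the elements the array exposes, so a view reports
# the same figure even though it does not own the underlying buffer.
flat = array.ravel()
print(flat.nbytes)                   # 120

bigger = np.arange(30, dtype=np.int64).reshape(5, 6)
print(bigger.nbytes)                 # 240
```

Unlike sys.getsizeof(), these figures scale with the number of elements regardless of whether the array is a view.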

Upvotes: 0

Mohammed Abuiriban

Reputation: 510

As Derte mentioned, sys.getsizeof doesn't tell you the size of the array's data here. The 96 bytes you got hold information about the array when it is 1-dimensional, and the 112 bytes when it is 2-dimensional. For an array that owns its data, each additional element increases the size by 8 bytes, assuming dtype=int64.
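A rough way to see the per-dimension part of that overhead is to reshape one buffer into views of different dimensionality and compare the reported sizes. This is only a sketch: the shapes are arbitrary, and the absolute numbers depend on the NumPy version and platform.

```python
import sys
import numpy as np

base = np.arange(24, dtype=np.int64)

# Each reshape returns a view, so none of these sizes include the data.
shapes = [(24,), (4, 6), (2, 3, 4)]
sizes = [sys.getsizeof(base.reshape(shape)) for shape in shapes]
for shape, size in zip(shapes, sizes):
    print(len(shape), "dimension(s):", size)

# Every extra dimension should cost the same fixed amount (room for one
# more shape entry and one more stride entry; 16 bytes would match the
# 96 vs. 112 in the question).
step_1d_to_2d = sizes[1] - sizes[0]
step_2d_to_3d = sizes[2] - sizes[1]
print(step_1d_to_2d, step_2d_to_3d)
```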

Upvotes: 0
