bigbounty
bigbounty

Reputation: 17368

Memory Size of list python

I was experimenting something with lists and I came around self-referencing lists. I searched through SO and got some of my basic questions about it answered. But when I tried to get the memory size of self referencing lists of different lengths, I found an interesting pattern

Code to reproduce:

import sys

memory_size = {}

for length in range(50):
    lst = []
    for length_loop in range(length):
        lst.append(lst)
    memory_size[length] = sys.getsizeof(lst)

Value of memory_size:

{0: 64, 1: 96, 2: 96, 3: 96, 4: 96, 5: 128, 6: 128, 7: 128, 8: 128, 9: 192, 10: 192, 11: 192, 12: 192, 13: 192, 14: 192, 15: 192, 16: 192, 17: 264, 18: 264, 19: 264, 20: 264, 21: 264, 22: 264, 23: 264, 24: 264, 25: 264, 26: 344, 27: 344, 28: 344, 29: 344, 30: 344, 31: 344, 32: 344, 33: 344, 34: 344, 35: 344, 36: 432, 37: 432, 38: 432, 39: 432, 40: 432, 41: 432, 42: 432, 43: 432, 44: 432, 45: 432, 46: 432, 47: 528, 48: 528, 49: 528}

On plotting the above data points

enter image description here

Python 3.7.3 (default, Mar 27 2019, 16:54:48)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.5.0 -- An enhanced Interactive Python. Type '?' for help.

Why does the memory size of self-referencing list remain constant for certain range of lengths and increase after certain length?

And also the increase the increase in memory size is different

Upvotes: 4

Views: 2838

Answers (2)

juanpa.arrivillaga
juanpa.arrivillaga

Reputation: 95917

Lists always display this pattern if they are grown with append.

A key point to understand, sys.getsizeof doesn't account for the objects referenced in the list, only the size of the list object itself. Now, Python list objects are implemented as array lists underneath the hood, so essentially there is a PyObject header (like, 16 byte overhead or some such), then a primitive array of PyObject pointers (which is why they can be heterogenous, and reference themselves).

This underlying array is overallocated, and it's re-sized in such a way to guarantee amortized constant-time .append operations. Stated another way, Python list objects have amortized constant-time .append, so doing something like for x in range(N): my_list.append(0) is a linear time operation, because the underlying buffer is not re-allocated on each iteration.

Look, you see the same pattern with any object, like None:

In [24]: import sys
    ...:
    ...: memory_size = {}
    ...:
    ...: for length in range(50):
    ...:     lst = []
    ...:     for length_loop in range(length):
    ...:         lst.append(None)
    ...:     memory_size[length] = sys.getsizeof(lst)
    ...:

In [25]: memory_size
Out[25]:
{0: 72,
 1: 104,
 2: 104,
 3: 104,
 4: 104,
 5: 136,
 6: 136,
 7: 136,
 8: 136,
 9: 200,
 10: 200,
 11: 200,
 12: 200,
 13: 200,
 14: 200,
 15: 200,
 16: 200,
 17: 272,
 18: 272,
 19: 272,
 20: 272,
 21: 272,
 22: 272,
 23: 272,
 24: 272,
 25: 272,
 26: 352,
 27: 352,
 28: 352,
 29: 352,
 30: 352,
 31: 352,
 32: 352,
 33: 352,
 34: 352,
 35: 352,
 36: 440,
 37: 440,
 38: 440,
 39: 440,
 40: 440,
 41: 440,
 42: 440,
 43: 440,
 44: 440,
 45: 440,
 46: 440,
 47: 536,
 48: 536,
 49: 536}

And just to convince you, here is your self-referening list:

In [26]: import sys
    ...:
    ...: memory_size = {}
    ...:
    ...: for length in range(50):
    ...:     lst = []
    ...:     for length_loop in range(length):
    ...:         lst.append(lst)
    ...:     memory_size[length] = sys.getsizeof(lst)
    ...:

In [27]: memory_size
Out[27]:
{0: 72,
 1: 104,
 2: 104,
 3: 104,
 4: 104,
 5: 136,
 6: 136,
 7: 136,
 8: 136,
 9: 200,
 10: 200,
 11: 200,
 12: 200,
 13: 200,
 14: 200,
 15: 200,
 16: 200,
 17: 272,
 18: 272,
 19: 272,
 20: 272,
 21: 272,
 22: 272,
 23: 272,
 24: 272,
 25: 272,
 26: 352,
 27: 352,
 28: 352,
 29: 352,
 30: 352,
 31: 352,
 32: 352,
 33: 352,
 34: 352,
 35: 352,
 36: 440,
 37: 440,
 38: 440,
 39: 440,
 40: 440,
 41: 440,
 42: 440,
 43: 440,
 44: 440,
 45: 440,
 46: 440,
 47: 536,
 48: 536,
 49: 536}

The discrepencies in the individual size come down to things like Python version and system architecture (on a 32-bit system, pointers are 4 bytes instead of 8, for example, and different Python versions are free to change implementation details like the size of an empty list). Note, the above was run on Python 3.7, if I use another environemnt:

(base) juanarrivillaga@173-11-109-137-SFBA ~ % python -c "import sys; print(f'{sys.version}\nEmpty List Size: {sys.getsizeof([])}')"
3.8.1 (default, Jan  8 2020, 16:15:59)
[Clang 4.0.1 (tags/RELEASE_401/final)]
Empty List Size: 56

Upvotes: 7

Ankan Das
Ankan Das

Reputation: 298

The answer might be lying somewhere in this. While obviously, the differences in the size will depend upon your system architecture and python installation, the growth size can be explained through the implementation of lists in Python.

https://github.com/python/cpython/blob/master/Objects/listobject.c

Upvotes: 0

Related Questions