Reputation: 17368
I was experimenting something with lists and I came around self-referencing lists. I searched through SO and got some of my basic questions about it answered. But when I tried to get the memory size of self referencing lists of different lengths, I found an interesting pattern
Code to reproduce:
import sys
memory_size = {}
for length in range(50):
lst = []
for length_loop in range(length):
lst.append(lst)
memory_size[length] = sys.getsizeof(lst)
Value of memory_size:
{0: 64, 1: 96, 2: 96, 3: 96, 4: 96, 5: 128, 6: 128, 7: 128, 8: 128, 9: 192, 10: 192, 11: 192, 12: 192, 13: 192, 14: 192, 15: 192, 16: 192, 17: 264, 18: 264, 19: 264, 20: 264, 21: 264, 22: 264, 23: 264, 24: 264, 25: 264, 26: 344, 27: 344, 28: 344, 29: 344, 30: 344, 31: 344, 32: 344, 33: 344, 34: 344, 35: 344, 36: 432, 37: 432, 38: 432, 39: 432, 40: 432, 41: 432, 42: 432, 43: 432, 44: 432, 45: 432, 46: 432, 47: 528, 48: 528, 49: 528}
On plotting the above data points
Python 3.7.3 (default, Mar 27 2019, 16:54:48)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.5.0 -- An enhanced Interactive Python. Type '?' for help.
Why does the memory size of self-referencing list remain constant for certain range of lengths and increase after certain length?
And also the increase the increase in memory size is different
Upvotes: 4
Views: 2838
Reputation: 95917
Lists always display this pattern if they are grown with append.
A key point to understand, sys.getsizeof
doesn't account for the objects referenced in the list, only the size of the list object itself. Now, Python list
objects are implemented as array lists underneath the hood, so essentially there is a PyObject header (like, 16 byte overhead or some such), then a primitive array of PyObject pointers (which is why they can be heterogenous, and reference themselves).
This underlying array is overallocated, and it's re-sized in such a way to guarantee amortized constant-time .append
operations.
Stated another way, Python list
objects have amortized constant-time .append
, so doing something like for x in range(N): my_list.append(0)
is a linear time operation, because the underlying buffer is not re-allocated on each iteration.
Look, you see the same pattern with any object, like None
:
In [24]: import sys
...:
...: memory_size = {}
...:
...: for length in range(50):
...: lst = []
...: for length_loop in range(length):
...: lst.append(None)
...: memory_size[length] = sys.getsizeof(lst)
...:
In [25]: memory_size
Out[25]:
{0: 72,
1: 104,
2: 104,
3: 104,
4: 104,
5: 136,
6: 136,
7: 136,
8: 136,
9: 200,
10: 200,
11: 200,
12: 200,
13: 200,
14: 200,
15: 200,
16: 200,
17: 272,
18: 272,
19: 272,
20: 272,
21: 272,
22: 272,
23: 272,
24: 272,
25: 272,
26: 352,
27: 352,
28: 352,
29: 352,
30: 352,
31: 352,
32: 352,
33: 352,
34: 352,
35: 352,
36: 440,
37: 440,
38: 440,
39: 440,
40: 440,
41: 440,
42: 440,
43: 440,
44: 440,
45: 440,
46: 440,
47: 536,
48: 536,
49: 536}
And just to convince you, here is your self-referening list:
In [26]: import sys
...:
...: memory_size = {}
...:
...: for length in range(50):
...: lst = []
...: for length_loop in range(length):
...: lst.append(lst)
...: memory_size[length] = sys.getsizeof(lst)
...:
In [27]: memory_size
Out[27]:
{0: 72,
1: 104,
2: 104,
3: 104,
4: 104,
5: 136,
6: 136,
7: 136,
8: 136,
9: 200,
10: 200,
11: 200,
12: 200,
13: 200,
14: 200,
15: 200,
16: 200,
17: 272,
18: 272,
19: 272,
20: 272,
21: 272,
22: 272,
23: 272,
24: 272,
25: 272,
26: 352,
27: 352,
28: 352,
29: 352,
30: 352,
31: 352,
32: 352,
33: 352,
34: 352,
35: 352,
36: 440,
37: 440,
38: 440,
39: 440,
40: 440,
41: 440,
42: 440,
43: 440,
44: 440,
45: 440,
46: 440,
47: 536,
48: 536,
49: 536}
The discrepencies in the individual size come down to things like Python version and system architecture (on a 32-bit system, pointers are 4 bytes instead of 8, for example, and different Python versions are free to change implementation details like the size of an empty list). Note, the above was run on Python 3.7, if I use another environemnt:
(base) juanarrivillaga@173-11-109-137-SFBA ~ % python -c "import sys; print(f'{sys.version}\nEmpty List Size: {sys.getsizeof([])}')"
3.8.1 (default, Jan 8 2020, 16:15:59)
[Clang 4.0.1 (tags/RELEASE_401/final)]
Empty List Size: 56
Upvotes: 7
Reputation: 298
The answer might be lying somewhere in this. While obviously, the differences in the size will depend upon your system architecture and python installation, the growth size can be explained through the implementation of lists in Python.
https://github.com/python/cpython/blob/master/Objects/listobject.c
Upvotes: 0