Kartik
Kartik

Reputation: 91

Why is there a difference in sys.getsizeof for two differently created but equal lists?

I defined two lists as follows:

import sys
lst = list(range(1, 10, 1))
llst = ([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(sys.getsizeof(llst), sys.getsizeof(lst))

This is the output I see:

152 128

Why is there a difference in size of the two lists when they seem to have the same number of elements?

I'm using Python 3.10.5 on Windows 11.

Upvotes: 1

Views: 123

Answers (1)

S.B
S.B

Reputation: 16476

The difference comes from "over-allocation". From source code:

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * Add padding to make the allocated size multiple of 4.
     * The growth pattern is:  0, 4, 8, 16, 24, 32, 40, 52, 64, 76, ...
     * Note: new_allocated won't overflow because the largest possible value
     *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.

But the exact behavior is implementation detail!

In Python 3.10:

import sys

lst1 = list(range(1, 10))
lst2 = [item for item in range(1, 10)]
lst3 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst4 = []
for i in range(1, 10):
    lst4.append(i)

print(sys.getsizeof(lst1)) # 136
print(sys.getsizeof(lst2)) # 184
print(sys.getsizeof(lst3)) # 136
print(sys.getsizeof(lst4)) # 184

In Python 3.5.1:

import sys

lst1 = list(range(1, 10))
lst2 = [item for item in range(1, 10)]
lst3 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lst4 = []
for i in range(1, 10):
    lst4.append(i)

print(sys.getsizeof(lst1)) # 192
print(sys.getsizeof(lst2)) # 192
print(sys.getsizeof(lst3)) # 136
print(sys.getsizeof(lst4)) # 192

I know surely that this happens when .append() is called. (same thing with list-comprehension). That's why you see in both versions, lst2 and lst4 have the largest size.

Seems like in Python 3.10, for lst1, interpreter says OK from range object, I know (__len__ and __length_hint__) it requires 10 elements so I build a list of 10 elements. No need for over-allocation.

Upvotes: 7

Related Questions