Alex

Reputation: 1464

What causes the (big) size of Python lists?

I was messing around with sys.getsizeof and was a bit surprised when I got to lists and arrays:

>>> from sys import getsizeof as sizeof
>>> list_ = range(10**6)
>>> sizeof(list_)
8000072

Compared to an array:

>>> from array import array
>>> array_ = array('i', range(10**6))
>>> sizeof(array_)
56

It turns out the size of a list of integers tends towards 1/3 of the combined size of its elements, so it can't be holding the elements themselves:

>>> sizeof(10**8)
24
>>> for i in xrange(0,9):
...  round(sizeof(range(10**i)) / ((10**i) * 24.0), 4), "10**%s elements" % (i)
... 
(3.3333, '10**0 elements')
(0.6333, '10**1 elements')
(0.3633, '10**2 elements')
(0.3363, '10**3 elements')
(0.3336, '10**4 elements')
(0.3334, '10**5 elements')
(0.3333, '10**6 elements')
(0.3333, '10**7 elements')
(0.3333, '10**8 elements')

What causes this behavior: the list being large, yet nowhere near as large as the sum of its elements, and the array being so small?

Upvotes: 4

Views: 200

Answers (2)

dawg

Reputation: 104092

The getsizeof function does not include the size of the items held in a container like a list; you need to add up the sizes of the individual elements yourself.

Here is a recipe to do this, reproduced here:

from __future__ import print_function
from sys import getsizeof, stderr
from itertools import chain
from collections import deque
try:
    from reprlib import repr
except ImportError:
    pass

def total_size(o, handlers={}, verbose=False):
    """ Returns the approximate memory footprint an object and all of its contents.

    Automatically finds the contents of the following builtin containers and
    their subclasses:  tuple, list, deque, dict, set and frozenset.
    To search other containers, add handlers to iterate over their contents:

        handlers = {SomeContainerClass: iter,
                    OtherContainerClass: OtherContainerClass.get_elements}

    """
    dict_handler = lambda d: chain.from_iterable(d.items())
    all_handlers = {tuple: iter,
                    list: iter,
                    deque: iter,
                    dict: dict_handler,
                    set: iter,
                    frozenset: iter,
                   }
    all_handlers.update(handlers)     # user handlers take precedence
    seen = set()                      # track which object id's have already been seen
    default_size = getsizeof(0)       # estimate sizeof object without __sizeof__

    def sizeof(o):
        if id(o) in seen:       # do not double count the same object
            return 0
        seen.add(id(o))
        s = getsizeof(o, default_size)

        if verbose:
            print(s, type(o), repr(o), file=stderr)

        for typ, handler in all_handlers.items():
            if isinstance(o, typ):
                s += sum(map(sizeof, handler(o)))
                break
        return s

    return sizeof(o)

If you use that recipe and run this on a list, you can see the difference:

>>> alist=[[2**99]*10, 'a string', {'one':1}]
>>> print('getsizeof: {}, total_size: {}'.format(getsizeof(alist), total_size(alist)))
getsizeof: 96, total_size: 721
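
Applied to the question's million-element list, total_size also walks every int object the list references (the seen set keeps duplicates, such as cached small ints, from being counted twice), so the result will always come out larger than the bare getsizeof figure. A rough sanity check, assuming the same Python 2 build where range returns a list:

>>> list_ = range(10**6)
>>> total_size(list_) > getsizeof(list_)  # element objects are now included
True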

Upvotes: 0

Martijn Pieters

Reputation: 1124768

You've encountered an issue with array objects not reflecting their size correctly.

Up until Python 2.7.3, the array object's .__sizeof__() method did not account for the memory used by its elements. Python 2.7.4 and newer, as well as Python 3 releases made after August 2012, include a bug fix that adds the element storage to the reported size.

On Python 2.7.5 I see:

>>> sys.getsizeof(array_)
4000056L

which matches the 56 bytes my 64-bit system requires for the base object, plus 4 bytes per signed integer stored.
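
That figure is just the arithmetic of header plus element buffer:

>>> 56 + 4 * 10**6
4000056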

On Python 2.7.3, I see:

>>> sys.getsizeof(array_)
56L

Python list objects on my system store an 8-byte reference per element, so a list of the same million integers is naturally almost twice as large.
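
Working backwards from the question's own output shows the same breakdown, a fixed header plus one 8-byte pointer per element:

>>> 8000072 - 8 * 10**6  # fixed list header left over after the references
72
>>> 72 + 8 * 10**6       # header plus a million 8-byte pointers
8000072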

Upvotes: 3
