Python - tuples and memory management

Question

While answering a Python-related question I did some experimenting and found a tuple-related thing that I can't figure out: I have difficulties understanding why an empty tuple takes more space then a tuple with a single element in it (according to sys.getsizeof()). The code below has been executed using the upstream version of Python 2.7.9 on a 64 bit Debian Jessie systems (also tested with Python 3.4.2 where values are slightly different but the overall behaviour that I'm talking about is still there):

>> sys.getsizeof(())
56
>> sys.getsizeof((1))
24
>> sys.getsizeof((1,2))
72

As you can see there is quite a difference (factor 2.3 to be more precise) in an empty vs. a single-item tuple. Any ideas what is happening here? Since tuples are recursive data structures I assume that the values that sys.getsizeof() returns is the amount of memory the tuple object itself is taking plus the references to the objects it contains (if tuples store elements as references and not as values - that I don't know).My initial thought was that like dictionaries tuples are created in memory with a certain default size that can hold a couple of elements. I forgot the exact number of elements that an empty dictionary has reserved memory for but here is what I mean with a small example:

>> sys.getsizeof({})
280
>> sys.getsizeof({"a":0})
280
>> sys.getsizeof({"a":0, "b":1})
280

Looking a the tuples however doesn't seem to show the same behaviour since a tuple shrinks after the first element is added to a smaller size and then grows (as expected) with each element added to it. In addition to that lists don't seem to suffer from the same behaviour:

>> sys.getsizeof([])
72
>> sys.getsizeof([1])
80
>> sys.getsizeof([1,2])
88

Empty list memory-wise is smaller then a list with 1 or more elements - perfectly normal.

My second thought was that a single-element tuple is converted somehow into the single object it contains and all that is somehow wrapped so that it seems that it is actually a list (that is why len() works). Example:

>> sys.getsizeof((1))
24
>> sys.getsizeof(1)
24

This seems possible however I'm not convinced that it's actually happening.

Kevin · Accepted Answer

You are not creating single-item tuples. Any expression can be wrapped in parentheses without changing its meaning, so (1) is just the same as 1. To make a single element tuple, write (1,). When you do this, the memory consumption should become more reasonable.

But how can an integer require 24 bytes? Aren't integers usually 8 bytes at most?

In Python 3, all integers are arbitrary precision, and even in Python 2, all objects including integers have a certain amount of overhead for reference counting, type information, and other metadata. 24 bytes is actually quite reasonable for a full-blown Python object.

(Technically, small integers are interned, so the reference count is arguably unnecessary in this particular case, but removing it would complicate other parts of the interpreter for no significant benefit.)

While I'm answering, I'd also like to clear a few things up:

(if tuples store elements as references and not as values - that I don't know)

Python, generally speaking, does not allow Python objects to actually contain one another. In the C source, you will see a lot of PyObject* variables, but no PyObject variables. Python objects live on the heap, totally independent of one another. In other words, yes, the tuple is storing references, not the actual objects.

My initial thought was that like dictionaries tuples are created in memory with a certain default size that can hold a couple of elements.

There is no reason to do this. Tuples have a fixed size. Officially, they are immutable and do not change after creation. Unofficially, they can be altered at the C level, but these alterations are not supposed to be visible to Python code; you modify the tuple once after you've created it but before you've used it for anything, and thereafter you don't modify it again. Furthermore, these alterations do not actually change the size, they just fill in the array slots which already exist.

Because of this immutability, there is no reason for tuples to store more elements than they actually have. They don't need to support fast appends or other size-changing operations.

Python - tuples and memory management

Answers (2)

Related Questions