Reputation: 10103
Being mutable, when a Python list
is extended (e.g., mylist.extend()
or mylist += anotherlist
), the id of the list does not change.
I understand that (at least in CPython) lists are contiguous in memory (and the id happens to be the address of the head of the list). What if the memory after the list is already highly fragmented and the list extension cannot be allocated (even though there's plenty of free space, albeit discontiguous in that area)? Does the allocation fail? How is that mitigated?
Upvotes: 5
Views: 421
Reputation: 44434
This is implementation specific, but I assume you are asking about CPython.
As you say, lists are contiguous memory, but these are C pointers to PyObjects (*PyObject
). The objects themselves, be they integers or objects of your own class, are allocated somewhere in dynamic memory but (probably) not in contiguous memory.
When a list is first allocated then more memory is obtained than is required, leaving "slack" entries on (conceptually) the right-hand side. A list.append()
will just use the next slack entry. Of course eventually they are all full.
At that point a reallocation of memory is performed. In C terms that is a call to realloc()
, but the implementation might not actually use the C run-time heap for this. Basically, a larger memory allocation is obtained from dynamic memory, then all the elements of the list are copied to the new list. Remember though that we are copying pointers to Python objects, not the objects themselves.
There is an overhead in doing this, but the use of the slack entries reduces it. You can see from this that list.append()
is more efficient than adding an entry on the "left" of the list.
If the allocation fails because there is insufficient memory for a new list, then you will get an out of memory error. But if you are that short of memory then any new Python object could cause this.
Heap fragmentation can be an issue, but Python does its best to manage this, which is one reason why Python does some heap management of its own. In recent Windows implementations the low-level virtual memory APIs are used instead of the C RTL so Python can perform its own memory management.
Upvotes: 1
Reputation: 70735
In CPython, this is a difference in the way lists and tuples are allocated. For a list, the object contains a pointer to the memory allocated for the list contents. The list object proper is small, and never needs to move; the address of the vector it points to may change any number of times.
For a tuple object, which is expected to be small most times, the memory for the tuple content is indeed allocated directly in the tuple object. But tuples can't be resized, so your scenario can't arise in that case.
Upvotes: 2