Reputation: 159
I have noticed something with the "identity", i.e., value returned by id()
, of slices of certain sequence types that I simply can't wrap my head around. I see it with lists and strings, which makes me think it is related to the implementation of either sequences or slices in CPython.
As covered in 3. Data model - Python 3.11.0 documentation:
CPython implementation detail: For CPython, id(x) is the memory address where x is stored.
>>> l = list(range(10))
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
>>> l[3]
3
>>> l[7]
7
>>>
>>> id(l)
1931973192256
>>> id(l[3])
1931941276016
>>> id(l[7])
1931941276144
So far nothing unusual in that I expect to see different memory addresses for the list object than individual elements returned from the list. However, the memory addresses don't make sense to me when looking at slices of the list:
>>> l[2:]
[2, 3, 4, 5, 6, 7, 8, 9]
>>> l[5:]
[5, 6, 7, 8, 9]
>>> l[3:9]
[3, 4, 5, 6, 7, 8]
>>> l[:-6]
[0, 1, 2, 3]
>>>
>>> id(l[2:])
2813785568448
>>> id(l[5:])
2813784049664
>>> id(l[3:9])
2813784049664
>>> id(l[:-6])
2813784049664
>>> id(l[2:])
2813784049664
After the first slice of the list, id()
is returning the same value regardless of what the slice looks like afterwards.
My question, what exactly is id()
returning the memory address for when a list is sliced? And a follow-up, why is the first and second slice identity different but slices after are the same, including the first slice?
Upvotes: 0
Views: 90
Reputation: 24797
This behaviour is because of the Cpython optimisation. Lets start from
>>> l[2:]
Slicing a list always creates a new list. Since allocating and deallocating a list object is an expensive operation, python interpreter keep track of free_list
which is an array of pointers to PyListObject
.
/* Empty list reuse scheme to save calls to malloc and free */
#ifndef PyList_MAXFREELIST
# define PyList_MAXFREELIST 80
#endif
struct _Py_list_state {
#if PyList_MAXFREELIST > 0
PyListObject *free_list[PyList_MAXFREELIST];
int numfree;
#endif
};
So when you request for a new list(Remember slicing also create a new list) it's served from this free list(if free list is available).
if (PyList_MAXFREELIST && state->numfree) {
state->numfree--;
op = state->free_list[state->numfree];
OBJECT_STAT_INC(from_freelist);
_Py_NewReference((PyObject *)op);
}
Since you are throwing away the list object immediately the python interpreter invokes list_dealloc
which will restore this free_list
.
if (state->numfree < PyList_MAXFREELIST && PyList_CheckExact(op)) {
state->free_list[state->numfree++] = op;
OBJECT_STAT_INC(to_freelist);
}
When you do the slicing next time you are essentially getting reference to the same list object from free_list
.
Upvotes: 0