bixb0012
bixb0012

Reputation: 159

Python Identities of Sequence Slices

I have noticed something with the "identity", i.e., value returned by id(), of slices of certain sequence types that I simply can't wrap my head around. I see it with lists and strings, which makes me think it is related to the implementation of either sequences or slices in CPython.

As covered in 3. Data model - Python 3.11.0 documentation:

CPython implementation detail: For CPython, id(x) is the memory address where x is stored.


>>> l = list(range(10))
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
>>> l[3]
3
>>> l[7]
7
>>>
>>> id(l)
1931973192256
>>> id(l[3])
1931941276016
>>> id(l[7])
1931941276144

So far nothing unusual in that I expect to see different memory addresses for the list object than individual elements returned from the list. However, the memory addresses don't make sense to me when looking at slices of the list:

>>> l[2:]
[2, 3, 4, 5, 6, 7, 8, 9]
>>> l[5:]
[5, 6, 7, 8, 9]
>>> l[3:9]
[3, 4, 5, 6, 7, 8]
>>> l[:-6]
[0, 1, 2, 3]
>>>
>>> id(l[2:])
2813785568448
>>> id(l[5:])
2813784049664
>>> id(l[3:9])
2813784049664
>>> id(l[:-6])
2813784049664
>>> id(l[2:])
2813784049664

After the first slice of the list, id() is returning the same value regardless of what the slice looks like afterwards.

My question, what exactly is id() returning the memory address for when a list is sliced? And a follow-up, why is the first and second slice identity different but slices after are the same, including the first slice?

Upvotes: 0

Views: 90

Answers (1)

user459872
user459872

Reputation: 24797

This behaviour is because of the Cpython optimisation. Lets start from

>>> l[2:]

Slicing a list always creates a new list. Since allocating and deallocating a list object is an expensive operation, python interpreter keep track of free_list which is an array of pointers to PyListObject.

/* Empty list reuse scheme to save calls to malloc and free */
#ifndef PyList_MAXFREELIST
#  define PyList_MAXFREELIST 80
#endif

struct _Py_list_state {
#if PyList_MAXFREELIST > 0
    PyListObject *free_list[PyList_MAXFREELIST];
    int numfree;
#endif
};

So when you request for a new list(Remember slicing also create a new list) it's served from this free list(if free list is available).

if (PyList_MAXFREELIST && state->numfree) {
        state->numfree--;
        op = state->free_list[state->numfree];
        OBJECT_STAT_INC(from_freelist);
        _Py_NewReference((PyObject *)op);
}

Since you are throwing away the list object immediately the python interpreter invokes list_dealloc which will restore this free_list.

if (state->numfree < PyList_MAXFREELIST && PyList_CheckExact(op)) {
        state->free_list[state->numfree++] = op;
        OBJECT_STAT_INC(to_freelist);
}

When you do the slicing next time you are essentially getting reference to the same list object from free_list.

Upvotes: 0

Related Questions