Reputation: 1889
(if you want to skip some 2 A.M. Python science and cut to the chase, my question is summed up at the very end)
Consider the following:
1: animals = ['cat', 'cow', 'donkey', 'horse'] # we start with a list
2: animals_reference = animals # make another reference and assign it to animals
3: cat = animals[0] # refer cat to first element of animals
4: assert cat is animals[0] # no copy occurred, still same object
5: animals[0] = animals[0].capitalize() # change first element of list
6: assert cat is not animals[0] # animals[0] now refers to another object
7: assert animals_reference is animals # animals still points to the same object as before
My understanding was that the underlying structure of a Python list was a C array (with lots of dynamic stuff going on, but still, at the end of the day, a C array.)
What's confusing me is this: we set cat to refer to the first element of the list (3). In C, that would mean pointing it at the address of the first element of the array.
We then modify the first element of the list (5).
But after doing that, cat no longer refers to that object (6). However, the list reference hasn't changed either, since in (7) we see that it has pointed to the same object since the beginning.
This is messing with my mind because it suggests that cat now refers to something else, even though it was never reassigned.
So I performed the following experiment:
cat = animals[0] # refer cat to first element of animals
assert cat is animals[0] # no copy occurred, still same object
print("id of cat: {}".format(hex(id(cat))))
print("id of animals[0]: {}".format(hex(id(animals[0]))))
print("id of animals[]: {}".format(hex(id(animals))))
print("capitalizing animals[0]...")
animals[0] = animals[0].capitalize()
print("-id of cat: {}".format(hex(id(cat))))
print("-id of animals[0]: {}".format(hex(id(animals[0]))))
print("-id of animals[]: {}".format(hex(id(animals))))
With the output:
id of cat: 0xffdda580
id of animals[0]: 0xffdda580
id of animals[]: 0xffddc828
capitalizing animals[0]...
-id of cat: 0xffdda580 # stayed the same!
-id of animals[0]: 0xffe12d40 # changed!!
-id of animals[]: 0xffddc828
This has led me to believe that Python lists are not necessarily contiguous blocks of memory, and that changing an element simply makes it point somewhere else in memory. I mean, the address of the first element of the array is earlier in memory than the address of the array itself!
What exactly is the underlying structure that lists use that explains what I saw?
Upvotes: 2
Views: 133
Reputation: 251378
Here's one way to think about it:
1: animals = ['cat', 'cow', 'donkey', 'horse'] # we start with a list
2: animals_reference = animals # make another reference and assign it to animals
3: cat = animals[0] # refer cat to first element of animals
This does not make cat refer to "the first element of animals", at least not in the way you seem to mean. It makes cat refer to whatever the first element of animals refers to. In this case, that is the string "cat". In other words, the expression animals[0] is itself a reference to an object. That object is the string "cat". When you do animals[0], you get the object that the expression animals[0] refers to. When you do cat = animals[0], you set cat to refer to that object.
There is no way to avoid "dereferencing" the value animals[0]. That is, there is no way to say "give me the pointingness of animals[0] so that when animals[0] starts pointing at something else, my new variable will also point at that something else". You can only get what animals[0] refers to, not its referring-ness itself.
Thus:
4: assert cat is animals[0] # no copy occurred, still same object
5: animals[0] = animals[0].capitalize() # change first element of list
Here you change what animals[0] points at. But you set cat to be what animals[0] used to point at. So now cat and animals[0] point at different things. The string "cat" did not change (which is why the id of cat stays the same in your experiment); it's just that animals[0] stopped pointing at that string and started pointing at the string "Cat" instead.
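A minimal sketch of that check, reusing the names from the question (original_id is a name introduced here purely for illustration):

```python
animals = ['cat', 'cow', 'donkey', 'horse']
cat = animals[0]                       # bind cat to the object animals[0] refers to
original_id = id(cat)                  # illustrative name, not from the question

animals[0] = animals[0].capitalize()   # rebind slot 0 to a newly created string

assert cat == 'cat'                    # the original string object is untouched
assert id(cat) == original_id          # cat still refers to that same object
assert animals[0] == 'Cat'             # slot 0 now refers to a different string
assert cat is not animals[0]
```

None of the assertions fail: the assignment in the middle only changes what slot 0 refers to, never the object that cat is bound to.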
Upvotes: 2
Reputation: 2717
In Python, everything is an object, and in CPython every object is allocated on the heap.
When you define
animals = ['cat', 'cow', 'donkey', 'horse']
each of the strings ('cat', ...) is stored in the heap. The list animals holds references to each of those strings.
Assigning cat = animals[0] makes cat hold a reference to the string 'cat' (the same reference held by animals[0]).
Assigning animals[0] = animals[0].capitalize() creates a new string ('Cat') and changes the reference held by animals[0] to the new string. However, cat still holds a reference to the original object in the heap.
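It may help to contrast rebinding a list slot with mutating the object a slot refers to. Strings are immutable, so capitalize() has to build a new string; with a mutable element the two cases look different. This is only a sketch, and the names row, table and alias are made up for illustration:

```python
row = [1, 2]
table = [row]            # table[0] holds a reference to the same list as row
alias = table[0]         # alias refers to that list object too

alias.append(3)          # mutates the shared object in place
assert table[0] == [1, 2, 3]   # visible through every reference to it
assert alias is table[0]

table[0] = [9]           # rebinds slot 0 to a brand-new list
assert alias == [1, 2, 3]      # alias still refers to the old object
assert alias is not table[0]
```

The capitalize() assignment in the question is the second case: the slot is rebound and the old object is left alone.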
Upvotes: 2
Reputation: 90899
Taking your C example itself:
Let's assume A is an array, and the array contains objects (not primitives).
Let's say A is stored sequentially at addresses 1000, 1008, 1016, etc.; that is, the reference to the first element is stored at address 1000. Let's say the first element itself is stored at address 2000.
When you do
cat = A[0]
you do not get the address where the reference to the first element of A is stored; instead, you get the reference to the first element itself. That is, you do not get 1000, you get 2000.
Now, if you change the element stored in A[0] to, say, an object at address 3000, you make the reference stored at address 1000 change to 3000. Do you think cat's reference would change?
Upvotes: -1
Reputation: 102862
The id of cat doesn't change because cat refers to the object that was the first element of the original animals list. When that element is changed (by being capitalized), the id of animals[0] changes, but the id of cat does not, because cat is a reference to the original animals[0], which is a reference to a string object containing the letters cat, not the current animals[0], which is now a reference to another string object containing the letters Cat.
The list animals still exists, and has been modified in-place, so its id does not change. Since lists are mutable, a list can't simply be one fixed block of memory holding the objects themselves. In CPython, the list object holds a contiguous, resizable array of references; the objects those references point to live elsewhere in memory, and the array is reallocated when the list outgrows it.
A Python list is a dynamic object containing references to other objects (or nothing, if it's empty). If it was just a static C array, there would be no point in using Python, you'd just use C. The point of Python lists (well, one of them, anyway) is that they're mutable - dynamic, changeable, reorderable, stretchable, collapsible, sortable, etc.
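A short sketch of both points. The second half relies on sys.getsizeof, which reports the size of the list object only; the comparison assumes CPython and may not hold on other implementations. The names list_id, small and big are introduced here for illustration:

```python
import sys

animals = ['cat', 'cow', 'donkey', 'horse']
list_id = id(animals)

animals[0] = animals[0].capitalize()   # rebind an element
animals.append('zebra')                # grow the list
animals.sort()                         # reorder it in place

assert id(animals) == list_id          # same list object throughout

# CPython assumption: a list stores references, so a one-element list
# takes the same space no matter how large the referenced object is.
small = 'x'
big = 'x' * 1_000_000
assert sys.getsizeof([small]) == sys.getsizeof([big])
assert sys.getsizeof(big) > sys.getsizeof(small)
```

The identity of the list survives every in-place operation, and the size of the list depends on how many references it holds rather than on the objects themselves.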
Upvotes: -1