bob

Reputation: 1889

Modifying a list with existing references to elements?

(if you want to skip some 2 A.M. Python science and cut to the chase, my question is summed up at the very end)

Consider the following:

animals = ['cat', 'cow', 'donkey', 'horse']  # (1) we start with a list
animals_reference = animals  # (2) bind a second name to the same list

cat = animals[0]  # (3) refer cat to first element of animals
assert cat is animals[0]  # (4) no copy occurred, still same object

animals[0] = animals[0].capitalize()  # (5) change first element of list
assert cat is not animals[0]  # (6) animals[0] now refers to another object
assert animals_reference is animals  # (7) animals still points to the same object as before

My understanding was that the underlying structure of a Python list was a C array (with lots of dynamic stuff going on, but still, at the end of the day, a C array.)

What's confusing me is this: we set cat to refer to the first element of the list (3). In C, that'd be referring it to the address of the first element of the array.

We then modify the first element of the list (5).

But after doing that, cat no longer refers to that object (6). However, the list reference hasn't changed either, since in (7) we see that it has pointed to the same object since the beginning.

This is messing with my mind because it suggests that cat now refers to something else, even though it was never reassigned.

So I performed the following experiment:

cat = animals[0]  # refer cat to first element of animals
assert cat is animals[0]  # no copy occurred, still same object
print("id of cat:        {}".format(hex(id(cat))))
print("id of animals[0]: {}".format(hex(id(animals[0]))))
print("id of animals[]:  {}".format(hex(id(animals))))

print("capitalizing animals[0]...")
animals[0] = animals[0].capitalize()
print("-id of cat:        {}".format(hex(id(cat))))
print("-id of animals[0]: {}".format(hex(id(animals[0]))))
print("-id of animals[]:  {}".format(hex(id(animals))))

With the output:

id of cat:         0xffdda580
id of animals[0]:  0xffdda580
id of animals[]:   0xffddc828
capitalizing animals[0]...
-id of cat:        0xffdda580  # stayed the same!
-id of animals[0]: 0xffe12d40  # changed!!
-id of animals[]:  0xffddc828

This has led me to believe that Python lists are not necessarily contiguous elements of memory and changes to an element will simply point to somewhere else in memory? I mean, the address of the first element of the array is earlier in memory than the address of the array itself!

What exactly is the underlying structure that lists use that explains what I saw?

Upvotes: 2

Views: 133

Answers (4)

BrenBarn

Reputation: 251378

Here's one way to think about it:

animals = ['cat', 'cow', 'donkey', 'horse']  # (1) we start with a list
animals_reference = animals  # (2) make another reference and assign it to animals

cat = animals[0]  # (3) refer cat to first element of animals

This does not make cat refer to "the first element of animals", at least not in the way you seem to mean. It makes cat refer to whatever the first element of animals refers to. In this case, that is the string "cat". In other words, the expression animals[0] is itself a reference to an object, and that object is the string "cat". When you evaluate animals[0], you get the object that animals[0] refers to. When you do cat = animals[0], you set cat to refer to that object.

There is no way to avoid "dereferencing" the value animals[0]. That is, there is no way to say "give me the pointingness of animals[0] so that when animals[0] starts pointing at something else, my new variable will also point at that something else". You can only get what animals[0] refers to, not its referring-ness itself.

Thus:

assert cat is animals[0]  # (4) no copy occurred, still same object

animals[0] = animals[0].capitalize()  # (5) change first element of list

Here you change what animals[0] points at. But you set cat to be what animals[0] used to point at. So now cat and animals[0] point at different things. The string "cat" did not change (which is why the id of cat stays the same in your experiment); it's just that animals[0] stopped pointing at that string and started pointing at the string "Cat" instead.
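The difference between rebinding a list slot and mutating the object a slot refers to can be sketched as follows. This is only an illustration; the nested list pens and the names first_pen, rebinding_visible and mutation_visible are made up for the example and do not appear in the question.

```python
animals = ['cat', 'cow', 'donkey', 'horse']
cat = animals[0]  # cat now refers to the string object 'cat'

# Rebinding the slot: animals[0] is pointed at a brand-new string.
animals[0] = animals[0].capitalize()
rebinding_visible = cat is animals[0]  # cat still refers to the old string

# Hypothetical contrast: the elements are mutable lists instead of strings.
pens = [['cat'], ['cow']]
first_pen = pens[0]  # first_pen refers to the inner list object

# Mutating the object in place: the slot is NOT rebound, so both names
# still refer to the same (now changed) inner list.
pens[0].append('kitten')
mutation_visible = first_pen is pens[0] and first_pen == ['cat', 'kitten']

print(rebinding_visible, mutation_visible)
```

Strings are immutable, so the only way to "change" animals[0] is to rebind the slot, which is why cat never sees the change.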

Upvotes: 2

Fernando Matsumoto

Reputation: 2717

In Python, everything is an object, and (in CPython at least) every object is allocated on the heap.

When you define

animals = ['cat', 'cow', 'donkey', 'horse']

each of the strings ('cat', ...) is stored in the heap. The list animals holds references to each of those strings.

Assigning cat = animals[0] makes cat hold a reference to the string 'cat' (the same reference held by animals[0]).

Assigning animals[0] = animals[0].capitalize() creates a new string ('Cat') and changes the reference held by animals[0] to the new string. However, cat still holds a reference to the original object in the heap.
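As a rough check of this description, the sketch below records the id of every element before and after the assignment and lists which slots now hold a different reference. The helper names ids_before, ids_after and changed_slots are invented for the illustration.

```python
animals = ['cat', 'cow', 'donkey', 'horse']
cat = animals[0]

# Identity of the object each slot refers to, before the assignment.
ids_before = [id(item) for item in animals]

# capitalize() returns a new string; the slot is rebound to it.
animals[0] = animals[0].capitalize()

# Identity of the object each slot refers to, after the assignment.
ids_after = [id(item) for item in animals]

# Indices whose reference changed.
changed_slots = [
    index
    for index, (before, after) in enumerate(zip(ids_before, ids_after))
    if before != after
]

print(changed_slots)                # only slot 0 was rebound
print(id(cat) == ids_before[0])     # cat still refers to the original object
```

Only the reference in slot 0 is replaced; the other slots, and the object that cat refers to, are untouched.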

Upvotes: 2

Anand S Kumar

Reputation: 90899

Taking your C example itself:

Let's assume A is an array, and the array contains objects (not primitives).

Let's say A is stored sequentially at addresses 1000, 1008, 1016, etc.; that is, the reference to the first element is stored at address 1000. Let's say the first element itself is stored at address 2000.

When you do

cat = A[0]

you do not get the address where the reference to the first element of A is stored; instead, you get the reference to the first element itself. That is, you do not get 1000, you get 2000.

Now, if you change the element stored in A[0] to, let's say, an object at address 3000, you make the reference stored at address 1000 change to 3000. Do you think cat's reference would change?
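The address picture can be played out with a toy model, using a plain dict as a stand-in for memory. Everything here is hypothetical: the memory dict, the slot_address helper and the concrete addresses are just the numbers from the example above, not how Python actually exposes memory.

```python
# Toy "memory": address -> content. Slots of A hold addresses of objects.
memory = {
    1000: 2000,   # A[0] slot: holds the address of the first element
    1008: 2008,   # A[1] slot
    2000: 'cat',  # the first element itself
    2008: 'cow',
    3000: 'Cat',  # a different object, created later
}

A_BASE = 1000   # address of the first slot of A
SLOT_SIZE = 8   # each slot holds one 8-byte reference


def slot_address(index):
    """Address of the slot A[index] (not of the object it refers to)."""
    return A_BASE + index * SLOT_SIZE


# cat = A[0]: copies the reference stored in the slot (2000), not 1000.
cat = memory[slot_address(0)]

# A[0] = <object at 3000>: overwrites the reference stored at address 1000.
memory[slot_address(0)] = 3000

cat_value = memory[cat]                      # what cat refers to
a0_value = memory[memory[slot_address(0)]]   # what A[0] refers to now

print(cat, cat_value, a0_value)
```

In this model cat keeps the value 2000, so it still leads to 'cat', while A[0] now leads to 'Cat'.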

Upvotes: -1

MattDMo

Reputation: 102862

The id of cat doesn't change because cat is a reference to the string object that animals[0] originally referred to, not to the list slot itself. When that slot is reassigned (to the capitalized string), the id of animals[0] changes, but the id of cat does not: cat still refers to the original string object containing the letters cat, while animals[0] now refers to another string object containing the letters Cat.

The list animals still exists and has been modified in place, so its id does not change. In CPython, a list is a resizable, contiguous array of references; the string objects themselves live elsewhere in memory, so assigning to animals[0] only overwrites one reference and never moves the list or its other elements.

A Python list is a dynamic object containing references to other objects (or nothing, if it's empty). If it were just a static C array, there would be no point in using Python; you'd just use C. The point of Python lists (well, one of them, anyway) is that they're mutable: dynamic, changeable, reorderable, stretchable, collapsible, sortable, etc.
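A small sketch of that mutability: the list keeps its identity through a series of in-place operations, while an operation that builds a new list yields a different object. The names list_id, same_list, rebuilt and new_list are chosen for this illustration only.

```python
animals = ['cat', 'cow', 'donkey', 'horse']
list_id = id(animals)  # identity of the list object itself

# All of these modify the existing list in place.
animals[0] = animals[0].capitalize()  # rebind one slot
animals.append('goat')                # stretch
animals.sort()                        # reorder
animals.reverse()                     # reorder again
del animals[1]                        # collapse

same_list = id(animals) == list_id    # still the very same list object

# Concatenation, by contrast, builds a new list object.
rebuilt = animals + ['sheep']
new_list = rebuilt is not animals

print(same_list, new_list, animals)
```

This matches the unchanged id of animals[] in the question's experiment: the list object stays put, and only the references it holds change.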

Upvotes: -1
