Reputation: 28219
I am a little confused about how shallow copy works. My understanding is that when we do new_obj = copy.copy(mutable_obj), a new object is created whose elements still point to the elements of the old object.
Here is an example of where I am confused:
## assignment
i = [1, 2, 3]
j = i
id(i[0]) == id(j[0]) # True
i[0] = 10
i # [10, 2, 3]
j # [10, 2, 3]
## shallow copy
import copy
k = copy.copy(i)
k # [10, 2, 3]
id(i) == id(k) # False (as these are two separate objects)
id(i[0]) == id(k[0]) # True (as they reference the same location, right?)
i[0] = 100
id(i[0]) == id(k[0]) # False (why did that value in that loc change?)
id(i[:]) == id(k[:]) # True (why is this still true if an element just changed?)
i # [100, 2, 3]
k # [10, 2, 3]
In a shallow copy, isn't k[0] just pointing to i[0], similar to assignment? Shouldn't k[0] change when i[0] changes?
I expect these to be the same because of this:
i = [1, 2, [3]]
k = copy.copy(i)
i # [1, 2, [3]]
k # [1, 2, [3]]
i[2].append(4)
i # [1, 2, [3, 4]]
k # [1, 2, [3, 4]]
id(i[0]) == id(k[0]) # True
id(i[2]) == id(k[2]) # True
id(i[:]) == id(k[:]) # True
Upvotes: 4
Views: 1043
Reputation: 678
j = i is an assignment: both j and i point to the same list object.
k = copy.copy(i) is a shallow copy: a new list object is created and the references it holds are copied, but the objects those references point to are not copied.
After i[0] = 100, the reference in list i points to a new int object with value 100, but the reference in k still points to the old int object with value 10.
Upvotes: 0
Reputation: 13747
id(i) == id(k) # False (as these are two separate objects)
Correct.
id(i[0]) == id(k[0]) # True (as they reference the same location, right?)
Correct.
i[0] = 100
id(i[0]) == id (k[0]) # False (why did that value in that loc change?)
It changed because you changed it in the previous line. i[0] was pointing to 10, but you changed it to point to 100. Therefore, i[0] and k[0] no longer point to the same spot.
Pointers (references) are one-way. 10 does not know what is pointing to it. Neither does 100. They are just objects in memory. So if you change where i's first element is pointing, k doesn't care (since k and i are not the same list). k's first element is still pointing to what it always was pointing to.
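To check this directly, here is a minimal sketch that uses `is` (identity comparison) instead of comparing `id` values by hand. The names `same_before` and `same_after` are only for illustration; the lists mirror the ones in the question.

```python
import copy

i = [10, 2, 3]
k = copy.copy(i)             # new list holding the same element references

same_before = i[0] is k[0]   # both slots refer to the same int object

i[0] = 100                   # rebinds slot 0 of i only; k is a separate list

same_after = i[0] is k[0]    # the two slots now refer to different objects
print(same_before, same_after, k)
```

Rebinding a slot in `i` never touches `k`, because `k` holds its own set of references.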
id(i[:]) == id (k[:]) # True (why is this still true if an element just changed?)
This one's a bit more subtle, but note that:
>>> id([1,2,3,4,5]) == id([1,2,3])
True
whereas
>>> x = [1,2,3,4,5]
>>> y = [1,2,3]
>>> id(x) == id(y)
False
It has to do with some subtleties of garbage collection and id, and it's answered in depth here: Unnamed Python objects have the same id.
Long story short, when you say id([1,2,3,4,5]) == id([1,2,3]), the first thing that happens is we create [1,2,3,4,5]. Then we grab where it is in memory with the call to id. However, [1,2,3,4,5] is anonymous, so nothing references it once id returns and CPython reclaims it immediately. Then we create another anonymous object, [1,2,3], and CPython happens to decide that it should go in the spot that was just freed. [1,2,3] is also immediately deleted and cleaned up. If you store the references, though, both objects stay alive at the same time, and then the ids are different. The same applies to your example: i[:] and k[:] each build a new temporary list, so comparing their ids says nothing about the elements.
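A short sketch of the difference, assuming CPython. The names `i_copy` and `k_copy` are made up for illustration. The comparison on temporaries depends on how the interpreter reuses memory, so no result is claimed for it; the comparison on named lists is guaranteed, because two objects that are alive at the same time never share an id.

```python
import copy

i = [100, 2, 3]
k = copy.copy(i)

# Temporaries: each slice is freed right after id() returns, so the
# second list may be placed at the first one's address (implementation detail).
print(id(i[:]) == id(k[:]))

# Named references keep both slice copies alive at the same time.
i_copy = i[:]
k_copy = k[:]
distinct = id(i_copy) != id(k_copy)
print(distinct)
```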
The rebinding behavior described earlier also applies to mutable elements if you reassign them. Here's an example:
>>> import copy
>>> a = [ [1,2,3], [4,5,6], [7,8,9] ]
>>> b = copy.copy(a)
>>> a[0].append(123)
>>> b[0]
[1, 2, 3, 123]
>>> a
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
>>> b
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
>>> a[0] = [123]
>>> b[0]
[1, 2, 3, 123]
>>> a
[[123], [4, 5, 6], [7, 8, 9]]
>>> b
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
The difference is that when you say a[0].append(123), we're modifying whatever a[0] is pointing to. It happens to be the case that b[0] is pointing to the same object (a[0] and b[0] are references to the same object).
But if you point a[0] to a new object (through assignment, as in a[0] = [123]), then b[0] and a[0] no longer point to the same place.
Upvotes: 3
Reputation: 896
In Python all things are objects. This includes integers. All lists only hold references to objects. Replacing an element of the list doesn't mean that the element itself changes.
Consider a different example:
class MyInt:
    def __init__(self, v):
        self.v = v
    def __repr__(self):
        return str(self.v)
>>> i = [MyInt(1), MyInt(2), MyInt(3)]
>>> i
[1, 2, 3]
>>> j = i[:] # This achieves the same as copy.copy(i)
>>> j
[1, 2, 3]
>>> j[0].v = 7
>>> j
[7, 2, 3]
>>> i
[7, 2, 3]
>>> i[0] = MyInt(1)
>>> i
[1, 2, 3]
>>> j
[7, 2, 3]
I am creating a class MyInt here which just holds an int. By modifying an instance of the class, both lists "change". However, once I replace a list entry, the lists are different.
The same happens with integers. You just can't modify them.
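To make the integer case concrete, here is a small sketch with the plain ints from the question. Since an int cannot be modified in place, `+=` on a list slot builds a new int and rebinds that slot, so the shallow copy is unaffected.

```python
import copy

i = [1, 2, 3]
k = copy.copy(i)

i[0] += 6        # int is immutable: a new int is created and i[0] is rebound

print(i)         # [7, 2, 3]
print(k)         # [1, 2, 3]
```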
Upvotes: 2