Reputation: 36
I am trying to run a Python script that demonstrates the difference between a shallow copy and a deep copy.
From my understanding:
Below is my program:
import copy

a = [1,2,3]
print(id(a), id(a[0]), id(a[1]))
print("Lets check shallow copy first")
a1 = copy.copy(a)
print(id(a1), id(a1[0]), id(a1[1]))
a2 = copy.deepcopy(a)
print(id(a2), id(a2[0]), id(a2[1]))
Output:
steven@steven-Inspiron-3537:~/python-learning$ ./deepshallow.py
(139854551376528, 31777112, 31777088)
Lets check shallow copy first
(139854551485040, 31777112, 31777088)
(139854551378616, 31777112, 31777088)
Upvotes: 0
Views: 236
Reputation: 77454
In this particular case, the elements of your lists are small integers, which are immutable. The shallow copy and the deep copy both end up referencing the same integer objects as the original, yet a change to the deep copy still cannot cause a change in the list it was copied from.
Here's an example with integers like yours:
In [135]: import copy
In [136]: a1 = [1, 2, 3]
In [137]: a2 = copy.copy(a1)
In [138]: a3 = copy.deepcopy(a1)
In [139]: map(id, a1)
Out[139]: [26960216, 26960192, 26960168]
In [140]: map(id, a2)
Out[140]: [26960216, 26960192, 26960168]
In [141]: map(id, a3)
Out[141]: [26960216, 26960192, 26960168]
So at this point we can see that all three lists hold the very same integer objects: the ids match entry for entry. Let's change an element in the deep copy.
In [142]: a3[0] = 1000
In [143]: map(id, a1)
Out[143]: [26960216, 26960192, 26960168]
In [144]: map(id, a2)
Out[144]: [26960216, 26960192, 26960168]
In [145]: map(id, a3)
Out[145]: [39759800, 26960192, 26960168]
Now a3 has a new id for its first entry, while the other lists are unchanged. Now let's change the first entry of the shallow copy.
In [146]: a2[0] = 1000
In [147]: map(id, a1)
Out[147]: [26960216, 26960192, 26960168]
In [148]: map(id, a2)
Out[148]: [39759200, 26960192, 26960168]
In [149]: map(id, a3)
Out[149]: [39759800, 26960192, 26960168]
Notice how the integer 1000, which serves as the first entry of both a2 and a3, has a different id value in each list.
The reason for this is that the Python runtime (CPython) caches some small integers and other immutable objects, so every place they are referenced points to the single cached value. 1000 falls outside that cache, so each of the two assignments above created its own integer object.
Here's a source describing it:
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)
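To check both effects yourself, here is a minimal sketch for Python 3 on CPython. The cache range is an implementation detail rather than a language guarantee, and the helper name same_object_when_rebuilt is made up for this illustration. The integers are parsed from strings at runtime so that the compiler cannot merge equal literals into one constant.

```python
import copy


def same_object_when_rebuilt(text):
    """Parse the same digits twice and report whether both results are one object."""
    return int(text) is int(text)


# CPython detail: ints from -5 through 256 come from a preallocated cache.
small_shared = same_object_when_rebuilt("100")
# Outside that range, each int() call normally builds a fresh object.
large_shared = same_object_when_rebuilt("1000")

# deepcopy treats int as atomic and hands back the very same object,
# cached or not: an immutable value cannot be altered through a copy.
big = int("1000")
deep_returns_same = copy.deepcopy(big) is big

print(small_shared, large_shared, deep_returns_same)
```

On a typical CPython build the first and third values are True and the second is False, which matches the ids seen in the transcript above.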
To see an example where deepcopy and copy have meaningfully different behavior, we need something where deepcopy's recursive calls to copy will make a difference -- and this won't happen for cached small integers.
Let's try with a list of lists, and we will modify the contents, rather than abruptly changing what one of the top-most list elements refers to:
In [171]: a1 = [[1], [2], [3]]
In [172]: a2 = copy.copy(a1); a3 = copy.deepcopy(a1)
In [173]: a1
Out[173]: [[1], [2], [3]]
In [174]: a2
Out[174]: [[1], [2], [3]]
In [175]: map(id, a1)
Out[175]: [140608561277264, 140608561418040, 140608561277120]
In [176]: map(id, a2)
Out[176]: [140608561277264, 140608561418040, 140608561277120]
In [177]: a2[0][0] = 1000
In [178]: a1
Out[178]: [[1000], [2], [3]]
In [179]: a2
Out[179]: [[1000], [2], [3]]
In [180]: a3
Out[180]: [[1], [2], [3]]
In [181]: a3[1][0] = 1001
In [182]: a1
Out[182]: [[1000], [2], [3]]
In [183]: a2
Out[183]: [[1000], [2], [3]]
In [184]: a3
Out[184]: [[1], [1001], [3]]
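The same experiment can be written as a plain Python 3 script, using is instead of comparing ids by eye. This is only a sketch of the session above; the variable names original, shallow and deep are chosen for the illustration.

```python
import copy

original = [[1], [2], [3]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# The shallow copy is a new outer list that holds the same inner lists.
shallow_shares_inner = [s is o for s, o in zip(shallow, original)]
# The deep copy holds newly built inner lists.
deep_shares_inner = [d is o for d, o in zip(deep, original)]

shallow[0][0] = 1000  # mutates an inner list that original also holds
deep[1][0] = 1001     # mutates an inner list that only deep holds

print(shallow_shares_inner)  # [True, True, True]
print(deep_shares_inner)     # [False, False, False]
print(original)              # [[1000], [2], [3]]
print(shallow)               # [[1000], [2], [3]]
print(deep)                  # [[1], [1001], [3]]
```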
Upvotes: 1
Reputation: 1423
What you're seeing here is that the integers are the same objects in every copy, even the deep copy. This is because integers are "immutable": there is nothing about them that a copy could protect, so deepcopy simply returns the same object, and CPython additionally reuses a single cached object for each small integer.
Try this, continuing from your script:
a2[0] = 24
assert a2[0] != a1[0]
assert a2[0] != a[0]
a[0] = 72
assert a[0] != a1[0]
You'll see that even a shallow copy is not affected. This is because all of the entries of the list are immutable. For a more appropriate example, try nesting lists or dictionaries within each other and run the same tests on them.
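As one possible version of that nested test, here is a short Python 3 sketch with a dictionary that holds a list. The dictionary contents and the names config, shallow and deep are invented for the example.

```python
import copy

config = {"name": "demo", "tags": ["a", "b"]}
shallow = copy.copy(config)
deep = copy.deepcopy(config)

# Rebinding a key in the shallow copy leaves the original untouched,
# just like assigning a new integer to a list slot.
shallow["name"] = "changed"

# Mutating the nested list through the shallow copy is visible in the
# original, because both dictionaries hold the same list object.
shallow["tags"].append("c")

print(config)  # {'name': 'demo', 'tags': ['a', 'b', 'c']}
print(deep)    # {'name': 'demo', 'tags': ['a', 'b']}
```

Only the deep copy keeps its own list, so it is the only one that still shows the original two tags.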
Upvotes: 1