Reputation: 6569
An object of my class has a list as its attribute. That is,
class T(object):
def __init__(self, x, y):
self.arr = [x, y]
When this object is copied, I want a separate list arr
, but a shallow copy of the content of the list (e.g. x
and y
). Therefore I decide to implement my own copy method, which will recreate the list but not the items in it. But should I call this __copy__()
or __deepcopy__()
? Which one is the right name for what I do, according to Python semantics?
My guess is __copy__()
. If I call deepcopy()
, I would expect the clone to be totally decoupled from the original. However, the documentation says:
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
It is confusing by saying "inserts copies into it" instead of "inserts deep copies into it", especially with the emphasis.
Upvotes: 7
Views: 987
Reputation: 24298
I think overwriting __copy__
is a good idea, as explained by the other answers with great detail. An alternative solution might be to write an explicit copy method to be absolutely clear on what is called:
class T(object):
def __init__(self, x, y):
self.arr = [x, y]
def copy(self):
return T(*self.arr)
When any future reader sees t.copy()
he then immediately knows that a custom copy method has been implemented. The disadvantage is of course that it might not play well with third-party libraries that use copy.copy(t)
.
Upvotes: 1
Reputation: 152667
It actually depends on the desired behaviour of your class that factor into the decision what to override (__copy__
or __deepcopy__
).
In general copy.deepcopy
works mostly correctly, it just copies everything (recursivly) so you only need to override it if there is some attribute that must not be copied (ever!).
On the other hand one should define __copy__
only if users (including yourself) wouldn't expect changes to propagate to copied instances. For example if you simply wrap a mutable type (like list
) or use mutable types as implementation detail.
Then there is also the case that the minimal set of attributes to copy isn't clearly defined. In that case I would also override __copy__
but maybe raise a TypeError
in there and possible include one (or several) dedicated public copy
methods.
However in my opinion the arr
counts as implementation detail and thus I would override __copy__
:
class T(object):
def __init__(self, x, y):
self.arr = [x, y]
def __copy__(self):
new = self.__class__(*self.arr)
# ... maybe other stuff
return new
Just to show it works as expected:
from copy import copy, deepcopy
x = T([2], [3])
y = copy(x)
x.arr is y.arr # False
x.arr[0] is y.arr[0] # True
x.arr[1] is y.arr[1] # True
x = T([2], [3])
y = deepcopy(x)
x.arr is y.arr # False
x.arr[0] is y.arr[0] # False
x.arr[1] is y.arr[1] # False
Just a quick note about expectations:
Users generally expect that you can pass an instance to the constructor to create a minimal copy (similar or identical to __copy__
) as well. For example:
lst1 = [1,2,3,4]
lst2 = list(lst1)
lst1 is lst2 # False
Some Python types have an explicit copy
method, that (if present) should do the same as __copy__
. That would allow to explicitly pass in parameters (however I haven't seen this in action yet):
lst3 = lst1.copy() # python 3.x only (probably)
lst3 is lst1 # False
If your class should be used by others you probably need to consider these points, however if you only want to make your class work with copy.copy
then just overwrite __copy__
.
Upvotes: 3
Reputation: 362786
The correct magic method for you to implement here is __copy__
.
The behaviour you've described is deeper than what the default behaviour of copy would do (i.e. for an object which hasn't bothered to implement __copy__
), but it is not deep enough to be called a deepcopy. Therefore you should implement __copy__
to get the desired behaviour.
Do not make the mistake of thinking that simply assigning another name makes a "copy":
t1 = T('google.com', 123)
t2 = t1 # this does *not* use __copy__
That just binds another name to the same instance. Rather, the __copy__
method is hooked into by a function:
import copy
t2 = copy.copy(t1)
Upvotes: 5