Reputation: 8593
This question is related to, but not a duplicate of, this, this, this, and this. Those links don't answer my question. This one almost does, but not quite: the code in its answer doesn't run in Python 3.6, and in any case that question isn't specifically about what I'm asking here. (See my own answer below.)
From the Python documentation page, I find the following text.
__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.
But why? Why can't we just override __init__() instead of having to override __new__()? Apparently, frozenset, for example, doesn't even implement __init__(); why is that? I understand from here that in some rare cases, __new__() and __init__() are required to do different things, but as far as I can see that's only during pickling and unpickling. What is it about immutable types in particular that requires the use of __new__() instead of __init__()?
Upvotes: 5
Views: 953
Reputation: 51142
It's not really about mutability, at least not directly. You can create an immutable class by overriding __setattr__ without __new__:
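class Immutable:
    def __init__(self, x):
        self.x = x
        # Once this flag exists, every further assignment is rejected.
        self._frozen = True

    def __setattr__(self, key, value):
        # _frozen is only present after __init__ has finished its setup.
        if hasattr(self, '_frozen'):
            raise AttributeError(key)
        super().__setattr__(key, value)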
The reason int, str and tuple override __new__ is really so that the constructors don't necessarily create new instances. Observe in the REPL:
>>> x = 5
>>> int(x) is x
True
>>> s = 'hello'
>>> str(s) is s
True
>>> t = (1, 2)
>>> tuple(t) is t
True
That is, overriding __new__ allows the class to return something other than a new instance. This could be the argument itself if it is already an instance of the class, or it could be a cached object (like the cache for small int values) to avoid allocating multiple copies of the "same" value:
>>> x = 23
>>> int('23') is x
True
You cannot achieve the same with just __init__ and without __new__: the __init__ method is only called after a new instance has already been created.
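To illustrate the point, here is a minimal sketch of a user-defined class that reuses instances through __new__. The names Celsius, NoCache and _cache are hypothetical and chosen only for this example; this is not how int or str are implemented internally.

```python
class Celsius:
    """Hypothetical value class that reuses instances via __new__."""

    _cache = {}  # maps degrees -> the one shared instance for that value

    def __new__(cls, degrees):
        # An existing instance is handed straight back, like tuple(t) is t.
        if isinstance(degrees, cls):
            return degrees
        # Otherwise reuse a cached instance if one exists for this value.
        try:
            return cls._cache[degrees]
        except KeyError:
            instance = super().__new__(cls)
            instance.degrees = degrees
            cls._cache[degrees] = instance
            return instance


class NoCache:
    """Same idea with only __init__: the instance already exists by then."""

    def __init__(self, degrees):
        self.degrees = degrees


a = Celsius(20)
b = Celsius(20)
print(a is b)                       # True: the cached object was returned
print(Celsius(a) is a)              # True: the argument itself was returned
print(NoCache(20) is NoCache(20))   # False: each call allocates a new object
```

By the time NoCache.__init__ runs, a fresh object has already been allocated, so it has no way to hand back a different one.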
The reason this is related to immutability is that it only really makes sense to reuse equal values like this when they are immutable. For example, it is often necessary to make a copy of a list so that you have a "fresh" object which you can safely mutate without mutating a list somewhere else in the program; in contrast, nobody ever needs a "fresh" int or str because you can't mutate them anyway, let alone unsafely mutate them.
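A small sketch of that contrast, using only built-ins. The identity result for tuple is CPython behaviour rather than a language guarantee, so treat the last line as an observation about that implementation.

```python
original = [1, 2, 3]
copy = list(original)    # list() always builds a new list object
copy.append(4)           # mutating the copy leaves the original alone

print(original)          # [1, 2, 3]
print(copy is original)  # False

t = (1, 2, 3)
print(tuple(t) is t)     # True on CPython: no fresh tuple is needed
```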
Upvotes: 2
Reputation: 8593
I'm the question OP and I'm going to answer my own question because I think I found out the answer half-way through typing it. I'm not going to mark it as correct until others have confirmed it to be correct.
This question here is particularly relevant, but it isn't the same as this question. Its answer was very enlightening (though the comments turned into enlightening but esoteric arguments about C and Python and "pythonic"), but it should be set out more clearly here to specifically address this question. I hope this will help future readers. The code in this answer has been verified in Python 3.6.1.
The thing about an immutable object is that you don't want to set its members once it's been created, obviously. The way you do that in Python is to override the __setattr__() special method to raise an error (AttributeError), so that people can't do things like my_immutable_object.x = 3. Take the following custom immutable class for example.
class Immutable(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __setattr__(self, key, value):
        raise AttributeError("LOL nope.")
Let's try using it.
im = Immutable(2, 3)
print(im.a, im.b, sep=", ")
Output:
AttributeError: LOL nope.
"But what!?", I hear you ask, "I didn't set any of its attributes after it's been created!" Ah but yes you did, in the __init__(). Since __init__() is called after the object is created, the lines self.a = a and self.b = b are setting the attributes a and b after the creation of im. What you really want is to set the attributes a and b before the immutable object is created. An obvious way to do that is to create a mutable type first (whose attributes you are allowed to set in __init__()), and then make the immutable type a subclass of it, and make sure you implement the __new__() method of the immutable child class to construct a mutable version first, and then make it immutable, like the following.
class Mutable(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b


class ActuallyImmutable(Mutable):
    def __new__(cls, a, b):
        thing = Mutable(a, b)
        thing.__class__ = cls
        return thing

    def __setattr__(self, key, value):
        raise AttributeError("LOL nope srsly.")
Now let's try running it.
im = ActuallyImmutable(2, 3)
print(im.a, im.b, sep=", ")
Output:
AttributeError: LOL nope srsly.
"WTF!? When did __setattr__() get called this time?" The thing is, ActuallyImmutable is a subclass of Mutable, and because it doesn't define its own __init__(), the parent class's __init__() is automatically called after __new__() returns the ActuallyImmutable object. So in total the parent's __init__() is called twice: once before the creation of im (which is OK) and once after (which is not OK). So let's try again, this time overriding ActuallyImmutable.__init__().
class Mutable(object):
    def __init__(self, a, b):
        print("Mutable.__init__() called.")
        self.a = a
        self.b = b


class ActuallyImmutable(Mutable):
    def __new__(cls, a, b):
        thing = Mutable(a, b)
        thing.__class__ = cls
        return thing

    # noinspection PyMissingConstructor
    def __init__(self, *args, **kwargs):
        # Do nothing, to prevent it from calling parent's __init__().
        pass

    def __setattr__(self, key, value):
        raise AttributeError("LOL nope srsly.")
Now it should work.
im = ActuallyImmutable(2, 3)
print(im.a, im.b, sep=", ")
Output:
Mutable.__init__() called.
2, 3
Good, it worked. Oh, don't worry about the # noinspection PyMissingConstructor comment; that's just a PyCharm hack to stop PyCharm from complaining that I didn't call the parent's __init__(), which obviously is what we intend here. And finally, just to check that im really is immutable, verify that im.a = 42 will give you AttributeError: LOL nope srsly.
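The check described above can be sketched as follows. The classes are repeated (without the debugging print) so the snippet runs on its own; the variables message and before are introduced only for this illustration. The last two lines are an added caveat, not part of the original answer: ordinary assignment is blocked, but object.__setattr__ still bypasses the override, so this kind of immutability is a convention rather than a hard guarantee.

```python
class Mutable(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b


class ActuallyImmutable(Mutable):
    def __new__(cls, a, b):
        # Build a mutable object first, then switch its class.
        thing = Mutable(a, b)
        thing.__class__ = cls
        return thing

    def __init__(self, *args, **kwargs):
        # Deliberately empty so Mutable.__init__ does not run a second time.
        pass

    def __setattr__(self, key, value):
        raise AttributeError("LOL nope srsly.")


im = ActuallyImmutable(2, 3)

try:
    im.a = 42                 # ordinary assignment goes through __setattr__
except AttributeError as exc:
    message = str(exc)
else:
    message = None

before = (im.a, im.b)
print(message)                # LOL nope srsly.
print(before)                 # (2, 3)

# Caveat: calling the base implementation directly skips the override.
object.__setattr__(im, "a", 42)
print(im.a)                   # 42
```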
Upvotes: 11