Reputation: 43136
Since this question is about inheritance and super
, let's begin by writing a class. Here's a simple everyday class that represents a person:
class Person:
def __init__(self, name):
super().__init__()
self.name = name
Like every good class should, it calls its parent constructor before initializing itself. And this class does its job perfectly well; it can be used with no problems:
>>> Person('Tom')
<__main__.Person object at 0x7f34eb54bf60>
But when I try to make a class that inherits from both Person
and another class, things suddenly go wrong:
class Horse:
def __init__(self, fur_color):
super().__init__()
self.fur_color = fur_color
class Centaur(Person, Horse):
def __init__(self, name, fur_color):
# ??? now what?
super().__init__(name) # throws TypeError: __init__() missing 1 required positional argument: 'fur_color'
Person.__init__(self, name) # throws the same error
Because of diamond inheritance (with the object
class at the top), it's not possible to initialize Centaur
instances correctly. The super().__init__()
in Person
ends up calling Horse.__init__
, which throws an exception because the fur_color
argument is missing.
But this problem wouldn't exist if Person
and Horse
didn't call super().__init__()
.
This raises the question: Should classes that inherit directly from object
call super().__init__()
? If yes, how would you correctly initialize Centaur
?
Disclaimer: I know what super
does, how the MRO works, and how super
interacts with multiple inheritance. I understand what's causing this error. I just don't know what the correct way to avoid the error is.
Why am I asking specifically about object
even though diamond inheritance can occur with other classes as well? That's because object
has a special place in python's type hierarchy - it sits at the top of your MRO whether you like it or not. Usually diamond inheritance happens only when you deliberately inherit from a certain base class for the goal of achieving a certain goal related to that class. In that case, diamond inheritance is to be expected. But if the class at the top of the diamond is object
, chances are that your two parent classes are completely unrelated and have two completely different interfaces, so there's a higher chance of things going wrong.
Upvotes: 18
Views: 3430
Reputation: 362597
Should classes that inherit directly from
object
callsuper().__init__()
?
You seem to search a simple "yes or no" answer for this question, but unfortunately the answer is "it depends". Furthermore, when deciding if you should call super().__init__()
, it is somewhat irrelevant whether or not a class inherits directly from object
. What is invariant is that, if object.__init__
is called, it should be called without arguments - since object.__init__
does not accept arguments.
Practically, in cooperative inheritance situations, this means you must ensure that all arguments are consumed before object.__init__
gets invoked. It does not mean you should try to avoid object.__init__
being invoked. Here is an example of consuming args before invoking super
, the response
and request
context has been popped out of the mutable mapping kwargs
.
I mentioned earlier that whether or not a class inherits directly from object
is a red herring1. But I didn't mention yet what should motivate this design decision: You should call super init [read: super anymethod
] if you want the MRO to continue to be searched for other initializers [read: other anymethod
s]. You should not invoke super if you want to indicate the MRO search should be stopped here.
Why does object.__init__
exist at all, if it doesn't do anything? Because it does do something: ensures it was called without arguments. The presence of arguments likely indicates a bug2. object
also serves the purpose of stopping the chain of super calls - somebody has to not call super, otherwise we recurse infinitely. You can stop it explicitly yourself, earlier, by not invoking super. If you don't, object
will serve as the final link and stop the chain for you.
Class MRO is determined at compile time, which is generally when a class is defined / when the module is imported. However, note that the use of super
involves many chances for runtime branching. You have to consider:
super
is called with (i.e. which arguments you want to forward along the MRO)super
itself is created with (there is an advanced use case described below)__init__
, but don't forget that super can be used with any other method, tooIn rare circumstances, you might conditionally invoke a super
call. You might check whether your super()
instance has this or that attribute, and base some logic around the result. Or, you might invoke super(OtherClass, self)
to explicitly "step over" a link and manually traverse the MRO for this section. Yes, if the default behaviour is not what you wanted, you can hijack the MRO! What all these diabolical ideas have in common is an understanding of the C3 linearization algorithm, how Python makes an MRO, and how super itself uses the MRO. Python's implementation was more or less lifted from another programming language, where super was named next-method
. Honestly super
is a super-bad name in Python because it causes a common misconception amongst beginners that you're always invoking "up" to one of the parent classes, I wish they had chosen a better name.
When defining an inheritance hierarchy, the interpreter can not know whether you wanted to reuse some other classes existing functionality or to replace it with an alternate implementation, or something else. Either decision could be a valid and practical design. If there was a hard and fast rule about when and how super
should be invoked, it would not be left to the programmer to choose - the language would take the decision out of your hands and just do the right thing automatically. I hope that sufficiently explains that invoking super in __init__
is not a simple yes/no question.
If yes, how would you correctly initialize
SuperFoo
?
(Source for Foo
, SuperFoo
etc in this revision of the question)
For the purposes of answering this part, I will assume the __init__
methods shown in the MCVE actually need to do some initialization (perhaps you could add placeholder comments in the question's MCVE code to that effect). Don't define an __init__
at all if the only you do is call super with same arguments, there's no point. Don't define an __init__
that's just pass
, unless you intentionally mean to halt the MRO traversal there (in which case a comment is certainly warranted!).
Firstly, before we discuss the SuperFoo
, let me say that NoSuperFoo
looks like an incomplete or bad design. How do you pass the foo
argument to Foo
's init? The foo
init value of 3
was hardcoded. It might be OK to hardcode (or otherwise automatically determine) foo's init value, but then you should probably be doing composition not inheritance.
As for SuperFoo
, it inherits SuperCls
and Foo
. SuperCls
looks intended for inheritance, Foo
does not. That means you may have some work to do, as pointed out in super harmful. One way forward, as discussed in Raymond's blog, is writing adapters.
class FooAdapter:
def __init__(self, **kwargs):
foo_arg = kwargs.pop('foo')
# can also use kwargs['foo'] if you want to leave the responsibility to remove 'foo' to someone else
# can also use kwargs.pop('foo', 'foo-default') if you want to make this an optional argument
# can also use kwargs.get('foo', 'foo-default') if you want both of the above
self._the_foo_instance = Foo(foo_arg)
super().__init__(**kwargs)
# add any methods, wrappers, or attribute access you need
@property
def foo():
# or however you choose to expose Foo functionality via the adapter
return self._the_foo_instance.foo
Note that FooAdapter
has a Foo
, not FooAdapter
is a Foo
. This is not the only possible design choice. However, if you are inheriting like class FooParent(Foo)
, then you're implying a FooParent
is a Foo
, and can be used in any place where a Foo
would otherwise be - it's often easier to avoid violations of LSP by using composition. SuperCls
should also cooperate by allowing **kwargs
:
class SuperCls:
def __init__(self, **kwargs):
# some other init code here
super().__init__(**kwargs)
Maybe SuperCls
is also out of your control and you have to adapt it too, so be it. The point is, this is a way to re-use code, by adjusting the interfaces so that the signatures are matching. Assuming everyone is cooperating well and consuming what they need, eventually super().__init__(**kwargs)
will proxy to object.__init__(**{})
.
Since 99% of classes I've seen don't use
**kwargs
in their constructor, does that mean 99% of python classes are implemented incorrectly?
No, because YAGNI. Do 99% of classes need to immediately support 100% general dependency-injection with all the bells and whistles, before they are useful? Are they broken if they don't? As an example, consider the OrderedCounter
recipe given in the collections docs. Counter.__init__
accepts *args
and **kwargs
, but doesn't proxy them in the super init call. If you wanted to use one of those arguments, well tough luck, you've got to override __init__
and intercept them. OrderedDict
isn't defined cooperatively at all, really, some parent calls are hardcoded to dict
- and the __init__
of anything next in line isn't invoked, so any MRO traversal would be stopped in its tracks there. If you accidentally defined it as OrderedCounter(OrderedDict, Counter)
instead of OrderedCounter(Counter, OrderedDict)
the metaclass bases would still be able to create a consistent MRO, but the class just wouldn't work at all as an ordered counter.
In spite of all these shortcomings, the OrderedCounter
recipe works as advertised, because the MRO is traversed as designed for the intended use-case. So, you don't even need to do cooperative inheritance 100% correctly in order to implement a dependency-injection. The moral of the story is that perfection is the enemy of progress (or, practicality beats purity). If you want to cram MyWhateverClass
into any crazy inheritance tree you can dream up, go ahead, but it is up to you to write the necessary scaffolding to allow that. As usual, Python will not prevent you to implement it in whatever hacky way is good enough to work.
1You're always inheriting from object, whether you wrote it in the class declaration or not. Many open source codebases will inherit from object explicitly anyway in order to be cross-compatible with 2.7 runtimes.
2This point is explained in greater detail, along with the subtle relationship between __new__
and __init__
, in CPython sources here.
Upvotes: 14
Reputation:
When using super
to call a superclass method, you would normally need the current class and the current instance as parameters:
super(Centaur, self).__init__(...)
Now, the problem comes with the way Python handles walking the superclasses. If the __init__
methods don't have matching signatures, then the calls will probably cause problems. From the linked example:
class First(object):
def __init__(self):
print "first prologue"
super(First, self).__init__()
print "first epilogue"
class Second(First):
def __init__(self):
print "second prologue"
super(Second, self).__init__()
print "second epilogue"
class Third(First):
def __init__(self):
print "third prologue"
super(Third, self).__init__()
print "third epilogue"
class Fourth(Second, Third):
def __init__(self):
super(Fourth, self).__init__()
print "that's it"
Fourth()
The output is:
$ python2 super.py
second prologue
third prologue
first prologue
first epilogue
third epilogue
second epilogue
that's it
This shows the order the constructors are called. Also note that the subclass constructors all have compatible signatures because they were written with each other in mind.
Upvotes: -1
Reputation: 280426
If Person
and Horse
were never designed to be used as base classes of the same class, then Centaur
probably shouldn't exist. Correctly designing for multiple inheritance is very hard, much more than just calling super
. Even single inheritance is pretty tricky.
If Person
and Horse
are supposed to support creation of classes like Centaur
, then Person
and Horse
(and likely the classes around them) need some redesigning. Here's a start:
class Person:
def __init__(self, *, name, **kwargs):
super().__init__(**kwargs)
self.name = name
class Horse:
def __init__(self, *, fur_color, **kwargs):
super().__init__(**kwargs)
self.fur_color = fur_color
class Centaur(Person, Horse):
pass
stevehorse = Centaur(name="Steve", fur_color="brown")
You'll notice some changes. Let's go down the list.
First, the __init__
signatures now have a *
in the middle. The *
marks the beginning of keyword-only arguments: name
and fur_color
are now keyword-only. It's nearly impossible to get positional arguments to work safely when different classes in a multiple inheritance graph take different arguments, so for safety, we require arguments by keyword. (Things would be different if multiple classes needed to use the same constructor arguments.)
Second, the __init__
signatures all take **kwargs
now. This lets Person.__init__
accept keyword arguments it doesn't understand, like fur_color
, and pass them on to down the line until they reach whatever class does understand them. By the time the parameters reach object.__init__
, object.__init__
should receive empty kwargs
.
Third, Centaur
doesn't have its own __init__
any more. With Person
and Horse
redesigned, it doesn't need an __init__
. The inherited __init__
from Person
will do the right thing with Centaur
's MRO, passing fur_color
to Horse.__init__
.
Upvotes: 11