Reputation: 39457
In the example below, the method m on class A is called just once.
I understand this is a feature: it is the Pythonic way to solve the issue where A's m method would be called twice (if it were implemented in the naive way) in this diamond-like inheritance scenario.
This is all described here:
https://www.python-course.eu/python3_multiple_inheritance.php
(1) But under the hood, how did they achieve this behavior, i.e. that class A's m method is called ONLY once?
Asked simplistically: which line is being "skipped" during execution, line # 1 or line # 2?
Could someone shed more light on this?
I have never used multiple inheritance seriously because I mostly program in Java, so I am really curious about this scenario, and more specifically about the inner workings behind it.
Note: I just want to get the general idea of how this works in Python, not really understand every tiny detail here.
(2) What if I want (in this same scenario and for some reason) A's m method to be called twice (or N times, depending on how many base classes D has), while still going through super()? Is this possible? Does super() support such a mode of operation?
(3) Is this just some tree or DAG visiting algorithm where they keep track of which class's m method has already been visited and just don't visit it (call it) twice? If so, then simplistically speaking I guess '# 2' is the line which is skipped.
class A:
    def m(self):
        print("m of A called")

class B(A):
    def m(self):
        print("m of B called")
        super().m()  # 1

class C(A):
    def m(self):
        print("m of C called")
        super().m()  # 2

class D(B, C):
    def m(self):
        print("m of D called")
        super().m()

if __name__ == '__main__':
    x = D()
    x.m()
Upvotes: 3
Views: 370
Reputation: 13868
This has to do with the Method Resolution Order (MRO), into which the article you linked already provides some insight (this other article has more information as well):
The question arises how the super functions makes its decision. How does it decide which class has to be used? As we have already mentioned, it uses the so-called method resolution order(MRO). It is based on the C3 superclass linearisation algorithm. This is called a linearisation, because the tree structure is broken down into a linear order. The mro method can be used to create this list:
>>> from super_init import A,B,C,D
>>> D.mro()
[<class 'super_init.D'>, <class 'super_init.B'>, <class 'super_init.C'>, <class 'super_init.A'>, <class 'object'>]
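For a rough idea of how such a list is produced, here is a minimal sketch of the C3 merge in plain Python. The names c3_merge and linearize are made up for illustration, and CPython's real implementation lives in C; the sketch only aims to reproduce the same order for simple hierarchies like the one in the question.

```python
def c3_merge(sequences):
    """Merge several linearizations into one, following the C3 rule."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable only if it is not in the tail of any list.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy, no valid linearization")
        result.append(head)
        # Remove the chosen class from every list and drop emptied lists.
        sequences = [[c for c in seq if c is not head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return result


def linearize(cls):
    """Hypothetical helper: compute the MRO of cls from its bases."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([linearize(b) for b in bases] + [bases])


class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([k.__name__ for k in linearize(D)])  # ['D', 'B', 'C', 'A', 'object']
print(linearize(D) == D.mro())             # True
```

The key rule is the "head not in any tail" check: A cannot be emitted right after B because A still sits in the tail of C's linearization, so C has to come first.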
Pay attention to the MRO, which goes D > B > C > A. You may believe that super() simply calls the parent class of the current scope, but it does not. It looks through the MRO of your object's class (i.e. D.mro()), starting from the current class (i.e. B, C, ...), to determine which class is next in line to resolve the method.
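A rough way to picture that lookup is the sketch below. The helper next_in_mro is hypothetical and only approximates what super(cls, obj) does: it walks the MRO of type(obj), starts just after cls, and returns the first class that defines the attribute itself. The m methods return lists instead of printing, purely to keep the sketch short.

```python
class A:
    def m(self):
        return ["A"]

class B(A):
    def m(self):
        return ["B"] + super().m()

class C(A):
    def m(self):
        return ["C"] + super().m()

class D(B, C):
    def m(self):
        return ["D"] + super().m()


def next_in_mro(cls, obj, name):
    """Hypothetical helper: find the class super(cls, obj) would use for name."""
    mro = type(obj).__mro__
    start = mro.index(cls) + 1       # skip everything up to and including cls
    for klass in mro[start:]:
        if name in vars(klass):      # look only at the class's own attributes
            return klass
    raise AttributeError(name)


x = D()
print(next_in_mro(D, x, "m").__name__)    # B
print(next_in_mro(B, x, "m").__name__)    # C
print(next_in_mro(C, x, "m").__name__)    # A
print(next_in_mro(B, B(), "m").__name__)  # A
print(x.m())                              # ['D', 'B', 'C', 'A']
```

Note the fourth line: the very same code in B resolves to A when the object is a plain B instance, and to C only when the object is a D.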
super() actually takes two arguments, but when it is called with zero arguments inside a class, they are passed implicitly:
Also note that, aside from the zero argument form, super() is not limited to use inside methods. The two argument form specifies the arguments exactly and makes the appropriate references. The zero argument form only works inside a class definition, as the compiler fills in the necessary details to correctly retrieve the class being defined, as well as accessing the current instance for ordinary methods.
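A small sketch of that difference, using made-up class and function names: a function defined outside any class body has no implicit class to work with, so the zero argument form fails at call time, while the two argument form still works.

```python
class Base:
    def hello(self):
        return "hello from Base"

class Child(Base):
    def hello(self):
        return "hello from Child"


def outside_zero_arg(self):
    # Defined outside a class body, so the compiler has no class to fill in.
    return super().hello()

def outside_two_arg(self):
    # The two argument form names the class and the instance explicitly.
    return super(Child, self).hello()


c = Child()
print(outside_two_arg(c))  # hello from Base

try:
    outside_zero_arg(c)
except RuntimeError as exc:
    print("zero argument form failed:", type(exc).__name__)
```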
To be precise, at the point of B.m(), the super() call actually translates to:
super(B, x).m()
# because the self being passed at the time is instance of D, which is x
That call resolves within D.mro() from the B class onward, and the next class there is C, not A as you imagined. Therefore C.m() is called next, and within it super(C, x).m() resolves to A.m(), which is then called.
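One way to check this yourself, as a minimal sketch reusing the classes from the question: the explicit two argument form can be called from outside the class, and the bound method it returns wraps C.m, not A.m. The list named calls replaces the prints and is an addition for this sketch only.

```python
calls = []  # records the order in which the m methods run

class A:
    def m(self):
        calls.append("A")

class B(A):
    def m(self):
        calls.append("B")
        super().m()

class C(A):
    def m(self):
        calls.append("C")
        super().m()

class D(B, C):
    def m(self):
        calls.append("D")
        super().m()


x = D()

# super(B, x) starts the lookup after B in D's MRO, so it finds C.m.
print(super(B, x).m.__func__ is C.m)  # True
# super(C, x) starts after C, so it finds A.m.
print(super(C, x).m.__func__ is A.m)  # True

x.m()
print(calls)  # ['D', 'B', 'C', 'A']
```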
After that, control returns to the point after super() within C.m(), then back up to the point after super() within B.m(), and finally back to D.m(). This is easy to observe when you add a few more lines:
class A:
    def m(self):
        print("m of A called")

class B(A):
    def m(self):
        print("m of B called")
        print(super())
        super().m()  # resolves to C.m
        print('B.m is complete')

class C(A):
    def m(self):
        print("m of C called")
        print(super())
        super().m()  # resolves to A.m
        print('C.m is complete')

class D(B, C):
    def m(self):
        print("m of D called")
        print(super())
        super().m()  # resolves to B.m
        print('D.m is complete')

if __name__ == '__main__':
    x = D()
    x.m()
    print(D.mro())
Which results in:
m of D called
<super: <class 'D'>, <D object>>
m of B called
<super: <class 'B'>, <D object>>
m of C called
<super: <class 'C'>, <D object>>
m of A called
C.m is complete # <-- notice how C.m is completed before B.m
B.m is complete
D.m is complete
[<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>]
So in actuality, nothing is ever called twice or skipped. You just assumed that super() resolves against the parents of the class in which it is written, when it actually resolves against the MRO of the initial object's class, starting from the class in which it is written.
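As for question (2), a hedged sketch: because each class appears only once in the MRO, a chain of super().m() calls visits A at most once. If you really want A's m to run once per base class of D, one option is to bypass super() and call the base classes explicitly, as below. This is only an illustration using the same class names, not a recommended pattern, since it gives up cooperative inheritance.

```python
calls = []  # records the order in which the m methods run

class A:
    def m(self):
        calls.append("A")

class B(A):
    def m(self):
        calls.append("B")
        A.m(self)  # explicit base-class call instead of super()

class C(A):
    def m(self):
        calls.append("C")
        A.m(self)  # explicit base-class call instead of super()

class D(B, C):
    def m(self):
        calls.append("D")
        # Call every direct base of D in turn, so A.m runs once per base.
        for base in D.__bases__:
            base.m(self)


D().m()
print(calls)  # ['D', 'B', 'A', 'C', 'A']
```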
Here's another fun little example to demonstrate the MRO in more detail:
def print_cur_mro(cls, obj):
    # helper function to show current MRO, marking the current class with asterisks
    names = [f'*{m.__name__}*' if m.__name__ == cls.__name__ else m.__name__
             for m in type(obj).mro()]
    print(f"Current MRO: {' > '.join(names)}")

class X:
    def m(self):
        print('m of X called')
        print_cur_mro(X, self)
        try:
            super().a_only()  # Resolves to A.a_only if called from D(), even though A is not in X inheritance
        except AttributeError as exc:
            # Resolves to AttributeError if not called from D()
            print(type(exc), exc)
        print('X.m is complete')

class A:
    def m(self):
        print("m of A called")
        print_cur_mro(A, self)

    def a_only(self):
        print('a_only called')

class B(X):
    def m(self):
        print("m of B called")
        print_cur_mro(B, self)
        super().m()  # Resolves to X.m
        print('B.m is complete')

    def b_only(self):
        print('b_only called')

class C(A):
    def m(self):
        print("m of C called")
        print_cur_mro(C, self)
        try:
            super().b_only()  # Resolves to AttributeError if called, since A.b_only doesn't exist if from D()
        except AttributeError as exc:
            print(type(exc), exc)
        super().m()  # Resolves to A.m
        print('C.m is complete')

    def c_only(self):
        print('c_only called, calling m of C')
        C.m(self)

class D(B, C):
    def m(self):
        print("m of D called")
        print_cur_mro(D, self)
        super().c_only()  # Resolves to C.c_only, since c_only doesn't exist in B or X.
        super().m()  # Resolves to B.m
        print('D.m is complete')

if __name__ == '__main__':
    x = D()
    x.m()
    print(D.mro())
    x2 = X()
    x2.m()
    print(X.mro())
Result:
# x.m() call:
m of D called
Current MRO: *D* > B > X > C > A > object
c_only called, calling m of C
m of C called
Current MRO: D > B > X > *C* > A > object
<class 'AttributeError'> 'super' object has no attribute 'b_only'
m of A called
Current MRO: D > B > X > C > *A* > object
C.m is complete
m of B called
Current MRO: D > *B* > X > C > A > object
m of X called
Current MRO: D > B > *X* > C > A > object
a_only called
X.m is complete
B.m is complete
D.m is complete
# D.mro() call:
[<class '__main__.D'>, <class '__main__.B'>, <class '__main__.X'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>]
# x2.m() call:
m of X called
Current MRO: *X* > object
<class 'AttributeError'> 'super' object has no attribute 'a_only'
X.m is complete
# X.mro() call:
[<class '__main__.X'>, <class 'object'>]
Upvotes: 6