lupl

Reputation: 954

Behavior of __new__ in a metaclass (also in context of inheritance)

Ok, obviously __new__ in a metaclass runs when an instance of the metaclass i.e. a class object is instantiated, so __new__ in a metaclass provides a hook to intercept events (/code that runs) at class definition time (e.g. validating/enforcing rules for class attributes such as methods etc.).

Many online examples of __new__ in a metaclass return an instance of the type constructor from __new__, which seems a bit problematic since this blocks __init__ (docs: "If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked").

While tinkering with return values of __new__ in a metaclass I came across some somewhat strange cases which I do not fully understand, e.g.:

class Meta(type):
    
    def __new__(self, name, bases, attrs):
        print("Meta __new__ running!")
        # return type(name, bases, attrs)                     # 1. 
        # return super().__new__(self, name, bases, attrs)    # 2.
        # return super().__new__(name, bases, attrs)          # 3.  
        # return super().__new__(type, name, bases, attrs)    # 4. 
        # return self(name, bases, attrs)                     # 5.

    def __init__(self, *args, **kwargs):
        print("Meta __init__ running!")
        return super().__init__(*args, **kwargs)
    
class Cls(metaclass=Meta):
    pass
  1. This is often seen in examples and generally works, but blocks __init__
  2. This works and __init__ also fires; but why pass self to a super() call? Shouldn't self/cls get passed automatically with super()?
  3. This throws a somewhat strange error I can't really make sense of: TypeError: type.__new__(X): X is not a type object (str); what is X? shouldn't self be auto-passed?
  4. The error of 3. inspired me to play with the first arg of the super() call, so I tried to pass type directly; this also blocks __init__. What is happening here?
  5. tried just for fun; this leads to a RecursionError

Also especially cases 1. and 2. appear to have quite profound implications for inheriting from classes bound to metaclasses:

class LowerCaseEnforcer(type):
    """ Allows only lower case names as class attributes! """

    def __new__(self, name, bases, attrs): 
        for attr_name in attrs:  # separate loop variable, so the class name is not shadowed
            if attr_name.lower() != attr_name:
                raise TypeError(f"Inappropriate method name: {attr_name}")
            
        # return type(name, bases, attrs)                    # 1.
        # return super().__new__(self, name, bases, attrs)   # 2.

class Super(metaclass=LowerCaseEnforcer):
    pass

class Sub(Super):

    def some_method(self):
        pass

    ## this will error in case 2 but not case 1
    def Another_method(self):
        pass
  1. expected behavior: metaclass is bound to superclass, but not to subclass
  2. binds the superclass /and/ subclasses to the metaclass; ?

I would much appreciate if someone could slowly and kindly explain what exactly is going on in the above examples! :D

Upvotes: 8

Views: 3856

Answers (2)

lupl

Reputation: 954

I think I finally figured this out (somewhat), my initial confusion can mainly be ascribed to my failure to realize that

  1. there is a difference between object.__new__ and type.__new__
  2. there is a difference between returning type() and returning super().__new__ from a metaclass

Discussion of these two points should clear up my initial example as well as the seemingly enigmatic inheritance behavior.

1. The difference between object.__new__ and type.__new__

First a few words concerning __new__. The documentation is, in my opinion, pretty clear on this, but I'd still like to add and/or emphasize some things:

  • __new__ can be understood as a special-cased static method that takes cls as its first parameter. __new__ and __init__ are invoked successively (actually by the metaclass's __call__!), which passes both of them the same remaining arguments (most often *args, **kwargs); __init__ is only invoked if __new__ returns an instance of cls.
  • object.__new__ takes a single argument (this is about calling __new__, not defining/overriding it), i.e. cls, and returns an instance of that class.
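The interplay described in the two bullets above can be sketched roughly as follows. This is a simplified emulation, not CPython's actual implementation; sketch_of_type_call, Point and NotAPoint are hypothetical names made up for illustration.

```python
def sketch_of_type_call(cls, *args, **kwargs):
    """Simplified emulation of what type.__call__(cls, *args, **kwargs) does."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs if __new__ returned an instance of cls
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


class Point:
    def __init__(self, x):
        self.x = x


class NotAPoint:
    def __new__(cls, x):
        return 42  # not an instance of cls, so __init__ is skipped

    def __init__(self, x):
        raise AssertionError("never reached")


p = sketch_of_type_call(Point, 3)      # behaves like Point(3)
n = sketch_of_type_call(NotAPoint, 3)  # behaves like NotAPoint(3)
```

Both __new__ and __init__ receive the same arguments from the caller; __new__ itself does not hand anything to __init__.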

An important thing that eluded me at first is that there is a difference between object.__new__ and type.__new__. I discovered this while I was playing with __new__'s parameters/arguments; take a look at these 'instructive errors':

class ObjNewExample:
    
    def __new__(cls, *args, **kwargs):
        # return super().__new__(cls)                      # correct 
        return super().__new__(cls, *args, **kwargs)       # instructive error

    def __init__(self, some_attr):
        self._some_attr = some_attr

one = ObjNewExample(42)

class TypeNewExample(type):
    
    def __new__(mcls, name, bases, attrs):
        # return super().__new__(mcls, name, bases, attrs)  # correct
        return super().__new__(mcls)                        # instructive error

# class Cls(metaclass=TypeNewExample):
#     pass

ObjNewExample with return super().__new__(cls, *args, **kwargs) throws something like

  • TypeError: object.__new__() takes exactly one argument (the type to instantiate)

while TypeNewExample with return super().__new__(mcls) throws

  • TypeError: type.__new__() takes exactly 3 arguments

which shows that object.__new__ and type.__new__ are quite different methods!

Also: consider the difference between parameters and arguments with __new__:

  • object.__new__ takes cls, *args, **kwargs as parameters, but requires only cls as argument (*args, **kwargs are handed to __init__ by the metaclass's __call__, not by __new__)
  • type.__new__ takes mcls, name, bases, attrs as parameters and arguments
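For contrast with the 'instructive errors', here is a small sketch of both methods called directly with the arguments they actually need. Plain, Meta and Made are hypothetical names used only for this illustration.

```python
class Plain:
    pass


class Meta(type):
    pass


# object.__new__ only needs the class to instantiate.
plain_obj = object.__new__(Plain)

# type.__new__ needs the metaclass plus name, bases and namespace.
made_cls = type.__new__(Meta, "Made", (), {})

# Leaving out name, bases and namespace fails with a TypeError.
try:
    type.__new__(Meta)
except TypeError as exc:
    missing_args_error = exc
```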

2. The difference between returning type() and returning super().__new__ from a metaclass

The main problem with the example I initially posted however is the difference between returning type() and returning super().__new__ from a metaclass's __new__ (which is embarrassingly obvious now..). (See also this discussion)

  • returning type(name, bases, attrs) from mcls: creates an instance of type
  • returning super().__new__(mcls, name, bases, attrs) from mcls: creates an instance of the actual metaclass (which is derived from type), which also explains why __init__ is inhibited in case 1 but not case 2 of the initial example! (Remember: __init__ does not get invoked if __new__ returns anything but an instance of __new__'s first parameter, i.e. (m)cls)

This should be instructive:

class Meta(type):
    def __new__(mcls, name, bases, attrs):
        
        # this creates an instance of type (the default metaclass)
        # This should be equivalent to super().__new__(type, name, bases, attrs)!
        obj1 = type(name, bases, attrs) 
        
        # this creates an instance of the actual custom metaclass (which is derived from type)
        # i.e. it creates an instance of type.__new__'s first arg
        obj2 = super().__new__(mcls, name, bases, attrs)

        print(isinstance(obj1, mcls))   # False
        print(obj1.__class__)           # <class 'type'>
        print(isinstance(obj2, mcls))   # True
        print(obj2.__class__)           # <class '__main__.Meta'>

        # return the metaclass instance so that Fun is an actual class, not None
        return obj2

class Fun(metaclass=Meta):
    pass
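The effect on __init__ can be checked with a small sketch like the one below. The two metaclasses and the calls list are hypothetical names introduced here; they only differ in what __new__ returns.

```python
calls = []  # records which metaclass __init__ methods actually ran


class ReturnsPlainType(type):
    def __new__(mcls, name, bases, attrs):
        return type(name, bases, attrs)  # an instance of type, not of mcls

    def __init__(cls, name, bases, attrs):
        calls.append(("ReturnsPlainType.__init__", name))
        super().__init__(name, bases, attrs)


class ReturnsMetaInstance(type):
    def __new__(mcls, name, bases, attrs):
        return super().__new__(mcls, name, bases, attrs)  # an instance of mcls

    def __init__(cls, name, bases, attrs):
        calls.append(("ReturnsMetaInstance.__init__", name))
        super().__init__(name, bases, attrs)


class A(metaclass=ReturnsPlainType):
    pass


class B(metaclass=ReturnsMetaInstance):
    pass
```

After both class statements have run, calls holds a single entry, the one for B.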

So quickly walking through cases 1-5 from my initial post:

1: returns a new type object, that is an instance of type, not the actual custom metaclass (derived from type), thus __init__ of the custom metaclass is inhibited; this appears to be actually equivalent to case 4!

2: as @jsbueno pointed out, this is the most likely intended ('correct') behavior: this creates an instance of the actual custom metaclass.

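3: this fails because type.__new__ expects a type object (the metaclass to be instantiated) as its first argument. Since __new__ behaves like a static method, super() does not pass self automatically, so name (a str) ends up in that position; that is the X in the error message.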

4: see case 1

5: self (probably better named 'cls' or 'mcls') is Meta; calling a class in its own constructor is obviously recursive.
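Cases 3 and 5 can be reproduced in isolation with a sketch along these lines; Case3 and Case5 are hypothetical names, and the exceptions are caught so the snippet runs to the end.

```python
class Case3(type):
    def __new__(mcls, name, bases, attrs):
        # name (a str) lands where type.__new__ expects the metaclass
        return super().__new__(name, bases, attrs)


try:
    class C3(metaclass=Case3):
        pass
except TypeError as exc:
    case3_error = str(exc)


class Case5(type):
    def __new__(mcls, name, bases, attrs):
        # calling the metaclass invokes this very __new__ again, without end
        return mcls(name, bases, attrs)


try:
    class C5(metaclass=Case5):
        pass
except RecursionError as exc:
    case5_error = exc
```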

The above also provides an explanation for the seemingly weird inheritance behavior of the second snippet from my initial posts! So why does Sub's definition of Another_method error in case 2 of LowerCaseEnforcer, but not case 1?

Because in case 1 LowerCaseEnforcer returns an instance of type (not of LowerCaseEnforcer!), so Super is of type type (its metaclass is type, not LowerCaseEnforcer)! So while LowerCaseEnforcer.__new__ fires and enforces the lowercase restriction for Super, Super is just a vanilla class of type type and Sub is derived from it (with no special effect).

Whereas in case 2 Super's metaclass is LowerCaseEnforcer and so is Sub's, so LowerCaseEnforcer.__new__ is involved in the definition of Sub.
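A side-by-side sketch of the two variants, assuming a shared helper check_lowercase and two hypothetical metaclasses EnforcerViaType (case 1) and EnforcerViaSuper (case 2):

```python
def check_lowercase(attrs):
    for attr_name in attrs:
        if attr_name.lower() != attr_name:
            raise TypeError(f"Inappropriate method name: {attr_name}")


class EnforcerViaType(type):  # case 1
    def __new__(mcls, name, bases, attrs):
        check_lowercase(attrs)
        return type(name, bases, attrs)


class EnforcerViaSuper(type):  # case 2
    def __new__(mcls, name, bases, attrs):
        check_lowercase(attrs)
        return super().__new__(mcls, name, bases, attrs)


class Super1(metaclass=EnforcerViaType):
    pass


class Sub1(Super1):  # not checked: the metaclass of Super1 is plain type
    def Another_method(self):
        pass


class Super2(metaclass=EnforcerViaSuper):
    pass


try:
    class Sub2(Super2):  # checked: Sub2 inherits the metaclass of Super2
        def Another_method(self):
            pass
except TypeError as exc:
    sub2_error = str(exc)
```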

One thing that is still a bit unclear however is the behavior of static methods in super calls (see also this discussion). E.g. why does super().__new__(cls) work? Shouldn't this be super(cls, cls).__new__(cls) (or something like that)? But I guess this is another (interesting) topic! :D
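As far as I can tell, the zero-argument super() inside __new__ is filled in as super(<enclosing class>, <first argument>), so it already is the super(cls, cls)-like form; the lookup then returns __new__ unbound, which is why cls still has to be passed. A minimal sketch, with Base and Explicit as hypothetical names:

```python
class Base:
    def __new__(cls):
        return super().__new__(cls)  # zero-argument form


class Explicit:
    def __new__(cls):
        # what the zero-argument form is assumed to expand to here
        return super(Explicit, cls).__new__(cls)


b = Base()
e = Explicit()

# The lookup through super finds object.__new__ with nothing bound to it.
found = super(Explicit, Explicit).__new__
```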

Upvotes: 3

jsbueno

Reputation: 110726

It is simpler than what you arrived at.

As you have noted, the correct thing to do is your 2 above:

return super().__new__(self, name, bases, attrs)    # 2.

Here it goes: __new__ is a special method. Although in certain documentation, even parts of the official documentation, it is described as being a classmethod, it is not quite so: as an object, it behaves more like a static method, in the sense that Python does not automatically fill in the first parameter when one calls MyClass.__new__(), i.e., you'd have to call MyClass.__new__(MyClass) for it to work. (I am taking a step back here: this info applies to all classes, metaclasses and ordinary classes alike.)

When you call MyClass() to create a new instance, then Python will call MyClass.__new__ and insert the cls parameter as first parameter.

With metaclasses, the call to create a new instance of the metaclass is triggered by the execution of the class statement and its class body. Likewise, Python fills in the first parameter to Metaclass.__new__, passing the metaclass itself.

When you call super().__new__ from within your metaclass's __new__, you are in the same situation as someone calling __new__ manually: the parameter specifying which class that __new__ should apply to has to be filled in explicitly.
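A small sketch of that static-method behaviour on an ordinary class (MyClass here is just an illustrative stand-in):

```python
class MyClass:
    def __new__(cls, *args):
        return super().__new__(cls)


# Python stores __new__ as a static method, even without a decorator.
stored_as_static = isinstance(MyClass.__dict__["__new__"], staticmethod)

# Calling it by hand: nothing is filled in automatically.
try:
    MyClass.__new__()
except TypeError as exc:
    no_arg_error = exc

# The class has to be passed explicitly.
instance = MyClass.__new__(MyClass)
```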

Now, what is confusing you is that you are writing the first parameter to __new__ as self, which would be correct if it were an instance of the metaclass (i.e. an ordinary class). As it is, that parameter is a reference to the metaclass itself.

The docs do not give an official or recommended name for the first parameter of a metaclass __new__, but usually it is something along the lines of mcls, mcs, metaclass or metacls, to distinguish it from cls, which is the usual name for the first parameter of a non-metaclass __new__ method. In a metaclass, the "class" (cls) is what is created by the ultimate call to type.__new__ (either hardcoded, or using super()). The return value of that call is the newborn class (it can be further modified in the __new__ method after the call to the superclass), and when it is returned, the call to __init__ takes place normally.

So, I will just comment further on calling type(name, bases, namespace) instead of type.__new__(mcls, name, bases, namespace): the first form will just create a plain class, as if the metaclass had not been used at all (lines in the metaclass __new__ that modify the namespace or bases, of course, have their effect). But the resulting class will have type as its metaclass, and subclasses of it won't call the metaclass at all. (For the record, it works as a "pre-class decorator", which can act on class parameters before the class is created, and it could even be an ordinary function instead of a class with a __new__ method; the call to type is what will create the new class after all.)
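To illustrate the "pre-class decorator" remark, here is a sketch with an ordinary function in the metaclass slot. upper_attrs, Config and SubConfig are hypothetical names; the function upper-cases non-dunder attribute names before handing everything to type.

```python
def upper_attrs(name, bases, namespace):
    """Plain function used as a 'metaclass': rewrites the namespace, then calls type."""
    new_namespace = {
        (key if key.startswith("__") else key.upper()): value
        for key, value in namespace.items()
    }
    return type(name, bases, new_namespace)


class Config(metaclass=upper_attrs):
    debug = True


class SubConfig(Config):  # upper_attrs is not involved here at all
    verbose = False
```

Config ends up with DEBUG instead of debug, but its metaclass is plain type, so SubConfig keeps verbose untouched.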

A simple way to check if the metaclass is "bound" to your class is to check its type with type(MyClass) or MyClass.__class__ .

Upvotes: 6
