luochen1990

Reputation: 3847

How to make a dataclass-like decorator friendly for pylance?

I'm using Pylance with strict mode enabled, hoping for a better development experience.

It works well until I define a class decorator:

from dataclasses import dataclass
from typing import Any, Type

def struct(cls: Type[Any]) -> Type[Any]:
    # ... do some magic here ...
    return dataclass(frozen=True)(cls)

@struct
class Vec:
    x: int
    y: int

print(Vec(1, "abc"))  # no error reported here, and no hints about the constructor arguments either

Here, when I type Vec(, there are no hints about the types of the constructor arguments, and when I type Vec(1, "abc"), no type error is reported.

I also found that defining @struct as a generic function (instead of using Any) makes things even worse:

A = TypeVar("A")

def struct(cls: Type[A]) -> Type[A]:
    # ... do some magic here ...
    return dataclass(frozen=True)(cls)

@struct
class Vec:
    x: int
    y: int

print(Vec(1, 2)) # type error here: Expected no arguments to Vec

In this case, typing Vec(1, 2) produces the type error "Expected no arguments to Vec", which is not what I expect.

I hope there is some way to tell Pylance (or another static checker) about the metadata of the returned class (perhaps generated from the original class via typing.get_type_hints; I can promise that the metadata of the returned class is not modified dynamically afterwards).

I noticed that pylance can deal with @dataclass very well, so I think there might be some mechanism to achieve that.

Is there any way to do that, or is @dataclass just special-cased by Pylance?

Upvotes: 5

Views: 1990

Answers (1)

kolokkol

Reputation: 183

If I understand your problem correctly, PEP 681 (Data Class Transforms) may be able to help you, provided that you are able to use Python 3.11 (at the time of writing, only pre-release versions of Python 3.11 are available).

Data class transforms were added to allow library authors to annotate functions or classes which provide behaviour similar to dataclasses.dataclass. The intended use case seems to be exactly what you are describing: letting static type checkers understand code that is generated dynamically in a "dataclass-like" way. The PEP introduces a single decorator, typing.dataclass_transform. This decorator can be used to mark functions or classes that dynamically generate "dataclass-like" classes. If necessary, the decorator also allows you to specify some details about the generated classes (e.g. whether __eq__ is implemented by default). For all details, you can check out PEP 681 or the documentation.

The most basic case would be changing your code to

from dataclasses import dataclass
from typing import Any, Type, dataclass_transform  # dataclass_transform requires Python 3.11+

@dataclass_transform()
def struct(cls: Type[Any]) -> Type[Any]:
    # ... do some magic here ...
    return dataclass(frozen=True)(cls)

If you now write

print(Vec(1, "abc"))

You will get an error from Pylance:

Argument of type "Literal['abc']" cannot be assigned to parameter "y" of type "int" in function "__init__"
"Literal['abc']" is incompatible with "int" Pylance(reportGeneralTypeIssues)

If I understand correctly, dataclass_transform should also fix your second case.
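For that generic variant, a minimal sketch could look like the following. It reuses `struct` and `Vec` from the question and assumes Python 3.11+ (or the `typing_extensions` backport). Because the decorator is marked with `dataclass_transform`, a PEP 681-aware checker should synthesize an `__init__(self, x: int, y: int)` for `Vec` rather than report "Expected no arguments".

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Type, TypeVar

try:
    from typing import dataclass_transform  # Python 3.11+
except ImportError:
    from typing_extensions import dataclass_transform  # assumed backport

A = TypeVar("A")


@dataclass_transform()
def struct(cls: Type[A]) -> Type[A]:
    # ... do some magic here ...
    return dataclass(frozen=True)(cls)


@struct
class Vec:
    x: int
    y: int


v = Vec(1, 2)
try:
    setattr(v, "x", 10)  # frozen=True makes this fail at runtime
except FrozenInstanceError:
    print("Vec is frozen at runtime")
```

Note that nothing here tells the checker the class is frozen. As far as I know, PEP 681 defines a `frozen_default` parameter for that, but it is only available in `typing` from Python 3.12.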


Edit after a request for a more general mechanism

Using inheritance and metaclasses, you can push the boundaries of dataclass_transform a bit further. The special thing about dataclass_transform is that it allows you to annotate something which you normally cannot annotate: it allows static type checkers to infer that methods are generated dynamically, with a (@dataclass compatible) signature.

If you want all classes to have some common functionality, you can use inheritance instead of a class decorator, like in the example below:

import typing

@typing.dataclass_transform()
class BaseClass:

    def shared_method(self):
        print('This method is shared by all subclasses!')


class Vec(BaseClass):
    x: int 
    y: int 
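
Keep in mind that the decorator only informs the type checker; the base class still has to generate `__init__` itself at runtime. One possible way to do that, sketched here under the assumption that applying `dataclass` inside `__init_subclass__` fits your use case, is:

```python
from dataclasses import dataclass
from typing import Any

try:
    from typing import dataclass_transform  # Python 3.11+
except ImportError:
    from typing_extensions import dataclass_transform  # assumed backport


@dataclass_transform()
class BaseClass:

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # dataclass() adds __init__, __repr__, __eq__, ... to cls in place.
        dataclass(frozen=True)(cls)

    def shared_method(self) -> None:
        print('This method is shared by all subclasses!')


class Vec(BaseClass):
    x: int
    y: int


v = Vec(1, 2)
v.shared_method()
```

With this in place, every subclass of `BaseClass` is turned into a frozen dataclass when it is defined, which matches what the checker assumes.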

However, this is of course fairly limited. You probably want to add dynamically generated methods to your class. Luckily, we can do this too, but we will need a metaclass for it.

Consider the following example:

import math
import numbers
import typing
from dataclasses import dataclass

@typing.dataclass_transform()
class Metaclass(type):

    def __new__(cls,
                name: str,
                bases: tuple[type, ...],
                class_dict: dict[str, typing.Any],
                **kwargs: typing.Any):
        self = super().__new__(cls, name, bases, class_dict, **kwargs)
        annotations: dict[str, type] = getattr(self, '__annotations__', {})
        if annotations:
            # Sum the squares of every field annotated with a numeric type.
            squares = '+'.join(
                f'self.{field_name}**2'
                for field_name, data_type in annotations.items()
                if isinstance(data_type, type) and issubclass(data_type, numbers.Number)
            )
            source = f'def length(self): return math.sqrt({squares})'
            namespace: dict[str, typing.Any] = {}
            exec(source, globals(), namespace)
            setattr(self, 'length', namespace['length'])
        # You can also generate your own __init__ method here.
        # Sadly, PyLance does not agree with this line.
        # I do not know how to fix this.
        return dataclass(frozen=True)(self) 
    
class BaseClass(metaclass=Metaclass):

    def length(self) -> float:
        return 0.0


class Vec(BaseClass):
    x: int 
    y: int 


print(Vec(1, 2).length())

The metaclass scans all annotations, and generates a method length. It assumes that all its child classes are vectors, where all fields annotated with a numerical type are entries of the vector. The generated length method then uses these entries to compute the length of the vector.
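
To make the code-generation step easier to follow in isolation, here is a small sketch that builds and executes the same kind of source string outside of a metaclass. The helper `build_length_source` and the extra `label` field are hypothetical and only serve the illustration.

```python
import math
import numbers
from typing import Any, Dict


def build_length_source(annotations: Dict[str, Any]) -> str:
    # Hypothetical helper mirroring the metaclass body above:
    # keep only the fields annotated with a numeric type.
    squares = '+'.join(
        f'self.{field_name}**2'
        for field_name, data_type in annotations.items()
        if isinstance(data_type, type) and issubclass(data_type, numbers.Number)
    )
    return f'def length(self): return math.sqrt({squares})'


class Vec:
    x: int
    y: int
    label: str  # not numeric, so it is skipped


source = build_length_source(Vec.__annotations__)
print(source)  # def length(self): return math.sqrt(self.x**2+self.y**2)

# Compile the generated function; it needs `math` in its globals.
namespace: Dict[str, Any] = {}
exec(source, {'math': math}, namespace)
length = namespace['length']

v = Vec()
v.x, v.y = 3, 4
print(length(v))  # 5.0
```

If a class has annotations but none of them is numeric, `squares` is empty and the generated `math.sqrt()` call fails when `length` is invoked, so a real implementation would probably want to guard against that.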

The trouble we now face is how to make sure that Pylance knows that classes using this metaclass have a length method which returns a float. To achieve this, we first define a base class using this metaclass, which has a correctly annotated length method. All your vector classes can now inherit from this base class. They will have their length method generated automatically, and Pylance is happy too.

There are still limitations to this. You cannot generate methods with a dynamic signature. (e.g. you cannot generate a method def change_values(self, x: int, y: int)). This is because there is simply no way to annotate that in the child class. That is part of the magic of dataclass_transform: the ability to annotate a dynamic signature for the __init__ method.

Upvotes: 5
