Ben Kovitz

Reputation: 5020

Conflict between mix-ins for abstract dataclasses

1. A problem with dataclass mix-ins, solved

To make abstract dataclasses that type-check under mypy, I've been breaking them into two classes, one that contains the abstract methods and one that contains the data members, as explained in this answer. The abstract class inherits from the dataclass. This runs into a problem, though, when another abstract-class-and-dataclass pair inherits from the first one: the "ancestor" dataclass's fields get wiped out by the "descendant". For example:

from dataclasses import dataclass
from abc import ABC, abstractmethod

@dataclass
class ADataclassMixin:
    a_field: int = 1

class A(ADataclassMixin, ABC):
    @abstractmethod
    def method(self):
        pass

@dataclass
# class BDataclassMixin(A):  # works, but fails the mypy 0.931 type-check
class BDataclassMixin:  # fails
    b_field: int = 2

class B(BDataclassMixin, A):
    def method(self):
        return self

o = B(a_field=5)

The last line fails, yielding this error message:

TypeError: BDataclassMixin.__init__() got an unexpected keyword argument 'a_field'

B's method-resolution order (B.__mro__) is (B, BDataclassMixin, A, ADataclassMixin, ABC, object), as expected. But a_field is not found.
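To see why a_field is not found even though ADataclassMixin is in the MRO, it can help to check which class actually supplies B's __init__. The following is a minimal sketch that reuses the classes from the example above; init_owner is a name introduced here only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields


@dataclass
class ADataclassMixin:
    a_field: int = 1


class A(ADataclassMixin, ABC):
    @abstractmethod
    def method(self):
        pass


@dataclass
class BDataclassMixin:
    b_field: int = 2


class B(BDataclassMixin, A):
    def method(self):
        return self


# B defines no __init__ of its own, so lookup walks the MRO and stops at
# the first class that does define one.
init_owner = next(cls for cls in B.__mro__ if "__init__" in vars(cls))
print(init_owner.__name__)  # BDataclassMixin

# That __init__ was generated from BDataclassMixin's own fields only.
print([f.name for f in fields(BDataclassMixin)])  # ['b_field']

# a_field is still reachable, but only as an inherited class-level default.
print(B().a_field)  # 1
```

So the field is not gone from the class; it is just not a parameter of the __init__ that B ends up using.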

A solution, shown in the commented-out line above, is to name the ancestor class explicitly in the descendant dataclass's declaration: class BDataclassMixin(A) instead of class BDataclassMixin. This fails type-checking, though, because mypy requires a dataclass to be a concrete class.

2. A problem with that solution, unsolved

The above solution breaks down if we add a third class, inheriting from B:

@dataclass
# class CDataclassMixin:  # fails
class CDataclassMixin(A):  # fails
# class CDataclassMixin(B, A):  # works, but fails type-check
    c_field: int = 3

class C(CDataclassMixin, B):
    def method(self):
        return "C's result"

o = C(b_field=5)

Now C accepts a_field and c_field but has lost b_field, so the last line raises a TypeError.
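One way to confirm which fields survive is to look at the parameters of the __init__ that C inherits. This is only a sketch built from the classes above, using the CDataclassMixin(A) variant; params is an illustrative name.

```python
import inspect
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ADataclassMixin:
    a_field: int = 1


class A(ADataclassMixin, ABC):
    @abstractmethod
    def method(self):
        pass


@dataclass
class BDataclassMixin(A):
    b_field: int = 2


class B(BDataclassMixin, A):
    def method(self):
        return self


@dataclass
class CDataclassMixin(A):
    c_field: int = 3


class C(CDataclassMixin, B):
    def method(self):
        return "C's result"


# C takes its __init__ from CDataclassMixin, which only ever saw A's
# fields and its own.
params = list(inspect.signature(C.__init__).parameters)
print(params)  # ['self', 'a_field', 'c_field']

try:
    C(b_field=5)
except TypeError as exc:
    print(exc)
```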

I have found that if I declare CDataclassMixin explicitly to inherit from B and A (in that order), b_field will be in the resulting class along with a_field and c_field. However, explicitly stating the inheritance hierarchy in every mix-in defeats the purpose of mix-ins, which is to be able to code them independently of all the other mix-ins and to combine them easily, in any way you like.

What is the correct way to make abstract dataclass mix-ins, so that classes that inherit from them include all the dataclass fields?

Upvotes: 3

Views: 1646

Answers (2)

hussic

Reputation: 1920

Putting the mixin as the last base class works without error:

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ADataclassMixin:
    a_field: int = 1


class A(ABC, ADataclassMixin):

    @abstractmethod
    def method(self):
        pass


@dataclass
class BDataclassMixin:  
    b_field: int = 2


class B(A, BDataclassMixin):

    def method(self):
        return self


o = B(a_field=5)
print((o.a_field, o.b_field))  # (5, 2)

Upvotes: 1

Ben Kovitz

Reputation: 5020

The correct solution is to abandon the DataclassMixin classes and simply make the abstract classes into dataclasses, like this:

from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass  # type: ignore[misc]
class A(ABC):
    a_field: int = 1

    @abstractmethod
    def method(self):
        pass

@dataclass  # type: ignore[misc]
class B(A):
    b_field: int = 2

@dataclass
class C(B):
    c_field: int = 3

    def method(self):
        return self

The reason for the failures is that, as explained in the documentation on dataclasses, the complete set of fields in a dataclass is determined when the @dataclass decorator processes the class, not when the class is later inherited from. The internal code that generates the dataclass's __init__ function can only examine the MRO of the dataclass as it is declared on its own, not as it will be when mixed into another class.
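As a quick check of this explanation, the sketch below applies dataclasses.fields() to the directly inheriting classes from this answer. Because each class is decorated with its real bases in place, every ancestor's fields are collected.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields


@dataclass  # type: ignore[misc]
class A(ABC):
    a_field: int = 1

    @abstractmethod
    def method(self):
        pass


@dataclass  # type: ignore[misc]
class B(A):
    b_field: int = 2


@dataclass
class C(B):
    c_field: int = 3

    def method(self):
        return self


# Fields are gathered from the bases visible when @dataclass runs on C,
# in base-to-derived order.
print([f.name for f in fields(C)])  # ['a_field', 'b_field', 'c_field']

# The generated __init__ therefore accepts every ancestor's field.
o = C(b_field=5)
print((o.a_field, o.b_field, o.c_field))  # (1, 5, 3)
```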

It's necessary to add # type: ignore[misc] to each abstract dataclass's @dataclass line, not because the solution is wrong but because mypy is wrong. It is mypy, not Python, that requires dataclasses to be concrete. As explained by ilevkivskyi in mypy issue 5374, the problem is that mypy wants a dataclass to be a Type object and for every Type object to be capable of being instantiated. This is a known problem and awaits a resolution.

The behavior in the question and in the solution is exactly how dataclasses should behave. And, happily, abstract dataclasses that inherit this way (the ordinary way) can be mixed into other classes willy-nilly, no differently from other mix-ins.
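Here is a small illustration of that last point, assuming a hypothetical non-dataclass mix-in called DescribeMixin and a hypothetical concrete class D (neither appears in the question). A stays abstract, while D picks up a_field through ordinary inheritance.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass  # type: ignore[misc]
class A(ABC):
    a_field: int = 1

    @abstractmethod
    def method(self):
        pass


class DescribeMixin:
    """Hypothetical plain mix-in; it knows nothing about dataclasses."""

    def describe(self):
        return f"{type(self).__name__} with a_field={self.a_field}"


@dataclass
class D(DescribeMixin, A):
    d_field: int = 4

    def method(self):
        return self.d_field


# The abstract dataclass still cannot be instantiated at runtime.
try:
    A()
except TypeError as exc:
    print("cannot instantiate A:", exc)

d = D(a_field=7)
print(d.describe())  # D with a_field=7
print(d.method())  # 4
```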

Upvotes: 4
