Reputation: 515
I'm trying to get familiar with dataclasses in Python. One thing I learned from reading online is that turning a regular class that has a mutable class variable (which is a bad thing) into a dataclass is supposed to prevent that mistake. For example:
regular class:
class A:
    a = []

    def __init__(self):
        self.b = 1
This can cause problems, because different instances share the same class variable a and may modify it unknowingly.
and with dataclass:
@dataclass
class A:
    a: list = []

    def __init__(self):
        self.b = 1
This does not let me define the class at all; it raises an error:
ValueError: mutable default <class 'list'> for field a is not allowed: use default_factory
However, if I simply remove the type annotation:
@dataclass
class A:
    a = []

    def __init__(self):
        self.b = 1
there is no complaint at all, and a is still shared across different instances.
Is this expected?
Why would a simple type annotation change the behavior of the class variable?
(I'm using Python 3.7.6.)
Upvotes: 3
Views: 2045
Reputation: 454
When you declare
@dataclass
class A:
    a = []

    def __init__(self):
        self.b = 1
a is not a dataclass field. REF: https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-302987864
You can inspect the __dataclass_fields__ and __annotations__ attributes after declaring the class.
In [55]: @dataclass
    ...: class A:
    ...:     a: list = field(default_factory=list)
    ...:
    ...:     def __init__(self):
    ...:         self.b = 1
    ...:
In [56]: A.__dict__
Out[56]:
mappingproxy({'__module__': '__main__',
'__annotations__': {'a': list},
'__init__': <function __main__.A.__init__(self)>,
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': 'A()',
'__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
'__dataclass_fields__': {'a': Field(name='a',type=<class 'list'>,default=<dataclasses._MISSING_TYPE object at 0x7f8a27ada250>,default_factory=<class 'list'>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)},
'__repr__': <function dataclasses.__repr__(self)>,
'__eq__': <function dataclasses.__eq__(self, other)>,
'__hash__': None})
In [57]: @dataclass
    ...: class A:
    ...:     a = []
    ...:
    ...:     def __init__(self):
    ...:         self.b = 1
    ...:
In [58]: A.__dict__
Out[58]:
mappingproxy({'__module__': '__main__',
'a': [],
'__init__': <function __main__.A.__init__(self)>,
'__dict__': <attribute '__dict__' of 'A' objects>,
'__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': 'A()',
'__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
'__dataclass_fields__': {},
'__repr__': <function dataclasses.__repr__(self)>,
'__eq__': <function dataclasses.__eq__(self, other)>,
'__hash__': None})
From PEP 557:
The dataclass decorator examines the class to find fields. A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation.
REF: How to add a dataclass field without annotating the type?
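The same distinction can be checked with dataclasses.fields(). This is only a minimal sketch: the class names WithAnnotation and WithoutAnnotation are made up for illustration, and the hand-written __init__ from the question is left out so that the generated one is used.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WithAnnotation:
    # Annotated, so the decorator records it as a dataclass field.
    a: list = field(default_factory=list)


@dataclass
class WithoutAnnotation:
    # No annotation, so this stays an ordinary class attribute.
    a = []


annotated_names = [f.name for f in fields(WithAnnotation)]
unannotated_names = [f.name for f in fields(WithoutAnnotation)]

# The unannotated list is a single object shared by every instance.
x, y = WithoutAnnotation(), WithoutAnnotation()
x.a.append(1)
seen_from_y = y.a

# Each annotated instance gets its own list from the factory.
m, n = WithAnnotation(), WithAnnotation()
m.a.append(1)
```

Here fields(WithoutAnnotation) comes back empty, which matches the empty __dataclass_fields__ in the second session above.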
The check is performed only on dataclass fields, not on plain class variables. Here is the check on fields that raises the error:
if f._field_type is _FIELD and isinstance(f.default, (list, dict, set)):
Why mutable types are not allowed: https://docs.python.org/3/library/dataclasses.html#mutable-default-values
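As a rough sketch of what the error message is asking for, the annotated list can be given a default_factory. The class names Rejected and Fixed are hypothetical, and the ClassVar line is an extra illustration of a deliberately shared attribute, not something from the question.

```python
from dataclasses import dataclass, field
from typing import ClassVar, List

# An annotated mutable default is rejected when the class is created.
try:
    @dataclass
    class Rejected:
        a: list = []
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None


@dataclass
class Fixed:
    # A fresh list is built for every instance.
    a: List[int] = field(default_factory=list)
    # ClassVar marks an intentionally shared class attribute;
    # it is not treated as a regular field, so the check skips it.
    shared: ClassVar[List[int]] = []


p, q = Fixed(), Fixed()
p.a.append(1)       # only p.a changes
p.shared.append(2)  # visible through q as well
```

After this runs, q.a is still empty while q.shared contains the appended value, so sharing happens only where it was declared on purpose.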
Upvotes: 4