Mayank

Reputation: 23

How to not instantiate dataclass members as class variables in Python?

By default, it seems that Python treats the class members as ClassVar, even though the documentation says otherwise: https://docs.python.org/3/library/dataclasses.html#class-variables

One of two places where dataclass() actually inspects the type of a field is to determine if a field is a class variable as defined in PEP 526. It does this by checking if the type of the field is typing.ClassVar. If a field is a ClassVar, it is excluded from consideration as a field and is ignored by the dataclass mechanisms. Such ClassVar pseudo-fields are not returned by the module-level fields() function.

A minimal reproduction of the issue:

from dataclasses import dataclass
import numpy as np

@dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0

@dataclass
class NumberColor:
    color: Color = Color()
    number: int = 0
    alpha = np.zeros(3)

x = NumberColor()
y = NumberColor()

print("Id of x:", id(x))
print("Id of y:", id(y))
print('-----------')
print("Id of x.color:", id(x.color))
print("Id of y.color:", id(y.color))
print('-----------')
print("Id of x.number:", id(x.number))
print("Id of y.number:", id(y.number))
print('-----------')
print("Id of x.alpha:", id(x.alpha))
print("Id of y.alpha:", id(y.alpha))
print('-----------')

x.color.r = 255
x.number = 10

print(x.__dict__)
print(y.__dict__)

Yields:

Id of x: 140660354709392
Id of y: 140660357249008
-----------
Id of x.color: 140660355994096
Id of y.color: 140660355994096
-----------
Id of x.number: 9788960
Id of y.number: 9788960
-----------
Id of x.alpha: 140660289932624
Id of y.alpha: 140660289932624
-----------
{'color': Color(r=255, g=0, b=0), 'number': 10}
{'color': Color(r=255, g=0, b=0), 'number': 0}

Upvotes: 2

Views: 899

Answers (1)

Blckknght

Reputation: 104712

No, the color attribute of your NumberColor class isn't being considered a class variable. Rather, it's a shared default argument to the __init__ method that's being created for your class by the dataclass decorator.

It's not much different from any other mutable default argument problem:

def foo(c = Color()): # default value is created here
    return c

x = foo()    # get the default value
print(x.r)   # 0, as is expected
x.r = 255

y = foo()    # get the default again
print(y.r)   # 255, not 0 as you might naively expect

In your code, the equivalents of x and y are the x.color and y.color attributes.
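As a rough illustration of why the default is shared, here is a hand-written class that behaves like the generated __init__. This is only an approximation, not the code the decorator actually emits: the plain Color class and the _DEFAULT_COLOR name are made up for the sketch.

```python
class Color:
    """Plain stand-in for the Color dataclass from the question."""
    def __init__(self, r=0, g=0, b=0):
        self.r, self.g, self.b = r, g, b


# Hypothetical name: the default is built once, when the class body runs.
_DEFAULT_COLOR = Color()


class NumberColor:
    # Every call that omits `color` receives the very same Color object.
    def __init__(self, color=_DEFAULT_COLOR, number=0):
        self.color = color
        self.number = number


x = NumberColor()
y = NumberColor()
x.color.r = 255

print(x.color is y.color)  # True: both instances share one Color
print(y.color.r)           # 255: the change made through x is visible via y
```

Rebinding x.number, by contrast, only replaces the reference stored on x, which is why number looked independent in your output.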

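To avoid this issue, assign a dataclasses.field object with a default_factory argument to the name in the class body; for color, the factory is the Color class itself. You'll want to do the same for the numpy array used as the default for alpha, and alpha also needs a type annotation: without one, dataclass does not treat it as a field at all, so it stays a plain class attribute shared by every instance (which is why it is missing from the __dict__ output in your question):

from dataclasses import dataclass, field
import numpy as np

@dataclass
class NumberColor:
    color: Color = field(default_factory=Color)  # use a field!
    number: int = 0
    # here too; the annotation is what makes alpha a field
    alpha: np.ndarray = field(default_factory=lambda: np.zeros(3))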

There's no need to use field for number (or for the attributes in Color) because int objects are immutable, so it doesn't matter whether the same object is shared for every 0 value.
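If you want to check the default_factory approach yourself, a minimal self-contained sketch follows. To keep it dependency-free, a plain list stands in for np.zeros(3); that substitution is mine, not part of your code. Note that alpha carries a type annotation here, since dataclass only turns annotated names into fields.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclass
class NumberColor:
    color: Color = field(default_factory=Color)
    number: int = 0
    # Assumption: a list replaces np.zeros(3) so the sketch needs no numpy.
    alpha: list = field(default_factory=lambda: [0.0, 0.0, 0.0])


x = NumberColor()
y = NumberColor()
x.color.r = 255
x.alpha[0] = 1.0

print(x.color is y.color)  # False: each instance gets its own Color
print(y.color.r)           # 0
print(y.alpha)             # [0.0, 0.0, 0.0]
print([f.name for f in fields(NumberColor)])  # ['color', 'number', 'alpha']
```

The factory is called once per instance inside __init__, so mutating x no longer affects y.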

Upvotes: 3
