Reputation: 98048
First time using dataclass, also not really good at Python. The following behaviour conflicts with my understanding so far:
from dataclasses import dataclass

@dataclass
class X:
    x: int = 1
    y: int = 2

@dataclass
class Y:
    c1: X = X(3, 4)
    c2: X = X(5, 6)

n1 = Y()
n2 = Y()
print(id(n1.c1))
print(id(n2.c1))
n1.c1.x = 99999
print(n2)
This prints
140459664164272
140459664164272
Y(c1=X(x=99999, y=4), c2=X(x=5, y=6))
Why does c1 behave like a class variable? What can I do to keep n1.c1 and n2.c1 as separate objects? Do I need to write an __init__ function?
I can get sensible results with this addition to Y:
    def __init__(self):
        self.c1 = X(3, 4)
        self.c2 = X(5, 6)
With that change, the script prints:
140173334359840
140173335445072
Y(c1=X(x=3, y=4), c2=X(x=5, y=6))
Upvotes: 5
Views: 1738
Reputation: 16526
Why does c1 behave like a class variable?
Because you specified default values for them, so they are now class attributes. The "Mutable default values" section of the dataclasses documentation says:
Python stores default member variable values in class attributes.
But look at this:
from dataclasses import dataclass

@dataclass
class X:
    x: int = 1
    y: int = 2

@dataclass
class Y:
    c1: X
    c2: X = X(5, 6)

print("c1" in Y.__dict__)  # False
print("c2" in Y.__dict__)  # True
Here c1 doesn't have a default value, so it's not in the class's namespace.
When you do define default values, c1 and c2 end up in both the instance's namespace (n1.__dict__) and the class's namespace (Y.__dict__), because the generated __init__ assigns the default object to each instance. They are the same objects; only the reference is passed:
from dataclasses import dataclass

@dataclass
class X:
    x: int = 1
    y: int = 2

@dataclass
class Y:
    c1: X = X(3, 4)
    c2: X = X(5, 6)

n1 = Y()
n2 = Y()
print("c1" in Y.__dict__)   # True
print("c1" in n1.__dict__)  # True
print(id(n1.c1))  # 140037361903232
print(id(n2.c1))  # 140037361903232
print(id(Y.c1))   # 140037361903232
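To make the mechanism concrete, here is a rough hand-written approximation of what the decorator sets up for a field with a default. It is only a sketch: the real generated __init__ differs in detail, and the module-level name _DEFAULT_C1 is purely illustrative. Plain classes are used so it does not depend on any particular dataclasses behaviour.

```python
class X:
    def __init__(self, x=1, y=2):
        self.x = x
        self.y = y


# Hypothetical name: the single default object, created once
# when the class body is executed.
_DEFAULT_C1 = X(3, 4)


class Y:
    # The default is kept as a class attribute ...
    c1 = _DEFAULT_C1

    # ... and the very same object is the default argument of __init__.
    def __init__(self, c1=_DEFAULT_C1):
        self.c1 = c1  # stores a reference, not a copy


n1 = Y()
n2 = Y()
n1.c1.x = 99999

print(n1.c1 is n2.c1 is Y.c1)  # True
print(n2.c1.x)                 # 99999
```

Every Y() call that omits c1 receives the one object built at class-definition time, which is why a change made through n1 shows up on n2.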
So if you want them to be different objects, you have several options. One is to pass the X instances explicitly when instantiating:
from dataclasses import dataclass

@dataclass
class X:
    x: int = 1
    y: int = 2

@dataclass
class Y:
    c1: X = X(3, 4)
    c2: X = X(5, 6)

n1 = Y(X(3, 4), X(5, 6))
n2 = Y(X(3, 4), X(5, 6))
print("c1" in Y.__dict__)   # True
print("c1" in n1.__dict__)  # True
print(id(n1.c1))  # 140058585069264
print(id(n2.c1))  # 140058584543104
print(id(Y.c1))   # 140058585065088
Another is to use field and pass default_factory:
from dataclasses import dataclass, field

@dataclass
class X:
    x: int = 1
    y: int = 2

@dataclass
class Y:
    c1: X = field(default_factory=lambda: X(3, 4))
    c2: X = field(default_factory=lambda: X(5, 6))

n1 = Y()
n2 = Y()
print("c1" in Y.__dict__)   # False
print("c1" in n1.__dict__)  # True
print(id(n1.c1))  # 140284815353136
print(id(n2.c1))  # 140284815353712
In the second option, because I didn't specify the default parameter (you can't pass both default and default_factory), nothing is stored in the class's namespace. field(default=SOMETHING) is another way of saying = SOMETHING.
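As a quick check of the default_factory approach against the original problem, the sketch below repeats the mutation from the question. It is also worth knowing that on Python 3.11 and later, dataclasses rejects an unhashable default such as X(3, 4) with a ValueError when the class is defined, so default_factory is the form that keeps working there.

```python
from dataclasses import dataclass, field


@dataclass
class X:
    x: int = 1
    y: int = 2


@dataclass
class Y:
    # Each factory is called once per Y() that omits the argument,
    # so every instance gets its own fresh X.
    c1: X = field(default_factory=lambda: X(3, 4))
    c2: X = field(default_factory=lambda: X(5, 6))


n1 = Y()
n2 = Y()
n1.c1.x = 99999  # mutate only n1's copy

print(n1.c1 is n2.c1)  # False
print(n2)              # Y(c1=X(x=3, y=4), c2=X(x=5, y=6))
```

A lambda is needed here only because the defaults take arguments; for a no-argument default, field(default_factory=X) would do the same job.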
Upvotes: 5