fsan

Reputation: 118

Dataclass - why are attributes treated differently from normal classes?

I understand dataclass as a decorator that creates __init__, __repr__ and other methods 'automatically'.

But I noticed some behavior that I did not expect, and I would like to know whether it is intended, because I could not find anything about it in the official documentation (or at least nothing that I recognized as related).

First example:

from dataclasses import dataclass

class CustomObj():
    def __init__(self, x):
        self.x = x
        print(f'called custom obj with {x}')

class Normal():
    i : int
    o : CustomObj
    f : float = 100.
    s : str = 'this is a string'

@dataclass
class Data():
    i : int
    o : CustomObj
    f : float = 100.
    s : str = 'this is a string'


object_1 = Normal()
object_2 = Data(i = 1., o = CustomObj('custom_from_2'))

try:
    object_1.i
except AttributeError:
    print('it is ok, detecting an expected attribute error')

assert object_2.i == 1.
print('it is ok, because dataclass makes us set i value')

assert object_1.f == object_2.f
print('it is ok, because it is an "native" value')

try:
    object_1.o
except AttributeError:
    print('it is ok, detecting an expected attribute error, we didnt set o for object_1')

assert Normal.f == Data.f
print('it is ok, both classes have the same values and we did mess with it yet')

object_1.o = CustomObj('custom_from_1')
print('we set a customObj for obj_1 here')

Normal.f = 222.
Data.f = 222.

assert Normal.f == Data.f
print('it is ok, we set both values to same thing')

assert object_1.f == 222 and object_2.f == 100.
print(f'1: {object_1.f} 2: {object_2.f}')
print('By setting Normal.f we set object_1.f to 222 but object_2.f still 100')

object_1.s = 'changing object_1.s to something else'
object_2.s = 'changing object_2.s to something else'

Normal.s = 'changing Normal.s to something else'
Data.s = 'changing Data.s to something else'

print(object_1.s, object_2.s)
print(Normal.s, Data.s)

object_3 = Normal()
object_4 = Data(i = 4., o = CustomObj('custom_from_4'))

assert object_3.s == 'changing Normal.s to something else'
print('it is expected to have new value for the class definitions of Normal here')

print(f'Normal.s: {Normal.s}')
print(f'Data.s: {Data.s}')
print(f'object_1.s: {object_1.s}')
print(f'object_2.s: {object_2.s}')
print(f'object_3.s: {object_3.s}')
print(f'object_4.s: {object_4.s}')

The output is:

called custom obj with custom_from_2
it is ok, detecting an expected attribute error
it is ok, because dataclass makes us set i value
it is ok, because it is an "native" value
it is ok, detecting an expected attribute error, we didnt set o for object_1
it is ok, both classes have the same values and we did mess with it yet
called custom obj with custom_from_1
we set a customObj for obj_1 here
it is ok, we set both values to same thing
1: 222.0 2: 100.0
By setting Normal.f we set object_1.f to 222 but object_2.f still 100
changing object_1.s to something else changing object_2.s to something else
changing Normal.s to something else changing Data.s to something else
called custom obj with custom_from_4
it is expected to have new value for the class definitions of Normal here
Normal.s: changing Normal.s to something else
Data.s: changing Data.s to something else
object_1.s: changing object_1.s to something else
object_2.s: changing object_2.s to something else
object_3.s: changing Normal.s to something else
object_4.s: this is a string

My three questions are:

  1. Why is changing Normal.i different from changing Data.i when checking object_1.i and object_2.i?
  2. Why has Data.s changed but not object_4.s?
  3. Where is this behavior stated in the documentation?

My guess is that the decorator hands back something like a reference to a kind of __new__ operator, so that changing the values on the class definition is supposed to behave this way. But I could not find where the docs state this, so I am confused.

Does anybody have a clue?

Upvotes: 1

Views: 293

Answers (1)

Mia

Reputation: 2676

In a nutshell, the @dataclass decorator transforms the class definition by collecting its fields from the type annotations. When you can't find something in the documentation, the best way to understand what's going on is to look at the source code.

If you go to the definition of dataclass, you can see that it returns the class after it has been processed by _process_class(). Inside that function, you will find that it attaches a newly generated initializer to the decorated class, which is basically what you guessed.
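You can also see the effect of that generated initializer without reading the source. The following is a minimal sketch, assuming a trimmed-down version of your Data class (the CustomObj field is left out to keep it short); it only uses inspect.signature and dataclasses.fields from the standard library:

```python
import inspect
from dataclasses import dataclass, fields


@dataclass
class Data:
    i: int
    f: float = 100.0
    s: str = "this is a string"


# The decorator attached a generated __init__ built from the annotations.
signature = inspect.signature(Data.__init__)
print(signature)

# fields() exposes the metadata that @dataclass collected for each annotation,
# including the default value that was present when the decorator ran.
defaults = {field.name: field.default for field in fields(Data)}
print(defaults["f"], defaults["s"])
```

The parameters of the generated __init__ follow the order of the annotations, and the defaults are the values that were on the class when it was decorated.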

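As @juanpa.arrivillaga has pointed out, the reason Normal.f behaves differently from Data.f (your code rebinds f rather than i) is that the __init__ generated by @dataclass assigns every field on the instance, so object_2.f is an instance attribute, while object_1.f is looked up on the class. This is also why setting Data.s has no effect on object_4.s: the default passed to __init__ was captured when the decorator ran, and the instance attribute it sets shadows the class attribute.

Lastly, this behavior is not described in much detail in the docs themselves, but it is covered in the linked PEP 557, which states the exact effects of adding @dataclass.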
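To make the instance-versus-class distinction concrete, here is a small sketch that reduces your example to a single field. The class names mirror yours, but the variable names normal_obj and data_obj are just for illustration:

```python
from dataclasses import dataclass


class Normal:
    s: str = "original"


@dataclass
class Data:
    s: str = "original"


normal_obj = Normal()
data_obj = Data()

# Normal() stores nothing on the instance; Data() stores its own copy of s.
print(vars(normal_obj))  # {}
print(vars(data_obj))    # {'s': 'original'}

# Rebinding the class attributes afterwards...
Normal.s = "changed"
Data.s = "changed"

# ...is visible through the plain instance, which falls back to the class,
# but not through the dataclass instance, whose own attribute shadows it.
print(normal_obj.s)  # changed
print(data_obj.s)    # original

# New dataclass instances still receive the default captured at decoration time.
print(Data().s)      # original
```

If you really want later instances to pick up a changed default, one option is to pass the value explicitly when constructing the object instead of rebinding the class attribute.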

Upvotes: 3
