Lei Hao

Reputation: 799

What is the difference between `cached_property` and `field(init=False)` in a Python dataclass?

I have a dataset that needs to be loaded from a database. I'm wondering what the difference is between the following two ways of handling it.

import pandas as pd
from dataclasses import dataclass, field

@dataclass
class A:
    df: pd.DataFrame = field(init=False)

    def load_df(self):
        self.df = query_from_database()

and

import pandas as pd
from dataclasses import dataclass, field
from functools import cached_property

@dataclass
class A:
    
    @cached_property
    def df(self):
        df = query_from_database()
        return df

Upvotes: 0

Views: 42

Answers (2)

sahasrara62

Reputation: 11237

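In method 2 (`cached_property`):

Once the value is initialised, it stays cached on the instance until you delete the attribute (`del a.df`) or discard the instance. This suits data that is read often and rarely changes. The cached result is stored in the instance's `__dict__`, so a large dataset keeps occupying memory for as long as it stays cached.

In method 1 (`field(init=False)`): you have more control over when the data is loaded and reloaded. The data is also held in memory, but you can call `load_df()` again to refresh it without destroying the instance.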
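As a rough illustration of the reload point, the sketch below shows that a `cached_property` value can be dropped with `del`, after which the next access runs the query again. The `query_from_database` function and the `calls` list are hypothetical stand-ins (a plain list replaces the DataFrame so the example runs without pandas).

```python
from dataclasses import dataclass
from functools import cached_property

calls = []  # hypothetical log of simulated database hits


def query_from_database():
    # Hypothetical stand-in for the real query; returns a plain list
    calls.append("query")
    return [0, 0, 0]


@dataclass
class Cached:
    @cached_property
    def df(self):
        return query_from_database()


c = Cached()
c.df       # first access runs the query and caches the result
c.df       # served from the instance __dict__, no query
del c.df   # drops the cached value from the instance
c.df       # runs the query again
print(len(calls))  # 2
```

Note that `del c.df` raises `AttributeError` if nothing has been cached yet, so a reload helper may want to guard against that.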

Upvotes: 0

user28633938

Reputation: 1

import pandas as pd
import numpy as np
from dataclasses import dataclass, field
from functools import cached_property

def query_from_database():
    print("query_from_database")
    return pd.DataFrame(np.zeros((3, 4)))

@dataclass
class A:
    df: pd.DataFrame = field(init=False)

    def load_df(self):
        self.df = query_from_database()

@dataclass
class B:
    @cached_property
    def df(self):
        return query_from_database()


if __name__ == '__main__':
    a = A()
    a.load_df()
    a.load_df()
    # prints 'query_from_database' twice: every load_df() call runs the query

    # With @cached_property, df is accessed like an attribute (b.df).
    # The first access runs the body of df(self) and caches the result on the instance.
    # Later accesses return the cached value without running the body again.
    b = B()
    b.df
    b.df
    # prints 'query_from_database' only once
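
Two further differences may be worth checking, sketched here under the assumption that both classes are decorated with `@dataclass` as in the question: a `field(init=False)` attribute is a real dataclass field but does not exist until `load_df()` is called, while a `cached_property` is not a field at all and loads itself on first access. The class names and the list-returning `query_from_database` are hypothetical stand-ins.

```python
from dataclasses import dataclass, field, fields
from functools import cached_property


def query_from_database():
    # Hypothetical stand-in for the real query; returns a plain list
    return [0, 0, 0]


@dataclass
class WithField:
    df: list = field(init=False)

    def load_df(self):
        self.df = query_from_database()


@dataclass
class WithCachedProperty:
    @cached_property
    def df(self):
        return query_from_database()


# Only the field(init=False) attribute is registered as a dataclass field
field_names = [f.name for f in fields(WithField)]
cached_names = [f.name for f in fields(WithCachedProperty)]

# Accessing the field before load_df() fails; there is no default value
w = WithField()
try:
    w.df
    loaded_early = True
except AttributeError:
    loaded_early = False

# The cached property loads on first access, no explicit load step needed
lazy = WithCachedProperty().df
```

Because `df` is a field in the first class, it also takes part in the generated `__repr__` and `__eq__`, so printing an instance before `load_df()` raises `AttributeError` as well.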

Upvotes: 0
