Reputation: 1577
I am trying to write a class that takes data as a dictionary, with the DataFrame IDs as string keys and the DataFrames as values, and creates attributes for accessing the data.
I was able to write a small example of a similar class, but it needs one method per DataFrame to be written out statically. Instead, I would like to loop over the data, taking in the keys for the dfs, and allow access to each df using attributes.
minimum working example
from dataclasses import dataclass
import pandas as pd

# re-writing as dataclass
@dataclass
class Dataset:
    # data container dictionary as class attribute
    dict = {'df1_id': pd.DataFrame({'col1': [1, 1]}),
            'df2_id': pd.DataFrame({'col2': [2, 2]}),
            'df3_id': pd.DataFrame({'col3': [3, 3]})}

    def df1_id(self) -> pd.DataFrame:  # method I would like to expose as an attribute
        return self.dict['df1_id']

    def df2_id(self) -> pd.DataFrame:  # same as the method above
        return self.dict['df2_id']

    def df3_id(self) -> pd.DataFrame:  # same as the method above
        return self.dict['df3_id']

    def dataframes_as_class_attributes(self):
        # store the dfs to access as class attributes,
        # replacing the 3 methods above
        return
result
datasets = Dataset()
print(datasets.df1_id())
expected result
datasets = Dataset()
print(datasets.df1_id) # class attribute created by looping through the dict object
Upvotes: 1
Views: 1206
Reputation: 5513
"taking in the keys for the dfs and allow for access to each df using attributes."
It seems that the only purpose of the class is to have attribute access syntax. In that case, it would be simpler to just create a namespace object.
from types import SimpleNamespace
import pandas as pd

class Dataset(SimpleNamespace):
    pass  # extend it possibly

data = {
    'df1_id': pd.DataFrame({'col1': [1, 1]}),
    'df2_id': pd.DataFrame({'col2': [2, 2]}),
    'df3_id': pd.DataFrame({'col3': [3, 3]})
}
datasets = Dataset(**data)
Output:
>>> datasets.df1_id
   col1
0     1
1     1
>>> datasets.df2_id
   col2
0     2
1     2
>>> datasets.df3_id
   col3
0     3
1     3
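If the keys still need to be looped over later, the namespace can be inspected with the built-in vars(), since a SimpleNamespace keeps its attributes in the instance __dict__. A minimal sketch, assuming the same data dictionary as above (the names frames and df4_id are only for illustration):

```python
from types import SimpleNamespace

import pandas as pd

data = {
    'df1_id': pd.DataFrame({'col1': [1, 1]}),
    'df2_id': pd.DataFrame({'col2': [2, 2]}),
    'df3_id': pd.DataFrame({'col3': [3, 3]}),
}

datasets = SimpleNamespace(**data)

# vars() returns the namespace's attribute dictionary, so the keys and
# DataFrames can be iterated just like the original dict.
frames = vars(datasets)
for name, df in frames.items():
    print(name, list(df.columns))

# Attributes can also be added after construction (illustrative key).
datasets.df4_id = pd.DataFrame({'col4': [4, 4]})
```

This keeps attribute access for everyday use while still allowing dictionary-style iteration when needed.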
Upvotes: 1
Reputation: 51
You could use setattr, like below:
from dataclasses import dataclass
import pandas as pd

@dataclass
class Dataset:
    dict_ = {'df1_id': pd.DataFrame({'col1': [1, 1]}),
             'df2_id': pd.DataFrame({'col2': [2, 2]}),
             'df3_id': pd.DataFrame({'col3': [3, 3]})}

    def __post_init__(self):
        # set each DataFrame as an attribute named after its key
        for key, val in self.dict_.items():
            setattr(self, key, val)
To avoid conflicts with Python keywords, and with built-in names such as dict, put a single trailing underscore after the variable name (PEP 8).
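Note that a dictionary defined in the class body like this is a class attribute shared by every instance. If each Dataset should carry its own data, one possible variant is to declare the dictionary as a dataclass field and still call setattr in __post_init__. This is only a sketch, assuming the dictionary is passed in at construction time; the field name frames is hypothetical and not part of the answer above:

```python
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class Dataset:
    # Hypothetical field: a per-instance mapping of id -> DataFrame.
    # repr=False keeps the generated __repr__ from dumping every DataFrame.
    frames: dict = field(default_factory=dict, repr=False)

    def __post_init__(self):
        # Expose every DataFrame as an instance attribute named after its key.
        for key, val in self.frames.items():
            setattr(self, key, val)


datasets = Dataset({
    'df1_id': pd.DataFrame({'col1': [1, 1]}),
    'df2_id': pd.DataFrame({'col2': [2, 2]}),
})
print(datasets.df1_id)
```

With this variant the attributes exist only on the instance that received the data, so a bare Dataset() has no df1_id attribute. A key equal to the field name frames would overwrite the field itself, so such a key should be avoided.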
Upvotes: 1