Reputation: 1527
I have a machine learning algorithm which involves a series of steps, such as cleaning data, preparing training data etc. Each step is stored in a separate method of a Python class. I'm wondering what the best-practice way is to structure my class so that the steps are automatically executed upon class instantiation.
Here's what I've done (the code is illustrative, but this approach works on the real algorithm). It feels a little clunky. Is there a more elegant way?
class Kaggle:
    """
    An algorithm
    """

    def __init__(self):
        self.bar = 1

    def step_one(self, some_text_data=None):
        self.bar = 1 ** 2
        # Do some data cleaning
        # return processed data
        return some_text_data

    def step_two(self, baz=None):
        foo = self.step_one(baz)
        # do some more processing
        return foo

    def step_three(self):
        bar = self.step_two()
        # output results
        return bar

    def run(self):
        self.step_one()
        self.step_two()
        self.step_three()


if __name__ == "__main__":
    kaggle = Kaggle()
    kaggle.run()
Upvotes: 6
Views: 10850
Reputation: 43
I was working with dataclasses and found this to work as well:
from dataclasses import dataclass


@dataclass
class MyClass:
    var1: str
    var2: str

    def __post_init__(self):
        # Called automatically by the generated __init__
        self.func1()
        self.func2()

    def func1(self):
        print(self.var1)

    def func2(self):
        print(self.var2)


a = MyClass('Hello', 'World!')

Output:
Hello
World!
Upvotes: 0
Reputation: 109528
If your goal is for the object to be "automatically executed upon class instantiation", just put self.run() in __init__:
def __init__(self):
    self.bar = 1
    self.run()
As an aside, one should try to keep the __init__ method lightweight and just use it to instantiate the object. Although "clunky", your original Kaggle class is how I would design it (i.e. instantiate the object and then have a separate run method to run your pipeline). I might rename run to run_pipeline for better readability, but everything else looks good to me.
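A minimal sketch of that layout, assuming a hypothetical Pipeline class whose steps just transform a list of strings. The step bodies and the from_data helper are illustrative and not taken from the question:

```python
class Pipeline:
    """Illustrative stand-in for the Kaggle class; the step bodies are placeholders."""

    def __init__(self, raw_data):
        # Keep __init__ cheap: only store state, do no processing here.
        self.raw_data = raw_data
        self.results = None

    def clean(self, rows):
        # Hypothetical cleaning step: strip whitespace and drop empty rows.
        return [row.strip() for row in rows if row.strip()]

    def prepare(self, rows):
        # Hypothetical preparation step: lowercase everything.
        return [row.lower() for row in rows]

    def run_pipeline(self):
        # Each step receives the previous step's output explicitly.
        cleaned = self.clean(self.raw_data)
        self.results = self.prepare(cleaned)
        return self.results

    @classmethod
    def from_data(cls, raw_data):
        # Optional convenience constructor: build the object and run it in one call.
        pipeline = cls(raw_data)
        pipeline.run_pipeline()
        return pipeline


pipeline = Pipeline.from_data(["  Hello ", "", "World  "])
print(pipeline.results)
```

Passing each step's output to the next one explicitly keeps the steps testable on their own, and the classmethod gives the one-line "construct and run" convenience without putting the work in __init__.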
Upvotes: 11
Reputation: 21609
Put all the calls in your __init__ method. Is this not what you wanted to achieve? You could add a flag with a default value that lets you skip running the steps if you want.
def __init__(self, runtests=True):
    self.bar = 1
    if runtests:
        self.step_one()
        self.step_two()
        self.step_three()
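To show how the flag behaves, here is a small sketch assuming a hypothetical Steps class that records which steps ran. The log attribute and the step bodies are made up for the demonstration:

```python
class Steps:
    """Hypothetical class used only to show the effect of the flag."""

    def __init__(self, runtests=True):
        self.log = []  # records the steps that have executed
        if runtests:
            self.step_one()
            self.step_two()
            self.step_three()

    def step_one(self):
        self.log.append("one")

    def step_two(self):
        self.log.append("two")

    def step_three(self):
        self.log.append("three")


eager = Steps()               # runs all steps during construction
lazy = Steps(runtests=False)  # builds the object without running anything
print(eager.log)
print(lazy.log)
```

With the default, construction runs the whole sequence. Passing runtests=False leaves the object untouched so the steps can be called individually later.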
Upvotes: 3