Reputation: 25
I have a luigi task that reads a .sql file and outputs to BigQuery.
Is there any way to reuse that same task with a different .sql file without copying the whole luigi task? In other words, I want to create instances of a template luigi task.
import luigi

class run_sql(luigi.Task):
    sql_file = 'path/to/sql/file'  # This is the only bit of code that changes

    def complete(self):
        ...

    def requires(self):
        ...

    def run(self):
        ...
Upvotes: 1
Views: 369
Reputation: 408
Building off of @matagus' answer, you can also subclass RunSql to define a sql file, using the complete(), requires(), and run() methods of the parent class.
class RunSqlFile(RunSql):
    sql_file = '/path/to/file.sql'
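If there are many .sql files, one-line subclasses like the one above could also be generated instead of typed out by hand. Here is a minimal sketch of that idea. Note that make_sql_task is a hypothetical helper, not part of luigi, and RunSqlBase is a plain stand-in class so the snippet runs without luigi installed. In practice you would pass your own RunSql task as the base. I have not tested this against luigi itself.

```python
import os


class RunSqlBase:
    """Stand-in for the RunSql task (hypothetical, keeps the sketch luigi-free)."""

    sql_file = None  # each generated subclass overrides this

    def run(self):
        # The real task would read the file and send the query to BigQuery.
        return "would run " + self.sql_file


def make_sql_task(base, sql_file):
    """Create a subclass of `base` whose only difference is `sql_file`."""
    # Derive the class name from the file name:
    # /path/to/daily_users.sql -> RunSql_daily_users
    stem = os.path.splitext(os.path.basename(sql_file))[0]
    return type("RunSql_" + stem, (base,), {"sql_file": sql_file})


DailyUsers = make_sql_task(RunSqlBase, "/path/to/daily_users.sql")
```

Each generated class gets its own name, which matters because luigi tells tasks apart by class name and parameters.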
Or you can use the @property decorator to reference attributes of the RunSql class. I often do this to set a directory, or other configuration data, in the parent class, then reference them in subclasses.
import os
import luigi

class RunSql(luigi.Task):
    sql_file = luigi.Parameter()

    def get_file(self, name):
        # Resolve a file name against the shared SQL directory
        default_dir = '/path/to/sql/dir'
        return os.path.join(default_dir, name)

    def requires(self):
        ...

class RunSqlFile(RunSql):
    @property
    def sql_file(self):
        return self.get_file("query.sql")
And that will act as if you'd instantiated the class with --sql-file /path/to/sql/dir/query.sql
Upvotes: 1
Reputation: 6206
Just use a parameter to specify the path to the file. Something like this:
import luigi

class RunSql(luigi.Task):
    sql_file = luigi.Parameter()

    def complete(self):
        ...

    def requires(self):
        ...

    def run(self):
        ...
To access the value of the parameter, just use self.sql_file in your code.
After that you can run your task like this, where my_tasks is a placeholder for the Python module that defines RunSql:
luigi --module my_tasks RunSql --sql-file path/to/file.sql
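If you want to run the task once per .sql file in a directory, the command lines can be built in a loop. A rough sketch, assuming the task lives in a module called my_tasks (a placeholder) and that luigi_commands is a helper name I made up for illustration. It only builds the argument lists and does not run anything.

```python
import glob
import os


def luigi_commands(sql_dir, module="my_tasks", task="RunSql"):
    """Return one luigi command line (as an argument list) per .sql file in sql_dir."""
    # Sort the paths so the order of the commands is predictable.
    paths = sorted(glob.glob(os.path.join(sql_dir, "*.sql")))
    return [
        ["luigi", "--module", module, task, "--sql-file", path]
        for path in paths
    ]
```

Each list can then be handed to subprocess.run(cmd, check=True), or printed first to check what would be executed.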
Upvotes: 1