Reputation: 187
Let's say I have 5 tasks for 5 different periods, each outputting one Excel file. I then need to merge these 5 output files in a new task. If one of the tasks has not completed, I still want the remaining 4 files to be merged into a single file. Is there a way of doing this in Luigi? Here is some sample code that might help clarify the question:
import luigi

class MakeFile(luigi.Task):
    period = luigi.Parameter()

    def run(self):
        return cleaned_file

class MergeFiles(luigi.Task):
    def requires(self):
        periods = ...  # multiple periods
        for period in periods:
            yield MakeFile(period)

    def run(self):
        pass  # merge files here
Upvotes: 0
Views: 187
Reputation: 2447
To do what you want, you can write nothing to your output. Luigi considers a task complete when every target returned by the task's output() method exists. So a task that has nothing to produce could just open and close its Excel file without writing anything, and MergeFiles could then test whether each input file is empty and skip the ones that are.
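As a rough illustration of that idea, here is a minimal standard-library sketch. The helper names write_placeholder and non_empty_paths are hypothetical, and it assumes that "empty" means a zero-byte file on local disk. It does not use Luigi itself; it only shows the placeholder and the emptiness test that MergeFiles could apply to its input paths.

```python
import os


def write_placeholder(path):
    """Create a zero-byte file so the target exists but holds no data."""
    # Opening in write mode and closing immediately leaves an empty file.
    with open(path, "wb"):
        pass


def non_empty_paths(paths):
    """Return only the paths that exist and contain at least one byte."""
    return [p for p in paths if os.path.isfile(p) and os.path.getsize(p) > 0]
```

In MergeFiles, the list passed to non_empty_paths could be built from the path attribute of each input target, assuming the inputs are local file targets.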
Beyond that, you have made a couple of mistakes in your current classes.
In MakeFile, run() should not return anything. Instead, define an output() method that returns the task's targets, and write the file to that target inside run(). See https://luigi.readthedocs.io/en/stable/tasks.html#task-output for more details.
In the requires() method of MergeFiles, there is no need to yield. Using yield is intended for run(), where it lets a running task dynamically require additional tasks. If that is actually what you need, you can read more here: https://luigi.readthedocs.io/en/stable/tasks.html#dynamic-dependencies. I think you should just use return [MakeFile(period) for period in periods] in requires(). You can then access the outputs of those tasks in run() through self.input().
Upvotes: 1