Reputation: 800
Every Luigi task has, among others, two basic methods: requires and output.
The documentation is not clear on the following: what happens to the tasks defined under requires if the output is already there? To put it more concretely, let's create a dummy task:
import luigi


class DummyTask(luigi.Task):
    def run(self):
        pass

    def requires(self):
        # TaskA and TaskB stand for arbitrary upstream tasks
        yield TaskA()
        yield TaskB()

    def output(self):
        return luigi.LocalTarget('foo.txt')
Will Luigi check for the existence of foo.txt first and only then proceed with fulfilling the tasks under requires, or will it first fulfill all the tasks under requires and then check whether the output exists, so that it can actually execute the run method?
Upvotes: 0
Views: 1148
Reputation: 800
The answer is that Luigi will NOT run any of the tasks under requires if it finds that the task's output target already exists. I created the following snippet, which showcases that behavior:
import logging
from pathlib import Path

import luigi
from luigi import Task


class Dummy(Task):
    logger = logging.getLogger('Dummy')
    output_path = '/tmp/dummy_task_output.txt'

    def run(self):
        self.logger.info('Running dummy task')
        # Create the output file so the task counts as complete next time
        Path(self.output_path).touch()

    def requires(self):
        yield TaskA()
        yield TaskB()

    def output(self):
        return luigi.LocalTarget(self.output_path)


class TaskA(Task):
    logger = logging.getLogger('TaskA')
    # No output target: completeness is tracked in memory only,
    # so this task is incomplete again in every new process
    task_complete = False

    def run(self):
        self.logger.info('Running TaskA')
        self.task_complete = True

    def complete(self):
        return self.task_complete


class TaskB(Task):
    logger = logging.getLogger('TaskB')
    # Same in-memory completeness flag as TaskA
    task_complete = False

    def run(self):
        self.logger.info('Running TaskB')
        self.task_complete = True

    def complete(self):
        return self.task_complete
Save it as dependency_test.py and run it with:
PYTHONPATH='.' luigi --module dependency_test Dummy --local-scheduler
The output on the first run:
===== Luigi Execution Summary =====
Scheduled 3 tasks of which:
* 3 ran successfully:
- 1 Dummy()
- 1 TaskA()
- 1 TaskB()
This progress looks :) because there were no failed tasks or missing dependencies
===== Luigi Execution Summary =====
The output on the second run, now that /tmp/dummy_task_output.txt exists:
===== Luigi Execution Summary =====
Scheduled 1 tasks of which:
* 1 complete ones were encountered:
- 1 Dummy()
Did not run any tasks
This progress looks :) because there were no failed tasks or missing dependencies
===== Luigi Execution Summary =====
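The two summaries above are consistent with a traversal that checks a task's completeness before looking at its dependencies. Below is a minimal sketch of that idea in plain Python. It is a toy model, not Luigi's actual implementation: ToyTask and schedule are hypothetical names introduced only for illustration, and the done flag stands in for Luigi's complete() check (which, by default, returns True when all of a task's outputs exist).

```python
# Toy model of the observed behavior; NOT Luigi's real scheduling code.
# ToyTask and schedule() are hypothetical names used for illustration only.


class ToyTask:
    def __init__(self, name, done=False, deps=()):
        self.name = name
        self.done = done          # stands in for "the output target exists"
        self.deps = list(deps)    # stands in for what requires() yields

    def complete(self):
        return self.done


def schedule(task, scheduled=None):
    """Return the names of tasks that would have to run, dependencies first."""
    if scheduled is None:
        scheduled = []
    if task.complete():
        # A complete task is skipped and its dependencies are never visited.
        return scheduled
    for dep in task.deps:
        schedule(dep, scheduled)
    if task.name not in scheduled:
        scheduled.append(task.name)
    return scheduled


# First run: nothing exists yet, so all three tasks are scheduled.
first_run = schedule(ToyTask("Dummy", deps=[ToyTask("TaskA"), ToyTask("TaskB")]))

# Second run: Dummy's output exists, so TaskA and TaskB are not even considered.
second_run = schedule(
    ToyTask("Dummy", done=True, deps=[ToyTask("TaskA"), ToyTask("TaskB")])
)

print(first_run)   # ['TaskA', 'TaskB', 'Dummy']
print(second_run)  # []
```

Note that in the second case TaskA and TaskB are still incomplete, yet they are not scheduled, which matches the "Scheduled 1 tasks" line in the second summary.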
Upvotes: 1