user443854

Reputation: 7455

How do I create a Luigi dependency graph without running anything?

Use case: some tasks are long batch jobs that take hours. I need to review what completed and what failed for a given date before deciding which date to rerun first.

How can I view the dependency graph generated by the central scheduler without running anything? I realize that I can simply rerun the graph for a given date and (assuming nothing has changed) it will fail at exactly the same place(s) as the last run, so I will be able to see the graph in the scheduler. But suppose a task takes a long time before it fails. Is there something like a --dry-run argument?

I could also create an empty "toggle switch" task that fails or completes based on an input argument. However, I would need to remember to make every task depend on it, which is easy to overlook (subclassing can solve that) and also creates clutter.

Any better options to consider?

Edit:

It looks like I can get what I need by setting --workers=0 when calling luigi. This results in the following message:

Did not run any tasks
This progress looks :| because there were tasks that were not granted run permission by the scheduler

Nothing was run, and I get my graph. Seems like a useful hack to document here.
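For checking several dates in a row, the same invocation can be generated from a small script. This is only a sketch: the module name my_pipeline, the task name DailyReport and its --date parameter are hypothetical placeholders, and I am assuming the command is run without --local-scheduler so that the graph is registered with the central scheduler.

```python
import shlex


def dry_run_command(module, task, date):
    """Return the argv that schedules `task` for `date` with zero workers."""
    return [
        "luigi",
        "--module", module,  # hypothetical module that defines the task
        task,
        "--date", date,      # hypothetical task parameter
        "--workers=0",       # the hack from above: schedule, but run nothing
    ]


if __name__ == "__main__":
    # Print one command per date; nothing is executed here.
    for day in ("2019-03-01", "2019-03-02"):
        print(shlex.join(dry_run_command("my_pipeline", "DailyReport", day)))
```

Each list can be passed to subprocess.run() as is; the printed form is just for copying into a shell.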

Upvotes: 5

Views: 1581

Answers (1)

dlstadther

Reputation: 405

Running with --workers=0 works for normal dependency graphs. However, if you use dynamic dependencies, those graph nodes will be ignored (dynamic dependencies are Tasks yielded within a run() method).

A future alternative is a WIP PR for a Static DAG Visualizer. However, I suspect that dynamic dependencies will still be ignored, since discovering them requires executing run().
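To illustrate why anything that avoids calling run() cannot see dynamic dependencies, here is a minimal sketch of a static walk. It deliberately does not import luigi: FakeTask, Dynamic and static_status are hypothetical stand-ins that only mimic the requires() / complete() / run() shape of a Task, and I am assuming requires() returns a flat list (real tasks may return a single task or a dict, which luigi.task.flatten normalizes).

```python
class FakeTask:
    """Stand-in for a Luigi task: only requires(), complete() and run()."""

    def __init__(self, name, deps=(), done=False):
        self.name = name
        self._deps = list(deps)
        self.done = done

    def requires(self):
        # Static dependencies: known without executing anything.
        return self._deps

    def complete(self):
        return self.done

    def run(self):
        ...


class Dynamic(FakeTask):
    def run(self):
        # A dynamic dependency only comes into existence while run() executes.
        yield FakeTask("discovered_at_runtime")


def static_status(task, seen=None):
    """Depth-first walk over requires(); returns {task name: complete?}."""
    if seen is None:
        seen = {}
    if task.name in seen:
        return seen
    seen[task.name] = task.complete()
    for dep in task.requires():
        static_status(dep, seen)
    return seen


root = Dynamic("report", deps=[FakeTask("extract", done=True)])
status = static_status(root)
print(status)
```

The walk reports "report" and "extract" but never "discovered_at_runtime", because that task is only created inside run(), which the walk does not call.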

Upvotes: 3
