Reputation: 1
I have a main Snakefile and several subworkflows running in independent subdirectories (with paths relative to their own directories). I've noticed that if I modify one of the input of a subworkflow, it will rerun correctly but all the following rules that come afterwards are not rerun.
If I understand correctly what is going on, there's a different DAG for the main Snakefile and for each subworkflow. The main DAG is not aware of any modification in a subworkflow and therefore won't trigger a rerun since the output of the subworkflow hasn't been modified yet.
I'd like that all the rules depending of the output of a subworkflow are rerun if there's a modification in that subworkflow. Isn't that what the default behaviour should be ?
I've also tried the other modularisation techniques. Using includes works but is super annoying because I have to modify all the paths to be relative to the main directory (and therefore I can't run snakemake independently in one subdirectory anymore). I've also tried using the new module system coming with snakemake v.6 that is supposed to be replacing subworkflows. Maybe I don't use it correctly, but it doesn't seem to work for my use case. If I import a rule from a subdirectory it complains that there are missing inputs. It doesn't find the scripts because they are in the subdirectory and not in the main directory. So in that sense it works more like an include than a subworkflow.
Do you have any idea on how to solve my issue ?
Here's a small working example with the module implementation:
MainDirectory
| - Snakefile
rule all:
input: "Subdirectory/file.txt"
module other_workflow:
snakefile: "Subdirectory/Snakefile"
use rule * from other_workflow as other_*
| - Subdirectory
| | - Snakefile
rule rule_a:
input:
script = 'code.py'
output: 'file.txt'
shell: 'python {input.script}'
| | - code.py
with open('file.txt', 'w') as f:
print('This is a test.', file=f)
This doesn't work as the snakefile in the main directory uses all the rules in the same workdir, whereas I would like it to be running the imported rules in their own workdir. I can make it work by modifying all the relative paths in the subdirectory but that's not what I want. I want to be able to run it without modifications.
Upvotes: 0
Views: 581
Reputation: 1927
The issue is that you mix input files with code here. If the script code.py
is defined as an input file, Snakemake expects it to be in the workdir. If you'd use the script directive or the Jupyter notebook directive instead, the path will be automatically relative to the Snakefile. If that should be not an option for whatever reason, you can instead build the path relative to the current Snakefile via Path(workflow.snakefile).parent / "code.py"
.
Note, there is really no reason to register code as input files. If you intend to get a rerun upon changes in the code, it is better to rely on snakemake --list-code-changes
. The reason Snakemake does not automatically trigger reruns upon code changes is that they can be just cosmetic (e.g. formatting). Hence, it is up to the dev to trigger the rerun, e.g. via --list-code-changes, or manually.
Upvotes: 1