Reputation: 1338
I have a workflow that produces tons of files, most of which are not the output of any rule (they are intermediate results). I'd like to have the option of deleting everything that is not the output of any rule after the workflow is complete. This would be useful for archiving.
Right now the only way I found to do that is to define all outputs of all rules as protected, and then run snakemake --delete-all-output. Two questions:
1. Is this the way to go, or is there a better solution?
2. Is there a way to automatically define all outputs as protected, or do I have to go through the entire code and wrap all outputs with protected()?
Thanks!
Upvotes: 3
Views: 1611
Reputation: 2881
In addition to @dariober's suggestion, here are a few ideas:
1. Mark the unneeded files as temp(), which will cause Snakemake to delete them automatically. You can combine this with --notemp for debugging. With temp(), deletion will happen progressively, not after the workflow is complete.
2. Use the onsuccess hook defined by Snakemake. From the docs, "The onsuccess handler is executed if the workflow finished without error." So if, throughout the workflow, you put unneeded files in a temp/ folder or similar, you could use shutil.rmtree("temp") in onsuccess, which would delete all your unneeded files only after the workflow finished successfully, as you require. (Note also the similar onerror, should you need it.)
Upvotes: 2
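The onsuccess cleanup described in this answer could be sketched roughly as follows. This is only a minimal illustration: the helper name remove_scratch_dir and the default folder name "temp" are assumptions made for the example, not anything Snakemake defines.

```python
import shutil
from pathlib import Path


def remove_scratch_dir(scratch_dir="temp"):
    """Delete the scratch directory and everything in it.

    Returns True if the directory existed and was removed, False otherwise.
    The name and default value are hypothetical, chosen for this sketch.
    """
    path = Path(scratch_dir)
    if not path.is_dir():
        # Nothing to clean up (e.g. the workflow produced no scratch files).
        return False
    shutil.rmtree(path)
    return True


# In a Snakefile, the helper could be wired in roughly like this
# (Snakefile syntax, shown as a comment so this block stays plain Python):
#
# onsuccess:
#     remove_scratch_dir("temp")
```

Checking that the directory exists first avoids an exception from shutil.rmtree when a run produced no scratch files at all.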
Reputation: 9062
Maybe the option --list-untracked helps?
--list-untracked, --lu
List all files in the working directory that are not
used in the workflow. This can be used e.g. for
identifying leftover files. Hidden files and
directories are ignored.
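If the listing looks right, one possible follow-up is to delete the listed files with a small script. The sketch below assumes you have already captured the output of --list-untracked as text with one relative path per line; how to capture it (which stream Snakemake writes to, and whether other log lines are mixed in) may depend on the Snakemake version and is not covered here. The function name delete_untracked and its parameters are hypothetical.

```python
from pathlib import Path


def delete_untracked(listing, workdir=".", dry_run=True):
    """Delete the files named in `listing` (one path per line) under workdir.

    Returns the paths that were deleted, or that would be deleted when
    dry_run is True. Lines that do not name an existing regular file inside
    workdir are skipped.
    """
    root = Path(workdir).resolve()
    selected = []
    for line in listing.splitlines():
        name = line.strip()
        if not name:
            continue
        target = (root / name).resolve()
        # Skip anything outside workdir or that is not a regular file.
        if root not in target.parents or not target.is_file():
            continue
        selected.append(target)
        if not dry_run:
            target.unlink()
    return selected
```

Running it with dry_run=True first and reviewing the result is advisable, since the list may include files you want to keep even though no rule uses them.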
Upvotes: 3