Reputation: 11
I'm new to Snakemake (I started in the last week or so) and am trying it out so that I have to handle fewer of the small details in my workflows. Previously I coded up my own specific workflow in Python.
I wrote a small workflow that, among other steps, takes Illumina PE reads and runs Kraken against them. If a species value isn't provided, it then parses the Kraken output to detect the most common species (within a set of allowed species). I run it with snakemake -s test.snake --config R1_reads= R2_reads= species=''.
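To make the species detection step concrete, here is a minimal sketch of what I mean by picking the most common allowed species. It assumes the input is a tab-separated, kraken-report style table (percentage, clade reads, direct reads, rank code, taxonomy ID, indented name); the function name most_common_species and the example species are made up for illustration, so adjust the column positions if your Kraken output differs.

```python
def most_common_species(report_lines, allowed):
    """Return the allowed species with the most clade reads, or None.

    Assumes kraken-report style rows with six tab-separated columns:
    percent, clade reads, direct reads, rank code, taxid, indented name.
    """
    best_name, best_reads = None, -1
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        rank = fields[3].strip()
        name = fields[5].strip()  # names are indented by taxonomic depth
        if rank != "S" or name not in allowed:
            continue  # only species-level rows from the allowed set
        reads = int(fields[1])
        if reads > best_reads:
            best_name, best_reads = name, reads
    return best_name


if __name__ == "__main__":
    # Hypothetical report rows, for illustration only.
    example = [
        " 60.00\t600\t0\tG\t561\t    Escherichia\n",
        " 55.00\t550\t550\tS\t562\t      Escherichia coli\n",
        " 30.00\t300\t300\tS\t28901\t      Salmonella enterica\n",
    ]
    print(most_common_species(example, {"Escherichia coli", "Salmonella enterica"}))
```

Returning None when nothing in the allowed set is found leaves it to the caller to decide whether that should stop the workflow.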
I have 2 questions.
Currently my strategy for this is to create a temp file which contains the detected species and then substitute it into other shell commands with cat {input.species}. This doesn't seem elegant, but looking through the docs I couldn't find an adequate alternative. I noticed that PersistentDicts would let me pass variables between run: commands, but I'm unsure whether I can use that to load variables into a shell: section. I also noticed that wrappers could allow me to handle it; however, from the point where I need that variable onward, I'd be wrapping the remainder of my workflow.
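For reference, the temp-file approach boils down to something like the sketch below, written as plain Python rather than as a Snakemake rule. The helper names read_species and build_command, the file layout (one species name on the first non-empty line), and the mytool command are all assumptions for illustration. As far as I understand, a value read this way inside a run: block could be interpolated into a shell() call, but I haven't confirmed that this is the intended way to do it.

```python
import shlex
from pathlib import Path


def read_species(path):
    """Return the first non-empty line of the species file, stripped."""
    for line in Path(path).read_text().splitlines():
        if line.strip():
            return line.strip()
    raise ValueError(f"no species found in {path}")


def build_command(template, species):
    """Fill a command template, quoting the species for the shell."""
    # shlex.quote keeps names containing spaces as a single argument.
    return template.format(species=shlex.quote(species))


if __name__ == "__main__":
    import tempfile

    # Hypothetical species file and command, for illustration only.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write("Escherichia coli\n")
    species = read_species(fh.name)
    print(build_command("mytool --species {species}", species))
```

Quoting matters here because species names usually contain a space, which an unquoted cat substitution would split into two arguments.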
Right now my impression is that the way to solve this problem is to have a separate workflow file for each species, plus a run step with a switch that calls the associated species workflow depending on the species.
Appreciate any insight on these questions.
-Kim
Upvotes: 1
Views: 381
Reputation: 1927
You can mark output as dynamic (e.g. expecting one file per species). Then, Snakemake will determine the downstream DAG of jobs after those files have been generated. See http://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#dynamic-files
Upvotes: 0