Reputation: 377
I want to run multiple snakefiles called qc.smk, dada2.smk, and picrust2.smk using singularity. There is also one snakefile called longitudinal.smk that I would like to run conditionally, for example only if longitudinal data is being used.
# set vars
LONGITUDINAL = config['perform_longitudinal']
rule all:
    input:
        # fastqc output before trimming
        raw_html = expand("{scratch}/fastqc/{sample}_{num}_fastqc.html", scratch = SCRATCH, sample=SAMPLE_SET, num=SET_NUMS),
        raw_zip = expand("{scratch}/fastqc/{sample}_{num}_fastqc.zip", scratch = SCRATCH, sample=SAMPLE_SET, num=SET_NUMS),
        raw_multi_html = SCRATCH + "/fastqc/raw_multiqc.html",
        raw_multi_stats = SCRATCH + "/fastqc/raw_multiqc_general_stats.txt"
        # there are many more files in rule all
##### setup singularity #####
singularity: "docker://continuumio/miniconda3"
##### load rules #####
include: "rules/qc.smk"
include: "rules/dada2.smk"
include: "rules/phylogeny.smk"
include: "rules/picrust2.smk"
if LONGITUDINAL == 'yes':
    include: 'rules/longitudinal.smk'
    print("Will perform a longitudinal analysis")
else:
    print("no longitudinal analysis")
The code above works only if I am running a longitudinal dataset. When I am not running the longitudinal analysis, snakemake fails with an error like:
MissingInputException in line 70 of /mnt/c/Users/noahs/projects/tagseq-qiime2-snakemake-1/Snakefile:
Missing input files for rule all:
I think that if I could add a conditional statement to rule all, similar to the one I have for the external snakefile, snakemake would stop complaining when the longitudinal snakefile is not included.
Upvotes: 2
Views: 5432
Reputation: 377
Solution for merging lists from expand statements:
I used a configuration file to pass the statements to the Snakefile
## config.yaml ##
# longitudinal analysis
perform_longitudinal: 'yes' # yes for longitudinal analysis
When 'yes' is entered in the configuration, Snakemake will include additional files in rule all and run an additional Snakefile to generate them. There ended up being multiple Snakefiles, so the single rule all in the main Snakefile lists the target files for all 6 of them, and the global singularity directive sets the container used by every job.
## Snakefile ##
configfile: "config.yaml"
LONGITUDINAL = config['perform_longitudinal']
# rule all input files (placeholder paths)
raw_html = "file.txt"
raw_zip = "file.txt"
raw_multi_html = "file.txt"
raw_multi_stats = "file.txt"
longitudinal_analysis_files = "file.txt"
# rule all files excluding longitudinal analysis
# (the list holds the variables themselves, not their names as strings)
rule_all_input_list = [raw_html, raw_zip, raw_multi_html, raw_multi_stats]
# longitudinal analysis files
rule_all_longitudinal_input = [longitudinal_analysis_files]
if LONGITUDINAL == 'yes':
    rule_all_input_list.extend(rule_all_longitudinal_input)
    # conditionally add Snakefile to workflow
    include: 'rules/longitudinal.smk'
    print("Will perform a longitudinal analysis")
else:
    print("no longitudinal analysis")
rule all:
    input:
        data = rule_all_input_list
##### setup singularity #####
# this container defines the underlying OS for each job when using the workflow
# with --use-conda --use-singularity
singularity: "docker://continuumio/miniconda3"
##### load rules #####
include: "rules/qc.smk"
include: "rules/dada2.smk"
include: "rules/phylogeny.smk"
include: "rules/picrust2.smk"
include: "rules/differential.smk"
I have a less simplified version of how I got this working on GitHub https://github.com/nasiegel88/tagseq-qiime2-snakemake-1
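The conditional-target logic above can also be factored into a plain Python helper, which makes it easy to check outside Snakemake. This is only a minimal sketch: the names `is_enabled` and `build_targets` and the file paths are hypothetical and do not come from the linked repository.

```python
def is_enabled(value):
    """Interpret a config flag; accepts booleans and strings such as 'yes'/'no'."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"yes", "true", "1"}


def build_targets(config, base_targets, longitudinal_targets):
    """Return the list of files that rule all should request."""
    targets = list(base_targets)  # copy so the caller's list is not mutated
    if is_enabled(config.get("perform_longitudinal", "no")):
        targets.extend(longitudinal_targets)
    return targets


# Placeholder paths for illustration only (hypothetical).
BASE = ["fastqc/raw_multiqc.html", "fastqc/raw_multiqc_general_stats.txt"]
LONGITUDINAL_FILES = ["longitudinal/placeholder_output.txt"]

with_longitudinal = build_targets({"perform_longitudinal": "yes"}, BASE, LONGITUDINAL_FILES)
without_longitudinal = build_targets({"perform_longitudinal": "no"}, BASE, LONGITUDINAL_FILES)
```

In a Snakefile, the returned list could be passed straight to the input of rule all. Accepting both strings and booleans is a defensive choice: an unquoted yes in a YAML config is typically loaded as a boolean rather than the string 'yes', so a strict comparison against 'yes' would silently skip the longitudinal branch.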
Upvotes: 1
Reputation: 3701
You can define a list (or dict) of what you want as output outside of the rule all, and feed that to the input. Something like this works:
myoutput = list()
if condition_1 == True:
    myoutput.append("file_1.txt")
if condition_2 == True:
    myoutput.append("file_2.txt")
rule all:
    input:
        myoutput
edit:
Either place myoutput first in the input of rule all:
rule all:
    input:
        myoutput,
        raw_html = "raw_html_path",
        raw_zip = "raw_zip_path"
or make it named, and place it wherever:
rule all:
    input:
        raw_html = "raw_html_path",
        myoutput = myoutput,
        raw_zip = "raw_zip_path"
In Python (and snakemake), positional (unnamed) arguments always go before named (keyword) arguments.
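The ordering rule can be checked in plain Python. The sketch below uses a hypothetical stand-in function, `collect_inputs`, to mimic an input section. It is not Snakemake's actual parser, and it only illustrates the Python rule that the answer says Snakemake follows.

```python
def collect_inputs(*args, **kwargs):
    """Hypothetical stand-in: gather unnamed and named input entries."""
    return list(args), dict(kwargs)


myoutput = ["file_1.txt", "file_2.txt"]

# Unnamed (positional) entry first, named entries after: valid.
positional, named = collect_inputs(
    myoutput, raw_html="raw_html_path", raw_zip="raw_zip_path"
)

# An unnamed entry placed after a named one is rejected by the Python parser,
# so the offending call is compiled from a string to capture the error.
try:
    compile("collect_inputs(raw_html='raw_html_path', myoutput)", "<example>", "eval")
    order_error = None
except SyntaxError as err:
    order_error = err.msg
```

Naming the entry (`myoutput = myoutput`) avoids the problem, because every entry is then a keyword argument and the order no longer matters.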
Upvotes: 4