Reputation: 41
I have an issue with Snakemake v5.26.1 and I am not sure whether this is a bug or whether I am doing something wrong.
I have the following rule:
rule multiqc:
    input:
        expand('results/{project}/fastqc/{sample}/{sample}_R{idx}_fastqc.{ext}', project=PROJECT, sample=SAMPLES, idx=[1,2], ext=['html', 'zip'])
    output:
        html = 'results/{project}/fastqc/multiqc_report.html'
    params:
        out = lambda wildcards, output: os.path.dirname(output['html'])
    conda:
        'envs/multiqc.yaml'
    shell:
        'multiqc --force -o {params.out} {params.out}'
where multiqc.yaml is the specification of a conda environment with multiqc installed.
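To make the rule's paths concrete, here is a minimal plain-Python sketch of what the expand(...) call and the params lambda evaluate to. The values of PROJECT and SAMPLES are hypothetical (the question does not show them), and expand_like is an illustrative helper that only imitates the default all-combinations behaviour of Snakemake's expand; it is not Snakemake's implementation.

```python
import os
from itertools import product

# Hypothetical values; the question does not show PROJECT or SAMPLES.
PROJECT = "demo"
SAMPLES = ["s1", "s2"]

PATTERN = "results/{project}/fastqc/{sample}/{sample}_R{idx}_fastqc.{ext}"


def expand_like(pattern, **wildcards):
    """Imitate expand(): fill the pattern with every combination of values."""
    keys = list(wildcards)
    # Wrap single strings in a list so they are not iterated character by character.
    values = [[v] if isinstance(v, str) else list(v) for v in wildcards.values()]
    return [pattern.format(**dict(zip(keys, combo))) for combo in product(*values)]


inputs = expand_like(PATTERN, project=PROJECT, sample=SAMPLES, idx=[1, 2], ext=["html", "zip"])

# The output path for this project, and the directory that params.out resolves to.
report = "results/{project}/fastqc/multiqc_report.html".format(project=PROJECT)
out_dir = os.path.dirname(report)

print(len(inputs))  # 1 project x 2 samples x 2 read indices x 2 extensions
print(inputs[0])
print(out_dir)
```

With these assumed values, params.out is the parent directory of every per-sample FastQC folder, which is why the shell command can pass the same directory to multiqc both as the output location and as the directory to scan.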
When I execute the pipeline locally using, for example, snakemake --profile profiles/local/, everything works as expected. This is also the case when I execute the code on the cloud using tibanna: snakemake --profile profiles/aws/. So far, so good.
The above rule is very lightweight and it makes sense to execute it locally, so I have the declaration localrules: all, multiqc somewhere in my Snakefile, and this is where the problems begin. In this case, when I execute the code in the cloud, it seems that the conda environment defined in multiqc.yaml is not automatically installed or activated, and I get a command 'multiqc' not found error. The hack I am currently using to bypass this problem is to install multiqc locally, but this solution is ugly and it affects the portability of the pipeline. I suspect I could solve the problem by using a singularity image with multiqc installed, but the most elegant solution would still be to somehow make the conda: 'envs/multiqc.yaml' part of the rule work.
Is the above behaviour expected or is it a bug (either in my code or in snakemake)? What would be the best solution to resolve this issue (using a singularity image, maybe)?
Many thanks in advance.
Upvotes: 4
Views: 747
Reputation: 9062
A couple of thoughts:
For the conda environment to be activated, you need to pass the option --use-conda to snakemake, i.e. snakemake --use-conda ...
Do you really need to execute multiqc in its own conda environment? If you can install and run multiqc locally without clashes with the other snakemake rules, I suspect you don't need a dedicated environment for it.
If all of the above fails, I would run multiqc as a cluster job rather than as a local rule, to avoid the convoluted workarounds you mention.
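The first suggestion can be sketched as a small Python wrapper that always adds --use-conda to the invocation. This is only an illustration: snakemake_command is a hypothetical helper, the profile path is the one from the question, and the only Snakemake options used are --profile and --use-conda. Whether this is enough to fix the local-rule case under tibanna is not confirmed in this thread.

```python
import shlex


def snakemake_command(profile, use_conda=True, extra=()):
    """Build the argument list for a snakemake invocation (hypothetical helper)."""
    cmd = ["snakemake", "--profile", profile]
    if use_conda:
        # Without this flag, snakemake ignores the conda: directive of a rule.
        cmd.append("--use-conda")
    cmd.extend(extra)
    return cmd


cmd = snakemake_command("profiles/aws/")
print(shlex.join(cmd))  # snakemake --profile profiles/aws/ --use-conda
```

To actually launch the pipeline, the list could be passed to subprocess.run(cmd, check=True). As an alternative to a wrapper, a profile's config.yaml can usually carry the same setting as use-conda: true, so that every run with that profile enables conda environments.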
Upvotes: 1