Dimitris

Reputation: 41

Conda environment is not activated in snakemake for rules declared as local

I have an issue with Snakemake v5.26.1 and I am not sure whether this is a bug or whether I am doing something wrong.

I have the following rule:

rule multiqc:
    input:
        expand('results/{project}/fastqc/{sample}/{sample}_R{idx}_fastqc.{ext}', project=PROJECT, sample=SAMPLES, idx=[1,2], ext=['html', 'zip'])
    output:
        html = 'results/{project}/fastqc/multiqc_report.html'
    params:
        out = lambda wildcards, output: os.path.dirname(output['html'])
    conda:
        'envs/multiqc.yaml'
    shell:
        'multiqc --force -o {params.out} {params.out}'

where multiqc.yaml is the specification of a conda environment with multiqc installed.
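
For illustration only, a minimal environment file of this kind could look as follows. The actual envs/multiqc.yaml is not reproduced here, so the channel list and the unpinned multiqc dependency are assumptions:

```yaml
# envs/multiqc.yaml (hypothetical contents; the real file is not shown)
channels:
  - conda-forge
  - bioconda
dependencies:
  - multiqc   # version left unpinned in this sketch
```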

When I execute the pipeline locally using, for example, snakemake --profile profiles/local/, everything works as expected. The same holds when I execute the code in the cloud using tibanna: snakemake --profile profiles/aws/. So far, so good.

The above rule is very lightweight, so it makes sense to execute it locally. I therefore have the declaration localrules: all, multiqc in my Snakefile, and this is where the problems begin. With this declaration, when I execute the code in the cloud, the conda environment defined in multiqc.yaml does not seem to be installed or activated automatically, and I get a "command 'multiqc' not found" error. My current workaround is to install multiqc locally, but this is ugly and hurts the portability of the pipeline. I suspect I could solve the problem with a singularity image that has multiqc installed, but the most elegant solution would still be to make the conda: 'envs/multiqc.yaml' part of the rule work.

Is the above behaviour expected, or is it a bug (either in my code or in snakemake)? What would be the best way to resolve this issue (a singularity image, maybe)?

Many thanks in advance.

Upvotes: 4

Views: 747

Answers (1)

dariober

Reputation: 9062

A couple of thoughts:

  • For the conda environment to be activated, you need to pass the option --use-conda to snakemake, i.e. snakemake --use-conda ...

  • Do you really need to execute multiqc in its own conda environment? If you can install and run multiqc locally without clashes with the other snakemake rules, I suspect you don't need a dedicated environment for it.

  • If all of the above fails, I would run multiqc as a cluster job rather than as a local rule, to avoid the convoluted solutions you mention.
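
Regarding the first point: instead of typing --use-conda on every invocation, the flag can also be set in the profile. Snakemake reads default values for its command-line options from a config.yaml inside the profile directory, with keys named after the long options. The contents of profiles/aws/ are not shown in the question, so the file below is an assumed sketch rather than the asker's actual profile:

```yaml
# profiles/aws/config.yaml (assumed file; the real profile is not shown in the question)
# Has the same effect as passing --use-conda on the command line.
use-conda: true
```

Whether this alone fixes the local-rule case under tibanna is not confirmed here; it only ensures that the flag is actually on for that profile.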

Upvotes: 1
