Assa Yeroslaviz

Reputation: 764

How to assign multiple paths from the config.yaml?

I would like to use snakemake to analyze my data sets. As I am going to work with different organisms, I would like snakemake to create a folder for each of them when indexing the genome.

I have created the following structure in my config file:

organism:
  Dmel:
    fasta: "ftp://ftp.ensembl.org/pub/current_fasta/drosophila_melanogaster/dna/Drosophila_melanogaster.BDGP6.22.dna.toplevel.fa.gz"
    gtf: "ftp://ftp.ensembl.org/pub/current_gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP6.22.98.gtf.gz"
  Dpse:
    fasta: "ftp://ftp.ensemblgenomes.org/pub/current/metazoa/fasta/drosophila_pseudoobscura/dna/Drosophila_pseudoobscura.Dpse_3.0.dna.toplevel.fa.gz"
    gtf: "ftp://ftp.ensemblgenomes.org/pub/current/metazoa/gtf/drosophila_pseudoobscura/Drosophila_pseudoobscura.Dpse_3.0.45.gtf.gz"

I would like to use these links in the rule star_index in my Snakemake file, which looks like this:

rule star_index:
    input:
        fasta="genome/{org}.fa",
        gtf="genome/{org}.gtf"
    output:
        directory("genome/{org}/starIndex/")
    threads: 16
    params:
        prefix = lambda wildcards: "genome/{org}/starIndex".format(org=wildcards.organism) ## wildcards.organism # {config['organism']}
    shell:
        "mkdir -p {output} && "
        "STAR --runThreadN {threads} "
        "--outFileNamePrefix {output} "
        "--runMode genomeGenerate "
        "--genomeDir {output} "
        "--limitGenomeGenerateRAM {config[RAM]} "
        "--genomeSAindexNbases {config[SAindex]} "
        "--genomeFastaFiles {input.fasta} "
        "--sjdbGTFfile {input.gtf} "
        "--sjdbOverhang 100"

But there is an error with the wildcards that I can't figure out.

I get the following error when running this rule:

InputFunctionException in line 51 of /local/Assa/projects/automation/P135.automation/getGenome_IndexGenome.Snakefile:
AttributeError: 'Wildcards' object has no attribute 'organism'
Wildcards: org=Dmel

I know that the problem is in the params section, because the script runs when I comment out those two lines.

What I don't understand is why wildcards.organism is not defined.

I would appreciate any hints or ideas. Thanks, Assa

Upvotes: 0

Views: 432

Answers (1)

Maarten-vd-Sande

Reputation: 3701

AttributeError: 'Wildcards' object has no attribute 'organism' 
Wildcards: org=Dmel

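It seems you spelled out organism in full, while the wildcard is named org. Wildcard names come from the placeholders in the rule's input and output patterns ({org} here), not from the keys in the config file, so the lambda has to use wildcards.org: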

params:
         prefix = lambda wildcards: "genome/{org}/starIndex".format(org=wildcards.org)
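
To see why only wildcards.org exists, and how the same wildcard could be used to pick the matching paths out of the config, here is a minimal sketch in plain Python. It runs outside Snakemake: the SimpleNamespace object only stands in for the Wildcards object, the config dict mirrors the structure of the config.yaml from the question with the URLs replaced by placeholder strings, and the function names (prefix, fasta_url, broken_prefix) are made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for the Wildcards object Snakemake passes to params/input functions.
# Assumption: only attribute access matters here; the real class does more.
# The attribute is called "org" because the rule's patterns use {org}.
wildcards = SimpleNamespace(org="Dmel")

# Stand-in for the parsed config.yaml (URLs replaced by placeholders).
config = {
    "organism": {
        "Dmel": {"fasta": "<Dmel fasta URL>", "gtf": "<Dmel gtf URL>"},
        "Dpse": {"fasta": "<Dpse fasta URL>", "gtf": "<Dpse gtf URL>"},
    }
}


def prefix(wildcards):
    """Build the index directory from the wildcard named in the rule."""
    return "genome/{org}/starIndex".format(org=wildcards.org)


def fasta_url(wildcards):
    """Look up the FASTA entry for the organism selected by the wildcard."""
    return config["organism"][wildcards.org]["fasta"]


def broken_prefix(wildcards):
    """Same as prefix(), but with the attribute name from the question."""
    return "genome/{org}/starIndex".format(org=wildcards.organism)


print(prefix(wildcards))
print(fasta_url(wildcards))

try:
    broken_prefix(wildcards)
except AttributeError as err:
    # Same kind of failure as in the question: there is no "organism" attribute.
    print("AttributeError:", err)
```

Inside a Snakefile, the bodies of prefix and fasta_url could be used as lambdas or functions in params or input in the same way. "organism" is only a key in the config, so it is reached through config["organism"] and never through the wildcards object.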

Upvotes: 1
