viralrna
viralrna

Reputation: 63

STAR snakemake wrapper with configfile

I'm trying to use the STAR wrapper for snakemake (described here) but would like to include an option for an annotation file using the --sjdbGTFfile flag. My goal is to specify a gtf annotation file by referring to a configfile, but I can't seem to get it to work.

Relevant snippet of my config file:

annotations: ref_files/cal09_human/GRCh38.103_Cal09.gtf

The relevant rule looks like this:

rule map_reads:
    message:
        """
        Mapping trimmed reads from {wildcards.sample} to host genome
        """
    input:
        fq1 = "tmp/{sample}_R1_trimmed.fastq.gz",
        fq2 = "tmp/{sample}_R2_trimmed.fastq.gz"
    params:
        index= config["genome_index"],
        extra="--sjdbGTFfile {config[annotations]} \
            --sjdbOverhang 149 \
            --outFilterType BySJout \
            --outFilterMultimapNmax 10 \
            --alignSJoverhangMin 5 \
            --alignSJDBoverhangMin 1 \
            --outFilterMismatchNmax 999 \
            --outFilterMismatchNoverReadLmax 0.04 \
            --alignIntronMin 20 \
            --alignIntronMax 1000000 \
            --alignMatesGapMax 1000000 \
            --outFilterIntronMotifs RemoveNoncanonicalUnannotated \
            --outSAMtype BAM SortedByCoordinate \
            --runMode alignReads"
    output:
        "star_mapping/{sample}_Aligned.sortedByCoord.out.bam"
    log:
        "logs/star/{sample}.log"
    threads: 20
    wrapper:
        "0.74.0/bio/star/align"

I was specifying the config file as if it were within a shell command (like described here), but it's not working:

Jun 03 16:49:52 ..... started STAR run
Jun 03 16:49:52 ..... loading genome
Jun 03 16:50:32 ..... processing annotations GTF

FATAL error, could not open file pGe.sjdbGTFfile={config[annotations]}

Jun 03 16:50:32 ...... FATAL ERROR, exiting

I ran the following directly in the terminal and it runs just fine:

STAR --runThreadN 24 \
 --genomeDir ref_files/cal09_human_star_index \
 --sjdbGTFfile ref_files/cal09_human/GRCh38.103_Cal09.gtf \
 --sjdbOverhang 149 \
 --outFilterType BySJout \
 --outFilterMultimapNmax 10 \
 --alignSJoverhangMin 5 \
 --alignSJDBoverhangMin 1 \
 --outFilterMismatchNmax 999 \
 --outFilterMismatchNoverReadLmax 0.04 \
 --alignIntronMin 20 \
 --alignIntronMax 1000000 \
 --alignMatesGapMax 1000000 \
 --outFilterIntronMotifs RemoveNoncanonicalUnannotated \
 --outSAMtype BAM SortedByCoordinate \
 --runMode alignReads

I double checked the I didn't make an obvious mistake like misspecifying the file path in the config file. Does anyone know of a way to achieve what I'm trying to do?

Upvotes: 0

Views: 191

Answers (1)

dariober
dariober

Reputation: 9062

extra="--sjdbGTFfile {config[annotations]} \
            --sjdbOverhang 149 ..."

It seems {config[annotations]} is not replaced with the dictionary value but it is instead passed as it is.

If you are on python 3.6+, try using (note the f)

extra=f"--sjdbGTFfile {config[annotations]} \
            --sjdbOverhang 149 ..."

or, old style:

extra= "--sjdbGTFfile %s \
            --sjdbOverhang 149 ..." % config[annotations]

Upvotes: 2

Related Questions