user4249654

Reputation: 273

Snakemake: rule generates strange results

I created this rule:

rule picard_addRG2:
    input:
        "mapped_reads/merged_samples/{sample}.dedup.bam"
    output:
        "mapped_reads/merged_samples/{sample}_rg.dedup.bam"
    params:
        sample_id = config['samples'],
        library = "library00"

    shell:
        """picard  AddOrReplaceReadGroups I={input} O={output}  RGID={params.sample_id}  RGLB={params.library} RGPL=illumina  RGPU=unit1 RGSM=20 RGPU=MP"""

I added this target to the Snakefile:

expand("mapped_reads/merged_samples/{sample}_rg.dedup.bam",sample=config['samples'])

I then got this strange error from another rule:

snakemake --configfile exome.yaml -np 
InputFunctionException in line 17 of /illumina/runs/FASTQ/test_play/rules/samfiles.rules:
KeyError: '445_rg'
Wildcards:
sample=445_rg

What did I do wrong?

If I change the rule this way, it works perfectly:

rule picard_addRG2:
    input:
        "mapped_reads/merged_samples/{sample}.dedup.bam"
    output:
        "mapped_reads/merged_samples/{sample}.dedup_rg.bam"
    params:
        sample_id = config['samples'],
        library = "library00"

    shell:
        """picard  AddOrReplaceReadGroups I={input} O={output}  RGID={params.sample_id}  RGLB={params.library} RGPL=illumina  RGPU=unit1 RGSM=20 RGPU=MP"""

Upvotes: 0

Views: 231

Answers (1)

Eric C.

Reputation: 3368

Since the second way of writing the output works, I would suggest using that one. Here is what is happening:

Your rule picard_addRG2 takes this as input:
"mapped_reads/merged_samples/{sample}.dedup.bam"
so you must have another rule that creates this file as its output. The output of picard_addRG2 is:
"mapped_reads/merged_samples/{sample}_rg.dedup.bam"

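So when your expand asks for:
"mapped_reads/merged_samples/{sample}_rg.dedup.bam"
Snakemake cannot tell whether to use picard_addRG2 with sample (e.g. 445) as the wildcard, or your other rule with sample_rg (e.g. 445_rg) as the wildcard, since both output patterns begin and end the same way. In your run it tried the other rule with sample=445_rg, and that rule's input function failed with KeyError: '445_rg', presumably because 445_rg is not a sample in your config.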
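The ambiguity can be reproduced outside Snakemake with plain regular expressions. This is only a rough sketch: it assumes a wildcard behaves like the regex `.+` (Snakemake's default when no constraint is given), and `pattern_to_regex` and `match_wildcards` are hypothetical helpers written for this illustration, not Snakemake functions.

```python
import re

def pattern_to_regex(pattern, constraint=".+"):
    """Turn an output pattern such as '{sample}.dedup.bam' into a regex.

    Assumption: every {wildcard} may match any non-empty string (".+").
    """
    # re.split with a capture group yields: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)                   # fixed text
        else:
            regex += f"(?P<{part}>{constraint})"       # wildcard
    return re.compile(regex + "$")

def match_wildcards(pattern, path):
    """Return the wildcard values if path matches pattern, else None."""
    m = pattern_to_regex(pattern).match(path)
    return m.groupdict() if m else None

target = "mapped_reads/merged_samples/445_rg.dedup.bam"
picard_output = "mapped_reads/merged_samples/{sample}_rg.dedup.bam"
upstream_output = "mapped_reads/merged_samples/{sample}.dedup.bam"

print(match_wildcards(picard_output, target))    # {'sample': '445'}
print(match_wildcards(upstream_output, target))  # {'sample': '445_rg'}
```

Both patterns match the requested file, and the second match is where the unexpected sample=445_rg comes from.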

To summarize: try not to have two outputs whose wildcard can stretch to match the other one. Here both of your outputs:
"mapped_reads/merged_samples/{sample}.dedup.bam"
"mapped_reads/merged_samples/{sample}_rg.dedup.bam"
begin and end with exactly the same pattern.

When you use "mapped_reads/merged_samples/{sample}.dedup_rg.bam" as output, the wildcard can no longer stretch this way: a file ending in .dedup_rg.bam never matches {sample}.dedup.bam.
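A quick check of the renamed output, again as a sketch that treats the wildcard as the regex `.+`. The two regexes below are hand-written stand-ins for the two output patterns, not something Snakemake exposes.

```python
import re

# Hand-written regex equivalents of the two output patterns,
# assuming {sample} behaves like ".+".
upstream = re.compile(r"mapped_reads/merged_samples/(?P<sample>.+)\.dedup\.bam$")
picard = re.compile(r"mapped_reads/merged_samples/(?P<sample>.+)\.dedup_rg\.bam$")

target = "mapped_reads/merged_samples/445.dedup_rg.bam"

upstream_match = upstream.match(target)
picard_match = picard.match(target)

print(upstream_match)            # None: the upstream rule is no longer a candidate
print(picard_match.groupdict())  # {'sample': '445'}
```

Only the picard pattern matches, so there is a single candidate rule and sample stays 445.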

Upvotes: 1
