Reputation: 335
I need to run two rules (gatk_Mutect2 and gatk_IndelRealigner) in the same snakefile.
If I put these rules in different snakefiles, I can run them without error.
I use two input functions (get_files_somatic and get_files). Both use the case name as the dictionary key. (Each case has a normal.)
When I put these rules in the same snakefile, snakemake tries to look up the id of the normal in the input of gatk_IndelRealigner.
My question is: how can I manage the ambiguity between these two rules? I mean, I want snakemake not to try to connect them.
def get_files_somatic(wildcards):
    case = wildcards.case
    control = aCondition[case][0]
    return ["{}.sorted.dup.reca.cleaned.bam".format(case), "{}.sorted.dup.reca.cleaned.bam".format(control)]
rule all:
    input:
        expand("{sample}.sorted.dup.reca.cleaned.bam", sample=create_tumor()),
        expand("Results/vcf/{case}.vcf", case=create_tumor()),
include_prefix = "rules"
include:
    include_prefix + "/gatk2.rules"
include:
    include_prefix + "/mutec2.rules"
rule gatk_Mutect2:
    input: get_files_somatic,
    output: "Results/vcf/{case}.vcf",
    params:
    log: "logs/{case}.mutect2.log"
    threads: 8
    shell:
rule gatk_IndelRealigner:
    input:
        get_files,
    output:
        "{case}.sorted.dup.reca.cleaned.bam",
        "{case}.sorted.dup.reca.cleaned.bai",
    params:
    log:
        "mapped_reads/merged_samples/logs/{case}_indel_realign_2.log"
    threads: 8
    shell:
def get_files(wildcards):
    case = wildcards.case
    control = aCondition[case][0]
    wildcards.control = control
    return [
        "mapped_reads/merged_samples/{}.sorted.dup.reca.bam".format(case),
        "mapped_reads/merged_samples/{}.sorted.dup.reca.bam".format(control),
        "mapped_reads/merged_samples/operation/{}_{}.realign.intervals".format(case, control),
    ]
Upvotes: 0
Views: 178
Reputation: 8184
I'm not sure I really understood your problem. For instance, I don't get what you mean by "Each case have a normal".
But I can see that the output of gatk_IndelRealigner ("{case}.sorted.dup.reca.cleaned.bam") happens to be the same file name as one of the results of get_files_somatic ("{}.sorted.dup.reca.cleaned.bam".format(case), where case is wildcards.case).
That is the reason why gatk_Mutect2 gets "connected" to gatk_IndelRealigner.
It is the essence of snakemake to connect rules based on matching file names between their input and output.
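To make that matching concrete, here is a rough sketch in plain Python. It is not snakemake's actual implementation; it only assumes that an unconstrained wildcard behaves like the regex ".+". The sample ids tumor1 and normal1 and the helper names are made up for illustration.

```python
import re

def pattern_to_regex(pattern):
    """Turn an output pattern such as "{case}.bam" into a compiled regex.

    Assumption: every {name} wildcard is unconstrained and matches ".+".
    """
    parts = re.split(r"\{(\w+)\}", pattern)  # alternates literal, name, literal, ...
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)              # literal text
        else:
            regex += "(?P<{}>.+)".format(part)    # wildcard
    return re.compile(regex)

def producing_rules(requested_file, rule_outputs):
    """Return {rule name: wildcard values} for every rule whose output matches."""
    hits = {}
    for rule_name, patterns in rule_outputs.items():
        for pattern in patterns:
            match = pattern_to_regex(pattern).fullmatch(requested_file)
            if match:
                hits[rule_name] = match.groupdict()
    return hits

# Output patterns copied from the two rules in the question.
rule_outputs = {
    "gatk_Mutect2": ["Results/vcf/{case}.vcf"],
    "gatk_IndelRealigner": [
        "{case}.sorted.dup.reca.cleaned.bam",
        "{case}.sorted.dup.reca.cleaned.bai",
    ],
}

# What get_files_somatic would return for a hypothetical tumor1/normal1 pair.
mutect2_inputs = [
    "tumor1.sorted.dup.reca.cleaned.bam",
    "normal1.sorted.dup.reca.cleaned.bam",
]

links = {f: producing_rules(f, rule_outputs) for f in mutect2_inputs}
for requested, hits in links.items():
    print(requested, "->", hits)
```

Under these assumptions, the normal's BAM matches the output of gatk_IndelRealigner with case set to the normal's id. That would explain why get_files ends up being called with the normal as its case, if the normal is not a key of aCondition.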
If you do not want to have these two rules linked, you need to have different file names.
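As a sketch of what "different file names" could look like, the snippet below checks a renamed variant of get_files_somatic against the output pattern of gatk_IndelRealigner. The suffix ".somatic_input.bam", the function name get_files_somatic_renamed, and the contents of aCondition are hypothetical, and the regex again assumes an unconstrained wildcard matches ".+".

```python
import re

# Regex stand-in for the output pattern "{case}.sorted.dup.reca.cleaned.bam",
# assuming the wildcard is unconstrained (".+").
REALIGNER_OUTPUT = re.compile(r"(?P<case>.+)\.sorted\.dup\.reca\.cleaned\.bam")

# Hypothetical case -> [control] mapping standing in for aCondition.
aCondition = {"tumor1": ["normal1"]}

def get_files_somatic_renamed(case, suffix=".somatic_input.bam"):
    """Variant of get_files_somatic that uses a distinct, made-up suffix."""
    control = aCondition[case][0]
    return ["{}{}".format(case, suffix), "{}{}".format(control, suffix)]

# File names as returned by the original get_files_somatic.
original = ["{}.sorted.dup.reca.cleaned.bam".format(name) for name in ("tumor1", "normal1")]
# File names with the hypothetical suffix.
renamed = get_files_somatic_renamed("tumor1")

linked_before = [f for f in original if REALIGNER_OUTPUT.fullmatch(f)]
linked_after = [f for f in renamed if REALIGNER_OUTPUT.fullmatch(f)]

# A directory prefix alone is not enough here, because ".+" also matches "/".
prefix_only = REALIGNER_OUTPUT.fullmatch("somatic/tumor1.sorted.dup.reca.cleaned.bam")

print("linked before renaming:", linked_before)
print("linked after renaming:", linked_after)
print("prefix-only name still matches:", prefix_only is not None)
```

With the renamed inputs nothing matches the gatk_IndelRealigner output any more, so the two rules would no longer be linked. The renamed files then have to exist already or be produced by some other rule.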
Upvotes: 1