Reputation: 35
I am converting a bash pipeline that processes paired whole-exome sequenced tumor-normal samples into a Snakemake workflow.
Paired samples are listed in my config file, as follows:
sample_list:
  - sample: 1
    tumor: AO1_04_RN_1_T_4_S4
    control: AO2_07_C007558T1Wa_S37
  - sample: 2
    tumor: AO2_01_C007589T1FTa_S2
    control: AO2_07_C007589T1Wa_S34
  - sample: 3
    tumor: AO9_09_FM_1_T_7_S13
    control: AO2_07_C007558T1Wa_S37
I am now stuck getting a rule that calls variants with Mutect2 to process the tumor sample of sample X specifically with its own control sample (e.g. for sample 1: -I tumor_of_sample_1 -I control_of_sample_1).
What I keep getting (even using zip) is a command built like this: -I tumor_of_sample_1 tumor_of_sample_2 tumor_of_sample_3 -I control_of_sample_1 control_of_sample_2 control_of_sample_3, which inevitably fails.
I would be very grateful for some help...thanks!
Upvotes: 0
Views: 56
Reputation: 588
I think what you need is a simple lambda function that grabs the correct config entry for a given wildcard. I changed your config file a bit, which makes the lambda function simpler; it should be trivial to adjust the function to your original layout (I think, I have not tested it).
sample_list:
  sample1:
    tumor: sample1_tumor.gz
    control: sample1_control.gz
  sample2:
    tumor: sample2_tumor.gz
    control: sample2_control.gz
  sample3:
    tumor: sample3_tumor.gz
    control: sample3_control.gz
And the corresponding Snakefile:
configfile: "config.yml"

rule all:
    input:
        expand("out/{ID}.processed", ID=["sample1", "sample2", "sample3"])

rule test:
    input:
        # Each lambda receives the wildcards of the job and returns only the
        # file belonging to the sample named by {ID}
        tumor=lambda wc: "input/" + config["sample_list"][wc.ID]["tumor"],
        control=lambda wc: "input/" + config["sample_list"][wc.ID]["control"]
    output:
        "out/{ID}.processed"
    shell:
        "echo {input.tumor} {input.control} {output}"
Upvotes: 2
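If you would rather keep the list-style config from the question, the same per-wildcard lookup should still work once the list is indexed by sample number. Below is a minimal sketch in plain Python (so it can be run outside Snakemake) of what such input functions might look like. The names PAIRS and bam_for, the "bam/" directory and the ".bam" suffix are assumptions made up for illustration, and the SimpleNamespace object only stands in for the wildcards object that Snakemake passes to an input function.

```python
from types import SimpleNamespace

# Config in the shape used in the question: a list of sample entries.
config = {
    "sample_list": [
        {"sample": 1, "tumor": "AO1_04_RN_1_T_4_S4", "control": "AO2_07_C007558T1Wa_S37"},
        {"sample": 2, "tumor": "AO2_01_C007589T1FTa_S2", "control": "AO2_07_C007589T1Wa_S34"},
        {"sample": 3, "tumor": "AO9_09_FM_1_T_7_S13", "control": "AO2_07_C007558T1Wa_S37"},
    ]
}

# Index the list once by sample number. Wildcard values are strings,
# so the keys are converted to strings as well.
PAIRS = {str(entry["sample"]): entry for entry in config["sample_list"]}


def bam_for(role):
    """Return an input function resolving one role ("tumor" or "control")."""
    def _input(wildcards):
        # Hypothetical file layout: bam/<name>.bam (adjust to the real one).
        return "bam/" + PAIRS[wildcards.ID][role] + ".bam"
    return _input


tumor_bam = bam_for("tumor")
control_bam = bam_for("control")

# Stand-in for the wildcards of the job that processes sample 1.
wc = SimpleNamespace(ID="1")
mutect2_inputs = f"-I {tumor_bam(wc)} -I {control_bam(wc)}"
print(mutect2_inputs)
```

In a Snakefile, the two functions would presumably be passed as tumor=bam_for("tumor") and control=bam_for("control") in the input section, with "-I {input.tumor} -I {input.control}" in the shell command, so each job only ever sees the pair that belongs to its own {ID}.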