Reputation: 47
I've been trying to run the following bioinformatic script:
configfile: "config.yaml"
WORK_TRIM = config["WORK_TRIM"]
WORK_KALL = config["WORK_KALL"]
rule all:
input:
expand(WORK_KALL + "quant_result_{condition}", condition=config["conditions"])
rule kallisto_quant:
input:
fq1 = WORK_TRIM + "{sample}_1_trim.fastq.gz",
fq2 = WORK_TRIM + "{sample}_2_trim.fastq.gz",
idx = WORK_KALL + "Homo_sapiens.GRCh38.cdna.all.fa.index"
output:
WORK_KALL + "quant_result_{condition}"
shell:
"kallisto quant -i {input.idx} -o {output} {input.fq1} {input.fq2}"
However, I keep obtaing an error like this:
WildcardError in line 13 of /home/user/directory/Snakefile:
Wildcards in input files cannot be determined from output files:
'sample'
Just to explain briefly, kallisto quant
will produce 3 outputs: abundance.h5
, abundance.tsv
and run_injo.json
. Each of those files need to be sent to their own newly created condition
directory. I not getting exactly what is going on wrong. I'll appreciated any help on this.
Upvotes: 1
Views: 81
Reputation: 9062
If you think about it, you are not giving snakemake enough information.
Say "condition" is either "control" or "treated" with samples "C" and "T", respectively. You need to tell snakemake about the association control: C, treated: T
. You could do this using functions-as-input files or lambda functions. For example:
cond2samp = {'control': 'C', 'treated': 'T'}
rule all:
input:
expand("quant_result_{condition}", condition=cond2samp.keys())
rule kallisto_quant:
input:
fq1 = lambda wc: "%s_1_trim.fastq.gz" % cond2samp[wc.condition],
fq2 = lambda wc: "%s_2_trim.fastq.gz" % cond2samp[wc.condition],
idx = "Homo_sapiens.GRCh38.cdna.all.fa.index"
output:
"quant_result_{condition}"
shell:
"kallisto quant -i {input.idx} -o {output} {input.fq1} {input.fq2}"
Upvotes: 2