Reputation: 81
Is it possible to have optionally empty wildcards? It seems like it was possible a few years ago (https://groups.google.com/g/snakemake/c/S7fTL4jAYIM), but the described method didn't work for a user last year and now is not working for me.
My Snakefile looks something like this (abbreviated for clarity):
wildcard_constraints:
    udn_id="ID.+",
    compound="(no_)*compound(_genome|_exome)*"
rule all:
    input: expand("file/path/{id}/{compound}{.*}.html",
                  id=[config["id"]], compound=compound_list, freq=freq_list)
rule create_html:
    output: "file/path/{id}/{compound}{freq,.*}.html"
    input: "/oak/stanford/groups/euan/UDN/output/AnnotSV/AnnotSV_3.0.5/{udn_id}/WGS_blood_"+hg+"/gateway_hpo/{udn_id}.{comp_het}{cohort_freq,.*}.annotated.tsv"
    shell: #Run shell commands
rule append_freq:
    output: "file/path/{id}/{compound}.ha_freq.tsv"
    input: "file/path/{id}/{compound}.tsv"
    script: "file/path/get_ha_freq.py"
I get the error
No values given for wildcard ''.
File file/path, line 6 in <module>
when I run this.
I also tried implementing a wildcard constraint like this:
wildcard_constraints:
    udn_id="ID.+",
    compound="(no_)*compound(_genome|_exome)*"
    freq=".*"
rule all:
    input: expand("file/path/{id}/{compound}{freq}.html",
                  id=[config["id"]], compound=compound_list, freq=freq_list)
rule create_html:
    output: "file/path/{id}/{compound}{freq}.html"
    input: "/oak/stanford/groups/euan/UDN/output/AnnotSV/AnnotSV_3.0.5/{udn_id}/WGS_blood_"+hg+"/gateway_hpo/{udn_id}.{comp_het}{cohort_freq}.annotated.tsv"
    shell: #Run shell commands
rule append_freq:
    output: "file/path/{id}/{compound}.ha_freq.tsv"
    input: "file/path/{id}/{compound}.tsv"
    script: "file/path/get_ha_freq.py"
but I received the error,
No values given for wildcard 'freq'.
File file/path, line 7 in <module>
when I did this.
What error am I making?
Upvotes: 2
Views: 191
Reputation: 421
It is still possible to have empty wildcards if you update the wildcard_constraints
as described in your link. Here's a short example:
#!/usr/bin/env snakemake

wildcard_constraints:
    sample=".*"

samples = ["a", "b", "", "d"]

rule all:
    input:
        "collected/all_samples.txt"

rule process_samples:
    output:
        "processed/{sample}.txt"
    shell:
        """
        echo '{wildcards.sample}' > {output}
        """

rule collect_samples:
    input:
        processed = ["processed/{}.txt".format(sample) for sample in samples]
    output:
        "collected/all_samples.txt"
    shell:
        """
        cat {input.processed} > {output}
        """
After saving the Snakefile as example.smk, I can run the pipeline:
$ snakemake --cores 1 -s example.smk
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                count    min threads    max threads
---------------  -------  -------------  -------------
all                    1              1              1
collect_samples        1              1              1
process_samples        4              1              1
total                  6              1              1

[Wed Aug 25 20:43:24 2021]
rule process_samples:
    output: processed/a.txt
    jobid: 2
    wildcards: sample=a
    resources: tmpdir=/tmp

[Wed Aug 25 20:43:24 2021]
Finished job 2.
1 of 6 steps (17%) done

[Wed Aug 25 20:43:24 2021]
rule process_samples:
    output: processed/.txt
    jobid: 4
    wildcards: sample=
    resources: tmpdir=/tmp

... (skipping some output) ...

[Wed Aug 25 20:43:25 2021]
localrule all:
    input: collected/all_samples.txt
    jobid: 0
    resources: tmpdir=/tmp

[Wed Aug 25 20:43:25 2021]
Finished job 0.
6 of 6 steps (100%) done
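The job with an empty sample wildcard in the log above only exists because of the relaxed constraint. By default Snakemake matches a wildcard with the regex .+, which needs at least one character, so processed/.txt would not match processed/{sample}.txt. The sketch below imitates that matching with Python's re module. Note that pattern_to_regex is a hypothetical helper written for this illustration and is only a simplified stand-in for Snakemake's own matching logic.

```python
import re


def pattern_to_regex(pattern, constraints=None, default=".+"):
    """Compile an output pattern such as 'processed/{sample}.txt' to a regex.

    Hypothetical helper for illustration only. Each {name} placeholder becomes
    a named group using the constraint given for it, or the default '.+'.
    """
    constraints = constraints or {}
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between placeholders is escaped so '.' stays a dot.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        parts.append("(?P<{}>{})".format(name, constraints.get(name, default)))
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


# Default constraint: at least one character is required.
default_rx = pattern_to_regex("processed/{sample}.txt")
# Relaxed constraint, mirroring sample=".*" in the Snakefile.
relaxed_rx = pattern_to_regex("processed/{sample}.txt", {"sample": ".*"})

print(default_rx.fullmatch("processed/.txt"))                         # None
print(repr(relaxed_rx.fullmatch("processed/.txt").group("sample")))   # ''
```

With the default pattern the empty-sample target has no producing rule; with .* it matches and the wildcard value is the empty string.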
The empty-wildcard output file processed/.txt is created, and the empty wildcard (an empty line) is present in the collected/all_samples.txt file.
$ ls processed/.txt
processed/.txt
$ cat collected/all_samples.txt
a
b
d
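As for the first error in the question, the message No values given for wildcard '' is consistent with how Python's format-string syntax reads {.*}: the field name before the dot is empty, so the lookup is for a key named ''. The regex form {freq,.*} is syntax for rule file patterns, so inside expand the placeholder would presumably have to be a plain named one such as {freq}. Here is a minimal standard-library sketch of that behaviour. It assumes expand relies on this kind of string formatting, which the error text suggests but which is not confirmed here, and the values are made up.

```python
import string

formatter = string.Formatter()
# Made-up values for illustration; freq is the empty string.
values = {"id": "ID001", "compound": "compound", "freq": ""}

try:
    # '{.*}' parses as "attribute '*' of the field named ''".
    formatter.vformat("file/path/{id}/{compound}{.*}.html", (), values)
    error = None
except KeyError as exc:
    error = exc

print(repr(error))  # KeyError('')

# A plain named placeholder formats fine, even with an empty value.
named = formatter.vformat("file/path/{id}/{compound}{freq}.html", (), values)
print(named)  # file/path/ID001/compound.html
```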
Upvotes: 1