Reputation: 13
rule all:
    input:
        "../data/A_checkm/{genome}"

rule A_checkm:
    input:
        "../data/genomesFna/{genome}_genomic.fna.gz"
    output:
        directory("../data/A_checkm/{genome}")
    threads:
        16
    resources:
        mem_mb = 40000
    shell:
        """
        # setup a tmp working dir
        tmp=$(mktemp -d)
        mkdir $tmp/ref
        cp {input} $tmp/ref/genome.fna.gz
        cd $tmp/ref
        gunzip -c genome.fna.gz > genome.fna
        cd $tmp
        # run checking
        checkm lineage_wf -t {threads} -x fna ref out > stdout
        # prepare output folder
        cd {config[project_root]}
        mkdir -p {output}
        # copy results over
        cp -r $tmp/out/* {output}/
        cp $tmp/stdout {output}/checkm.txt
        # cleanup
        rm -rf $tmp
        """
Thank you in advance for your help! I would like to run checkm on a list of ~600 downloaded genome files with the extension '.fna.gz'. Each downloaded file is saved in a separate folder with the same name as the genome. I would also like to have the results for each genome in a separate folder, which is why my output is a directory. When I run this code with 'snakemake -s Snakefile --cores 10 A_checkm', I get the following error:
WorkflowError: Target rules may not contain wildcards. Please specify concrete files or a rule without wildcards at the command line, or have a rule without wildcards at the very top of your workflow (e.g. the typical "rule all" which just collects all results you want to generate in the end).
Could anyone help me identify the error, please?
Upvotes: 1
Views: 1472
Reputation: 1277
You need to provide snakemake with concrete values for the {genome} wildcard. You cannot just leave it open and expect snakemake to work on all the files in some folder of your project just like that.
Determine the {genome} values of your downloaded files with glob_wildcards(...). See the documentation for further details.
Then use rule all to create all the folders (using your other rule) with those {genome} values:
# Determine the {genome} for all downloaded files
(GENOMES,) = glob_wildcards("../data/genomesFna/{genome}_genomic.fna.gz")

rule all:
    input:
        expand("../data/A_checkm/{genome}", genome=GENOMES),

rule A_checkm:
    input:
        "../data/genomesFna/{genome}_genomic.fna.gz",
    output:
        directory("../data/A_checkm/{genome}"),
    threads: 16
    resources:
        mem_mb=40000,
    shell:
        # Your magic goes here
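To make it clearer what the glob_wildcards(...) and expand(...) pair above hands to rule all, here is a rough plain-Python sketch of the idea. It is not snakemake's implementation, only an illustration of the pattern matching: the helper names (find_genomes, checkm_targets, demo) and the example file names (GCF_000001 and so on) are made up, and the demo runs in a temporary directory instead of ../data/genomesFna.

```python
import re
from pathlib import Path
from tempfile import TemporaryDirectory

# Regex mirroring the pattern "{genome}_genomic.fna.gz";
# the named group plays the role of the {genome} wildcard.
PATTERN = re.compile(r"^(?P<genome>.+)_genomic\.fna\.gz$")


def find_genomes(folder):
    """Return the sorted {genome} values of all matching files in `folder`."""
    names = []
    for path in Path(folder).iterdir():
        match = PATTERN.match(path.name)
        if match:
            names.append(match.group("genome"))
    return sorted(names)


def checkm_targets(genomes):
    """Build one concrete output folder per genome, like expand(...) would."""
    return [f"../data/A_checkm/{genome}" for genome in genomes]


def demo():
    """Create a few hypothetical files and resolve them to concrete targets."""
    with TemporaryDirectory() as tmp:
        for name in (
            "GCF_000001_genomic.fna.gz",
            "GCF_000002_genomic.fna.gz",
            "README.txt",  # does not match the pattern, so it is ignored
        ):
            (Path(tmp) / name).touch()
        genomes = find_genomes(tmp)
    return genomes, checkm_targets(genomes)


if __name__ == "__main__":
    genomes, targets = demo()
    print(genomes)
    print(targets)
```

The point is that rule all ends up with a list of concrete folder names and no open wildcard, which is what the WorkflowError was asking for.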
If the download is supposed to happen inside snakemake, add a checkpoint for that. Have a look at this answer then.
Upvotes: 4