Reputation: 157
I am making a Snakefile for data analysis. The extension of my raw data is .RCC; for example, the first input file I have is CF30207_01.RCC. The script I am running on the data is QC.py. Following the tutorial, I wrote the following Snakefile:
SAMPLES = ["CF30207_01",
           "CF30212_06",
           "CF30209_03",
           "CF30213_07",
           "CF30211_05",
           "CF30214_08"]

rule all:
    input:
        expand('{sample}.RCC', sample=SAMPLES)

rule QC:
    input:
        rc = '/home/snakemaker/{sample}.RCC'
    output:
        '{sample}.pdf'
        "quality_control.csv"
    shell:
        "python3 QC.py"
But I got the following errors:
./Snakefile: line 1: SAMPLES: command not found
./Snakefile: line 2: CF30212_06,: command not found
./Snakefile: line 3: CF30209_03,: command not found
./Snakefile: line 4: CF30213_07,: command not found
./Snakefile: line 5: CF30211_05,: command not found
./Snakefile: line 6: CF30214_08]: command not found
./Snakefile: line 8: rule: command not found
./Snakefile: line 9: input:: command not found
./Snakefile: line 10: syntax error near unexpected token `'{sample}.RCC','
./Snakefile: line 10: ` expand('{sample}.RCC', sample=SAMPLES)'
But I followed exactly the same structure as the tutorial. Do you know how I can fix the problem with this Snakefile?
Upvotes: 0
Views: 426
Reputation: 2079
Welcome to Snakemake! You have a good start; here are a couple of notes on your Snakefile.
rule all:
    input:
        expand('{sample}.RCC', sample=SAMPLES)
The rule all should request the final outputs of your workflow, not the inputs. These are the files you are requesting to be made. Change the input to:
expand('{sample}.pdf', sample=SAMPLES)
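To see which files rule all would then request, here is a rough plain-Python equivalent of that expand call. It is only a sketch for the single-wildcard case and does not import Snakemake; the name targets is made up for illustration.

```python
# Sample IDs copied from the question's Snakefile.
SAMPLES = ["CF30207_01",
           "CF30212_06",
           "CF30209_03",
           "CF30213_07",
           "CF30211_05",
           "CF30214_08"]

# For one wildcard, expand('{sample}.pdf', sample=SAMPLES) fills the
# pattern once per sample, much like this list comprehension.
targets = ["{sample}.pdf".format(sample=s) for s in SAMPLES]

for path in targets:
    print(path)
```

Each of these file names matches the '{sample}.pdf' output of the QC rule, which is how Snakemake works out that it has to run QC once per sample.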
For the QC rule, it doesn't seem like you are passing the input/output files to the QC.py script. If the script accepts command line arguments, you can add them like:
"python3 QC.py --input {input.rc} --output {output[0]}"
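For that shell command to work, QC.py has to parse those arguments itself. A minimal sketch of what that could look like with argparse is below; the --input and --output flag names simply mirror the example command and are assumptions, since the question does not show QC.py, and run_qc is a hypothetical placeholder for the real analysis.

```python
import argparse


def parse_args(argv=None):
    """Parse the two paths that the shell command passes in."""
    parser = argparse.ArgumentParser(description="QC for a single .RCC file")
    parser.add_argument("--input", required=True, help="path to the .RCC file")
    parser.add_argument("--output", required=True, help="path of the PDF report")
    return parser.parse_args(argv)


def run_qc(rcc_path, pdf_path):
    """Hypothetical placeholder: the real QC logic would go here."""
    return "QC: {} -> {}".format(rcc_path, pdf_path)


# Example call with explicit arguments, so the sketch runs on its own.
args = parse_args(["--input", "CF30207_01.RCC", "--output", "CF30207_01.pdf"])
print(run_qc(args.input, args.output))
```

In the real script you would call parse_args() with no arguments so that it reads the values from the command line.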
Alternatively, you can pass QC.py to the script directive and use snakemake.input[0], etc. to access the files in your Python code.
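With the script directive, Snakemake makes an object called snakemake available inside QC.py. The sketch below shows the access pattern only: fake_snakemake is a stand-in so the snippet runs outside a workflow, and qc_paths is a hypothetical helper name.

```python
from types import SimpleNamespace


def qc_paths(smk):
    """Return (RCC input path, PDF output path) from a snakemake-like object."""
    return smk.input[0], smk.output[0]


# Stand-in for the object Snakemake injects; only here so the sketch
# can run by itself. Inside a real script you would pass `snakemake`.
fake_snakemake = SimpleNamespace(
    input=["/home/snakemaker/CF30207_01.RCC"],
    output=["CF30207_01.pdf"],
)

rcc_path, pdf_path = qc_paths(fake_snakemake)
print(rcc_path, pdf_path)
```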
Within the output section:
    output:
        '{sample}.pdf'
        "quality_control.csv"
You need to add a comma between the files to make them a list. Also note that every sample will write to the same quality_control.csv. At best, each job will overwrite the file and only the last sample will be kept; if jobs run in parallel, the concurrent writes may cause errors in your Python code. You may want something like:
    output:
        '{sample}.pdf',
        'quality_control_{sample}.csv'
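On the missing comma: Snakefile entries follow Python syntax, and in Python two adjacent string literals are joined into one string. As far as I know Snakemake handles them the same way, so without the comma you most likely get a single, oddly named output rather than two files. A small plain-Python sketch of the effect (the variable names are made up for illustration):

```python
# Adjacent string literals with no comma are concatenated by Python.
without_comma = ['{sample}.pdf'
                 "quality_control.csv"]

# With the comma, the list holds two separate entries.
with_comma = ['{sample}.pdf',
              "quality_control.csv"]

print(without_comma)
print(len(with_comma))
```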
If your QC code actually appends to quality_control.csv, you can instead force that rule to run one job at a time with custom resources.
A good test for new Snakefiles is to run snakemake -nq to make sure the file syntax is OK and you have the expected number of jobs queued up.
Upvotes: 2
Reputation: 9062
I guess you are executing the Snakefile itself as ./Snakefile, so the shell tries to run it as a shell script (hence the "command not found" errors). Instead, you should run
snakemake -s /path/to/Snakefile
or just snakemake if the Snakefile is in the current directory.
Upvotes: 2