huajun

Reputation: 31

Is it possible to return a rule by a function in Python Snakemake?

I am building a pipeline that consists of four iterations of three rules. Instead of copying and pasting the three rules four times and slightly changing the variables in each copy, is it possible to write a function that takes the variables that need to change and returns a rule?

I couldn't find the relevant documentation on the Internet.

rule A:
    input: 
        OUTPUT_DIR + "/XR/{pop}.pull"
    output: 
        OUTPUT_DIR + "/XR/{pop}.nemo"
    shell:"./scriptA {input} {output}"

rule B:
    input: 
        OUTPUT_DIR + "/XR/{pop}.pull",
        OUTPUT_DIR + "/XR/{pop}.nemo",
    output: 
        OUTPUT_DIR + "/XR/{pop}.freqs"
    shell:"./scriptB {input[0]} {input[1]} {output}"

rule C:
    input: 
        OUTPUT_DIR + "/XR/{pop}.pull",
        OUTPUT_DIR + "/XR/{pop}.freqs",
    output: 
        OUTPUT_DIR + "/XR/{pop}.impute"
    shell:"./scriptC {input[0]} {input[1]} {output}"

The code above is one iteration. What if I want a second iteration with slight changes (say, only the XR folder changes to XRQ)?

Upvotes: 2

Views: 80

Answers (1)

Dmitry Kuzminov

Reputation: 6604

If the only change needed in your pattern is the name of the folder, you can make it a wildcard: this is the most straightforward and idiomatic way in Snakemake. For example:

rule all:
    input:
        expand(
            OUTPUT_DIR + "/{folder}/{pop}.impute",
            folder=["XR", "XRQ"],
            pop=["pop1", "pop2", "pop3", "pop4"],
        )

rule A:
    input: 
        OUTPUT_DIR + "/{folder}/{pop}.pull"
    output: 
        OUTPUT_DIR + "/{folder}/{pop}.nemo"
    shell:"./scriptA {input} {output}"

rule B:
    input: 
        OUTPUT_DIR + "/{folder}/{pop}.pull",
        OUTPUT_DIR + "/{folder}/{pop}.nemo",
    output: 
        OUTPUT_DIR + "/{folder}/{pop}.freqs"
    shell:"./scriptB {input[0]} {input[1]} {output}"

rule C:
    input: 
        OUTPUT_DIR + "/{folder}/{pop}.pull",
        OUTPUT_DIR + "/{folder}/{pop}.freqs",
    output: 
        OUTPUT_DIR + "/{folder}/{pop}.impute"
    shell:"./scriptC {input[0]} {input[1]} {output}"
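To see which target files the expand call in rule all requests, the same Cartesian product can be reproduced in plain Python. This is only a sketch: OUTPUT_DIR = "results" is an assumed value (the question never shows it), and expand_targets is a hypothetical helper that mimics the default product behaviour of expand, not Snakemake's own implementation.

```python
from itertools import product

OUTPUT_DIR = "results"  # assumption: the real value is not shown in the question


def expand_targets(pattern, **wildcards):
    """Fill `pattern` with every combination of the given wildcard values.

    Hypothetical stand-in for Snakemake's expand(), for illustration only.
    """
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


targets = expand_targets(
    OUTPUT_DIR + "/{folder}/{pop}.impute",
    folder=["XR", "XRQ"],
    pop=["pop1", "pop2", "pop3", "pop4"],
)

# One path per (folder, pop) pair: 2 folders x 4 populations.
for path in targets:
    print(path)
```

Each of these paths is a concrete file that rule C can produce once {folder} and {pop} are filled in, which in turn pulls in rules B and A for the same pair.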

If the differences are more complex, please provide more details and show us the root rule: in the code you provided, no rule has all of its wildcards resolved to concrete values, so Snakemake cannot tell which files to build.
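Why a root rule with concrete file names matters can be illustrated in plain Python: wildcard values are inferred by matching a requested file name against a rule's output pattern. The sketch below assumes OUTPUT_DIR = "results" and uses a hypothetical helper, pattern_to_regex; it only imitates the idea (each wildcard matching the regular expression .+ by default) and is not Snakemake's actual code.

```python
import re

OUTPUT_DIR = "results"  # assumption: the real value is not shown in the question


def pattern_to_regex(pattern):
    """Compile an output pattern with {name} wildcards into a regex.

    Hypothetical helper: every wildcard becomes a named group matching `.+`.
    """
    # re.split with a capturing group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text between wildcards
        else:
            regex += f"(?P<{part}>.+)"    # a wildcard name
    return re.compile(regex)


rule_c_output = pattern_to_regex(OUTPUT_DIR + "/{folder}/{pop}.impute")

# A concrete target, such as one listed in a root rule, fixes every wildcard.
match = rule_c_output.fullmatch("results/XRQ/pop1.impute")
resolved = match.groupdict()
print(resolved)

# A file with a different extension is simply not produced by this rule.
no_match = rule_c_output.fullmatch("results/XRQ/pop1.freqs")
print(no_match)
```

Without some concrete target to match against, the wildcards in rules A, B and C have no values, which is why the root rule is needed.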

Upvotes: 1
