bli

Reputation: 8194

How to do a partial expand in Snakemake?

I'm trying to first generate 4 files, for the LETTERS x NUMS combinations, then summarize over the NUMS to obtain one file per element in LETTERS:

LETTERS = ["A", "B"]
NUMS = ["1", "2"]


rule all:
    input:
        expand("combined_{letter}.txt", letter=LETTERS)

rule generate_text:
    output:
        "text_{letter}_{num}.txt"
    shell:
        """
        echo "test" > {output}
        """

rule combine_text:
    input:
        expand("text_{letter}_{num}.txt", num=NUMS)
    output:
        "combined_{letter}.txt"
    shell:
        """
        cat {input} > {output}
        """

Executing this snakefile results in the following error:

WildcardError in line 19 of /tmp/Snakefile:
No values given for wildcard 'letter'.
  File "/tmp/Snakefile", line 19, in <module>

It seems that a partial expand is not possible. Is this a limitation of expand? If so, how should I work around it?

Upvotes: 8

Views: 3871

Answers (3)

Scholar

Reputation: 512

Partial expand is possible using allow_missing=True.

For example:

expand("text_{letter}_{num}.txt", num=[1, 2], allow_missing=True)

> ["text_{letter}_1.txt", "text_{letter}_2.txt"]
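To illustrate what that result means, here is a rough pure-Python sketch of the same behaviour. It is not Snakemake's implementation: `partial_expand` and `KeepMissing` are hypothetical names made up for this example, and the sketch only handles simple `{name}` fields.

```python
from itertools import product


class KeepMissing(dict):
    """Mapping that renders any unknown field back as '{name}'."""

    def __missing__(self, key):
        return "{" + key + "}"


def partial_expand(pattern, **wildcards):
    # Hypothetical helper, not part of Snakemake: fill in the wildcards
    # that were given and leave every other one untouched.
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format_map(KeepMissing(zip(names, combo))) for combo in combos]


result = partial_expand("text_{letter}_{num}.txt", num=[1, 2])
print(result)
```

The `{letter}` placeholder survives in each returned pattern, so Snakemake can still fill it in later from the wildcards of the rule's output.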

Upvotes: 9

bli

Reputation: 8194

Update (25/11/2020): As per this answer, partial expands are now possible without multi-bracketing, thanks to the allow_missing argument of expand.


It seems that this is not a limitation of expand, but of my familiarity with how string formatting works in Python. I need to use double braces for the wildcard that should not be expanded:

LETTERS = ["A", "B"]
NUMS = ["1", "2"]


rule all:
    input:
        expand("combined_{letter}.txt", letter=LETTERS)

rule generate_text:
    output:
        "text_{letter}_{num}.txt"
    shell:
        """
        echo "test" > {output}
        """

rule combine_text:
    input:
        expand("text_{{letter}}_{num}.txt", num=NUMS)
    output:
        "combined_{letter}.txt"
    shell:
        """
        cat {input} > {output}
        """

Executing this snakefile now generates the following expected files:

text_A_2.txt
text_A_1.txt
text_B_2.txt
text_B_1.txt
combined_A.txt
combined_B.txt

Upvotes: 10

Johannes Köster

Reputation: 1927

Indeed, braces need to be escaped when you want to ignore them in expand. It relies on str.format, and hence any rules from format apply to expand as well.
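The escaping rule can be checked with plain str.format, without Snakemake. This is only a minimal sketch: the pattern is the one from the question, and the variable names are made up for the example.

```python
# Doubled braces are an escape in str.format: "{{letter}}" comes out as the
# literal text "{letter}", while "{num}" is substituted as usual.
pattern = "text_{{letter}}_{num}.txt"
expanded = [pattern.format(num=n) for n in ["1", "2"]]
print(expanded)

# Without the escape, str.format looks for a value for "letter" and fails.
# Plain Python raises KeyError here; Snakemake reports the same situation
# as the WildcardError shown in the question.
try:
    "text_{letter}_{num}.txt".format(num="1")
    missing = None
except KeyError as err:
    missing = err.args[0]
print(missing)
```

Each string in `expanded` still contains `{letter}`, which is what the rule needs so that the wildcard can be resolved from its output file.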

Upvotes: 4
