Toffi

Reputation: 41

snakemake different wildcards in input vs output

I'm trying to apply my workflow to a set of chromosomes, each of which will be processed with multiple methods. The input will always be a single file with one wildcard, chr, while the output will have two, chr and method. The problem is that it seems:

  1. if the output is {chr}_{method}_rslts.csv, Snakemake expects the input to have these two wildcards as well
  2. if the output is {chr}_rslts_{method}.csv, Snakemake complains about a missing input for the rule

Could you please help?

rule all:
    input:
        expand(config["outdir"] + "/{chr}_{method}_rslts.csv", chr=["chr1", "chr2", "chr3"], method=["method1", "method2"])

rule example_rule:
    input:
        config["indir"] + "/{chr}_fltrd.csv",
    output:
        config["outdir"] + "/{chr}_{method}_rslts.csv",
    shell:
        """
        touch {output}
        """

Upvotes: 1

Views: 71

Answers (1)

Troy Comi

Reputation: 2079

I don't see any problem when the output has more wildcards than the input. In particular, your example resolves fine in a dry run:

snakemake --version  # kind of old :D
7.8.5
snakemake -nq
Building DAG of jobs...
Job stats:
job             count    min threads    max threads
------------  -------  -------------  -------------
all                 1              1              1
example_rule        6              1              1
total               7              1              1

and presumably in the real rule you use the value of method in the command.
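The job count of 6 follows from expand() combining every chr value with every method value. Here is a minimal plain-Python sketch of that expansion, assuming a hypothetical output directory named "results" in place of config["outdir"]:

```python
from itertools import product

# Values copied from the expand() call in rule all.
chrs = ["chr1", "chr2", "chr3"]
methods = ["method1", "method2"]

# Hypothetical directory standing in for config["outdir"].
outdir = "results"

# One target per (chr, method) pair, as expand() produces by default.
targets = [f"{outdir}/{c}_{m}_rslts.csv" for c, m in product(chrs, methods)]

for t in targets:
    print(t)
print(len(targets), "targets")
```

Each of those targets matches the output pattern of example_rule, which is why the dry run schedules one example_rule job per target.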

For the other way around, with more wildcards in the inputs than in the outputs, you have to use an input function to set all input wildcards based on the output wildcards. Remember, Snakemake determines what needs to be run "backwards", working from the requested outputs to the inputs, so every wildcard used in an input must be derivable from the output wildcards. Outputs may have more wildcards than inputs, but not fewer.
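As a rough sketch of such an input function: Snakemake calls it with the wildcards matched from the output and uses whatever paths it returns. The names INDIR, METHODS, inputs_for_chr and merge_methods below are hypothetical, as is the input file layout; they only illustrate the pattern.

```python
from types import SimpleNamespace

# Hypothetical values; in a real Snakefile these would come from config.
INDIR = "input"
METHODS = ["method1", "method2"]

def inputs_for_chr(wildcards):
    """Return every per-method input needed for one per-chromosome output."""
    return [f"{INDIR}/{wildcards.chr}_{m}_fltrd.csv" for m in METHODS]

# In a Snakefile the function itself is given as the rule's input:
#
# rule merge_methods:
#     input:
#         inputs_for_chr
#     output:
#         config["outdir"] + "/{chr}_merged.csv"

# SimpleNamespace stands in for the wildcards object Snakemake passes in.
print(inputs_for_chr(SimpleNamespace(chr="chr1")))
```

The output only knows chr, so the function supplies the method values itself instead of expecting them as wildcards.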

A related problem is if you have multiple outputs, some of which don't use all wildcards. Say your rule was:

rule example_rule:
    input:
        config["indir"] + "/{chr}_fltrd.csv",
    output:
        config["outdir"] + "/{chr}_{method}_rslts.csv",
        # This second output has no {method} wildcard.
        config["outdir"] + "/{chr}_summary.csv",

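This is an error: the summary file has no method wildcard, so the jobs for every method of a given chromosome would all write the same file, collide on writes and cause untold issues. Snakemake requires all outputs of a rule to contain the same wildcards, and the error message explains this. Please clarify your question if this doesn't address it.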
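To spot this kind of mismatch before running, you can compare the wildcard names in each output pattern. The sketch below is only an illustration in plain Python, not Snakemake's own check; the helper names and the "out" directory are hypothetical, and it ignores wildcard constraints such as {name,regex}.

```python
import re

def wildcard_names(pattern):
    """Return the set of {name} placeholders in an output pattern."""
    return set(re.findall(r"\{(\w+)\}", pattern))

def same_wildcards(patterns):
    """True if every pattern uses exactly the same wildcard names."""
    names = [wildcard_names(p) for p in patterns]
    return all(n == names[0] for n in names)

# The two outputs from the rule above, with a hypothetical "out" directory.
outputs = [
    "out/{chr}_{method}_rslts.csv",
    "out/{chr}_summary.csv",
]
print(same_wildcards(outputs))
```

If the summary really is per chromosome, one option is to move it into its own rule whose output only contains chr.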

Upvotes: 1
