zzabaa

Reputation: 86

Snakemake input and output according to a dictionary

I am trying to rename some files in a Snakemake pipeline. Let's say I have three files, "FileA.txt", "FileB.txt", and "FileC.txt", and I want them renamed according to a dictionary dict = {"A": "0", "B": "1", "C": "2"} to get "RenamedFile0.txt", "RenamedFile1.txt", and "RenamedFile2.txt". How would one write a rule for this?

This is what my pipeline looks like (I tried using an input function, but it doesn't work):

SAMPLES = ["A", "B", "C"]
RENAMED_SAMPLES = ["0", "1", "2"]

rename = {"0": "A", "1": "B", "2": "C"}

def mapFile(wildcards):
    file = "results/EditedFile" + str(rename[wildcards]) + ".txt"
    return(file)

rule all:
    input:
        "results/Combined.txt"

rule cut:
    input:
        "data/File{sample}.txt"
    output:
        "results/EditedFile{sample}.txt"
    shell:
        "cut -f1 {input} > {output}"

rule rename:
    input:
        mapFile
    output:
        "results/RenamedFile{renamedSample}.txt"
    shell:
        "cp {input} {output}"


rule combine:
    input:
        expand("results/RenamedFile{renamedSample}.txt", renamedSample = RENAMED_SAMPLES)
    output:
        "results/Combined.txt"
    shell:
        "cat {input} > {output}"

I get the following error:

KeyError: ['2']
Wildcards:
renamedSample=2

Thanks!!!

Upvotes: 3

Views: 803

Answers (1)

SultanOrazbayev

Reputation: 16581

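Snakemake passes an input function a wildcards object, not the wildcard's value, so rename[wildcards] looks up the whole object and fails with the KeyError shown in the question. Access the wildcard by name instead: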

def mapFile(wildcards):
    # Look up the original sample name using the named wildcard, not the whole wildcards object
    file = "results/EditedFile" + rename[wildcards.renamedSample] + ".txt"
    return file
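To check the lookup in isolation, outside Snakemake, here is a minimal sketch. It assumes `types.SimpleNamespace` as a stand-in for the wildcards object that Snakemake would normally pass in; only the attribute access by name is being illustrated, not Snakemake's actual class.

```python
from types import SimpleNamespace

# Same mapping as in the question: renamed sample -> original sample
rename = {"0": "A", "1": "B", "2": "C"}

def mapFile(wildcards):
    # Access the wildcard by its name, as declared in the rule's output pattern
    return "results/EditedFile" + rename[wildcards.renamedSample] + ".txt"

# Stand-in for the wildcards object (an assumption for illustration only)
fake_wildcards = SimpleNamespace(renamedSample="2")
print(mapFile(fake_wildcards))
```

With renamedSample set to "2", the function resolves to the edited file for sample C, which is what the rename rule needs as its input.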

In this specific case, it's also possible to integrate the logic in the rule itself:

rule rename:
    input:
        lambda wildcards: f"results/EditedFile{rename[wildcards.renamedSample]}.txt"
    output:
        "results/RenamedFile{renamedSample}.txt"
    shell:
        "cp {input} {output}"
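If the mapping is kept in the direction described in the question (original name to new name), the reverse lookup table and both sample lists can be derived from it rather than maintained by hand. This is only a sketch, assuming the mapping is one-to-one; `forward` is a hypothetical name that does not appear in the original Snakefile.

```python
# Mapping as described in the question text: original sample -> new name
# ("forward" is a hypothetical name introduced for this sketch)
forward = {"A": "0", "B": "1", "C": "2"}

# Invert it to get the lookup the rename rule needs: new name -> original sample
rename = {new: old for old, new in forward.items()}

# Derive the sample lists from the single source of truth
SAMPLES = list(forward)          # original sample names
RENAMED_SAMPLES = list(rename)   # new names, usable in expand(...)
```

Placed at the top of the Snakefile, these definitions would replace the hand-written SAMPLES, RENAMED_SAMPLES, and rename, so adding a sample only requires one new dictionary entry.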

Upvotes: 4
