woostersauce

Reputation: 33

Snakemake expand using dictionary values

I have a dictionary with keys as patient IDs and a list of fastq files as values.

patient_samples = {
  "patientA": ["sample1", "sample2", "sample3"],
  "patientB": ["sample1", "sample4", "sample5", "sample6"]
}

I want to align each sample.fastq and output the aligned .bam file in a directory for each patient. The resulting directory structure I want is this:

├── patientA
│   ├── sample1.bam
│   ├── sample2.bam
│   ├── sample3.bam
├── patientB
│   ├── sample1.bam
│   ├── sample4.bam
│   ├── sample5.bam
│   ├── sample6.bam

Here I used lambda wildcards to get the samples for each patient using the "patient_samples" dictionary.

rule align:
    input:
        lambda wildcards: [
            "{0}.fastq".format(sample_id)
            for sample_id in patient_samples[wildcards.patient_id]
        ]
    output:
        "{patient_id}/{sample_id}.bam"
    shell:
        ### Alignment command

How can I write the rule all to reflect that only certain samples are aligned for each patient? I have tried referencing the dictionary key to specify the samples:

rule all:
    input:
        expand("{patient_id}/{sample_id}.bam", patient_id=patient_samples.keys(), sample_id=patient_samples[patient_id])

However, this raises NameError: name 'patient_id' is not defined.

Is there another way to do this?

Upvotes: 3

Views: 324

Answers (1)

SultanOrazbayev

Reputation: 16551

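The error occurs because patient_samples[patient_id] is ordinary Python that is evaluated before expand is called. No variable named patient_id exists at that point, so there is nothing to tell expand which patient's samples to list: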

expand(
   "{patient_id}/{sample_id}.bam",
   patient_id=patient_samples.keys(),
   sample_id=patient_samples[patient_id])
                             ^^^^^^^^^^ not defined at this point
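The same failure can be reproduced outside Snakemake. In this minimal sketch, fake_expand is a hypothetical stand-in for expand (not part of Snakemake); it is never actually called, because Python raises the NameError while evaluating the keyword arguments:

```python
patient_samples = {
    "patientA": ["sample1", "sample2", "sample3"],
    "patientB": ["sample1", "sample4", "sample5", "sample6"],
}


def fake_expand(pattern, **wildcards):
    # Hypothetical stand-in for expand; the call below never reaches it.
    return [pattern]


try:
    fake_expand(
        "{patient_id}/{sample_id}.bam",
        patient_id=patient_samples.keys(),
        # Arguments are evaluated before the call, and no Python variable
        # named patient_id exists here, so this line raises NameError.
        sample_id=patient_samples[patient_id],
    )
except NameError as err:
    error_message = str(err)

print(error_message)  # name 'patient_id' is not defined
```

The {patient_id} inside the pattern string is only a placeholder; it never becomes a Python variable that other arguments can refer to.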

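expand is convenient when you already have independent lists of wildcard values; by default it produces every combination of them. When the valid values of one wildcard depend on another, as they do here, it is simpler to build the list in plain Python: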

# Pair each patient only with its own samples.
list_inputs_all = [
   f"{patient_id}/{sample_id}.bam"
   for patient_id, sample_ids in patient_samples.items()
   for sample_id in sample_ids
]
   
rule all:
    input:
        list_inputs_all
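The target list can be checked in plain Python before running Snakemake. This is a sketch using the dictionary from the question; it pairs each patient with its own samples through a nested comprehension, and also builds two parallel lists (the names patient_ids and sample_ids are introduced here for illustration only):

```python
patient_samples = {
    "patientA": ["sample1", "sample2", "sample3"],
    "patientB": ["sample1", "sample4", "sample5", "sample6"],
}

# One path per (patient, sample) pair that actually exists in the dictionary.
list_inputs_all = [
    f"{patient_id}/{sample_id}.bam"
    for patient_id, samples in patient_samples.items()
    for sample_id in samples
]

# Parallel lists: position i of each list describes the same target file.
patient_ids = [p for p, samples in patient_samples.items() for _ in samples]
sample_ids = [s for samples in patient_samples.values() for s in samples]

for path in list_inputs_all:
    print(path)
```

If you would rather keep expand, the parallel lists should work with its zip mode, which pairs values by position instead of taking every combination: expand("{patient_id}/{sample_id}.bam", zip, patient_id=patient_ids, sample_id=sample_ids).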

Upvotes: 2
