Reputation: 23
Goal: Take input and process it through:
Input:
sample 1: sample1_R1_.fastq sample1_R2_.fastq
sample 2: sample2_R1_.fastq sample2_R2_.fastq
Processes: each process contains a:
publishDir "${params.outdir}/<name>", mode: "copy"
Where <name> is one of "sickle", "fastqc", or "multiqc".
Output I am getting now:
I have the following code:
workflow {
    SICKLE( reads )
    fastqc_ch = FASTQC( reads, threads )
    sickle_fastqc_ch = SICKLE_FASTQC( SICKLE.out.reads_trimmed, threads )
    fastqc_output = fastqc_ch.collect()
    sickle_fastqc_output = sickle_fastqc_ch.collect()
    combined_output = fastqc_output.merge(sickle_fastqc_output)
    MULTIQC( combined_output )
}
Upvotes: 2
Views: 630
Reputation: 54502
I think the trick is to combine the FastQC and Sickle log files prior to calling collect. You can use the mix operator for this, for example using Conda:
Contents of main.nf:
params.reads = '/path/to/fastqs/*_R{1,2}.fastq.gz'
params.multiqc_config = './assets/multiqc_config.yaml'
include { FASTQC as FASTQC_RAW } from './modules/fastqc'
include { FASTQC as FASTQC_TRIMMED } from './modules/fastqc'
include { SICKLE_PE } from './modules/sickle'
include { MULTIQC } from './modules/multiqc'
workflow {
    reads = Channel.fromFilePairs( params.reads )
    multiqc_config = file( params.multiqc_config )

    FASTQC_RAW( reads )
    SICKLE_PE( reads )
    FASTQC_TRIMMED( SICKLE_PE.out.trimmed )

    Channel.empty()
        .mix( FASTQC_RAW.out )
        .mix( SICKLE_PE.out.log )
        .mix( FASTQC_TRIMMED.out )
        .map { sample, files -> files }
        .collect()
        .set { log_files }

    MULTIQC( log_files, multiqc_config )
}
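To see what the mix, map, and collect chain hands to MULTIQC, here is a minimal sketch with toy channels. The script name (toy.nf), the channel names, and the file names are made up for illustration (plain strings rather than real paths), and it assumes the default behaviour of collect, which flattens nested lists:

```nextflow
// toy.nf: hypothetical stand-ins for the FASTQC_RAW and FASTQC_TRIMMED outputs
workflow {
    // each item mimics a process output tuple: [ sample, [ files ] ]
    raw     = Channel.of( ['sample1', ['sample1_R1_fastqc.zip', 'sample1_R2_fastqc.zip']] )
    trimmed = Channel.of( ['sample1', ['sample1_R1.trimmed_fastqc.zip', 'sample1_R2.trimmed_fastqc.zip']] )

    Channel.empty()
        .mix( raw )                       // interleave items from both channels
        .mix( trimmed )
        .map { sample, files -> files }   // drop the sample name, keep the files
        .collect()                        // gather everything into one list
        .view()
}
```

Running this with nextflow run toy.nf should print one flat list holding all four file names, which is the shape a single "path" input such as the one in MULTIQC expects. The order of the items may vary, since mix does not guarantee ordering.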
Contents of ./modules/fastqc/main.nf:
process FASTQC {
    tag { sample }

    input:
    tuple val(sample), path(reads)

    output:
    tuple val(sample), path("*_fastqc.{zip,html}")

    """
    fastqc -q ${reads}
    """
}
Contents of ./modules/sickle/main.nf:
process SICKLE_PE {
    tag { sample }

    input:
    tuple val(sample), path(reads, stageAs: 'reads/*')

    output:
    tuple val(sample), path("*.trimmed.fastq.gz"), emit: trimmed
    tuple val(sample), path("${sample}.singles.fastq.gz"), emit: singles
    tuple val(sample), path("${sample}.log"), emit: log

    script:
    def (fq1, fq2) = reads
    """
    sickle pe \\
        -t sanger \\
        -g \\
        -f "${fq1}" \\
        -r "${fq2}" \\
        -o "${sample}_R1.trimmed.fastq.gz" \\
        -p "${sample}_R2.trimmed.fastq.gz" \\
        -s "${sample}.singles.fastq.gz" \\
        1> "${sample}.log"
    """
}
Contents of ./modules/multiqc/main.nf:
process MULTIQC {
    input:
    path 'logs/*'
    path config

    output:
    path "multiqc_report.html", emit: html
    path "multiqc_data", emit: data

    """
    multiqc \\
        --config "${config}" \\
        .
    """
}
Contents of ./nextflow.config:
params {
    outdir = './results'
}
process {
    withName: FASTQC {
        publishDir = [
            path: "${params.outdir}/fastqc",
            mode: 'copy',
        ]
        cpus = 1
        conda = 'fastqc=0.12.1'
    }
    withName: SICKLE_PE {
        publishDir = [
            path: "${params.outdir}/sickle",
            mode: 'copy',
        ]
        cpus = 1
        conda = 'sickle-trim=1.33'
    }
    withName: MULTIQC {
        publishDir = [
            path: "${params.outdir}/multiqc",
            mode: 'copy',
        ]
        cpus = 1
        conda = 'multiqc=1.14'
    }
}
conda {
    enabled = true
}
Contents of ./assets/multiqc_config.yaml:
module_order:
  - fastqc:
      name: 'FastQC (raw)'
      anchor: 'fastqc-raw'
      target: 'FastQC'
      path_filters_exclude:
        - './logs/*.trimmed_fastqc.zip'
  - sickle
  - fastqc:
      name: 'FastQC (trimmed)'
      anchor: 'fastqc-trimmed'
      target: 'FastQC'
      path_filters:
        - './logs/*.trimmed_fastqc.zip'
run_modules:
  - fastqc
  - sickle
plots_force_interactive: True
show_analysis_time: False
show_analysis_paths: False
Results:
$ nextflow run main.nf -ansi-log false
N E X T F L O W ~ version 23.04.1
Launching `main.nf` [distraught_euler] DSL2 - revision: 971e2c9d1f
Creating env using conda: fastqc=0.12.1 [cache /path/to/work/conda/env-d3b12ea84164cc521e82b56dc7f119d9]
Creating env using conda: sickle-trim=1.33 [cache /path/to/work/conda/env-72d5fea3bee2c2c7bb1951c0356c97fa]
[d2/302df1] Submitted process > SICKLE_PE (sample2)
[11/13a1f3] Submitted process > SICKLE_PE (sample1)
[ce/f8d7b9] Submitted process > SICKLE_PE (sample3)
[6a/0588fc] Submitted process > FASTQC_RAW (sample3)
[3a/deabf3] Submitted process > FASTQC_RAW (sample1)
[95/e2ddb3] Submitted process > FASTQC_RAW (sample2)
[dd/39b166] Submitted process > FASTQC_TRIMMED (sample2)
[45/bdefdc] Submitted process > FASTQC_TRIMMED (sample3)
[21/c15ebb] Submitted process > FASTQC_TRIMMED (sample1)
Creating env using conda: multiqc=1.14 [cache /path/to/work/conda/env-39798d385be8fa0f1dce9354302302f0]
[4b/45310d] Submitted process > MULTIQC
Upvotes: 1