aerijman


Channel from process output in Nextflow

Can I generate a channel from a process output in Nextflow?

E.g.

process bam2fastq {
    conda 'samtools'

    input:
        path bam_file

    output:
        path 'fq1', emit: fastq1
        path 'fq2', emit: fastq2
        stdout // out[2]

    shell:
    '''
    samtools fastq !{bam_file} -1 fq1 -2 fq2
    echo !{bam_file} | awk -F"/" '{print $NF}' | sed 's/.bam//'
    '''    
} 

Then, in the workflow (DSL2):

files = Channel.fromPath( params.input_files )

reads = bam2fastq(files)
library = bam2fastq.out[2]


a = Channel.of(reads.fastq1, reads.fastq2, library)

Or ideally:

Channel.of(bam2fastq)

When I print the output variables from the process bam2fastq I get: DataflowBroadcast around DataflowStream[?].

Thank you


Answers (1)

Steve


The output block already lets you define the output channels of a process: each declared output is itself a channel, so there is no need to wrap it in Channel.of. I think a better approach here would be to pass in the library name using a tuple, rather than echoing it to stdout. You could also output a tuple so that you have a 'key' to more easily handle the outputs downstream. If the library name is just the basename of the BAM, you could use the fromFilePairs factory method (with size: 1) to extract it for you. For example:

params.input_files = './path/to/bams/*.bam'


process bam2fastq {

    tag { library }

    input:
    tuple val(library), path(bam_file)

    output:
    tuple val(library), path("${library}.{1,2}.fastq.gz")

    """
    samtools fastq \\
        -1 "${library}.1.fastq.gz" \\
        -2 "${library}.2.fastq.gz" \\
        -0 /dev/null \\
        -s /dev/null \\
        -n \\
        "${bam_file}"
    """
}

workflow {

    input_ch = Channel.fromFilePairs( params.input_files, size: 1 )

    bam2fastq( input_ch )

    bam2fastq.out.view()
} 

With:

$ find ./path/to/bams/
./path/to/bams/
./path/to/bams/C.markdup.sorted.bam
./path/to/bams/A.markdup.sorted.bam
./path/to/bams/B.markdup.sorted.bam

Results:

$ nextflow run -ansi-log false main.nf 
N E X T F L O W  ~  version 23.04.1
Launching `main.nf` [mad_torvalds] DSL2 - revision: d6f550efc0
[be/7e76ac] Submitted process > bam2fastq (B.markdup.sorted)
[24/54381e] Submitted process > bam2fastq (C.markdup.sorted)
[8d/ddddeb] Submitted process > bam2fastq (A.markdup.sorted)
[C.markdup.sorted, [/path/to/work/24/54381e1d52ab8066150bdb1d6fe734/C.markdup.sorted.1.fastq.gz, /path/to/work/24/54381e1d52ab8066150bdb1d6fe734/C.markdup.sorted.2.fastq.gz]]
[B.markdup.sorted, [/path/to/work/be/7e76ac22e05a6caa2410299918e7e6/B.markdup.sorted.1.fastq.gz, /path/to/work/be/7e76ac22e05a6caa2410299918e7e6/B.markdup.sorted.2.fastq.gz]]
[A.markdup.sorted, [/path/to/work/8d/ddddeb7c3a1501191c99e8fafd7232/A.markdup.sorted.1.fastq.gz, /path/to/work/8d/ddddeb7c3a1501191c99e8fafd7232/A.markdup.sorted.2.fastq.gz]]
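
Since bam2fastq.out is already a channel, operators can be applied to it directly. As a minimal sketch, assuming the bam2fastq process defined above and assuming a downstream step wants the two FASTQ files as separate tuple elements, the workflow block could unpack each item with map. The channel name flat_ch is hypothetical, and the explicit sort is only there because nothing above guarantees the order in which the matched files are listed:

```nextflow
workflow {

    input_ch = Channel.fromFilePairs( params.input_files, size: 1 )

    bam2fastq( input_ch )

    // Each item emitted by bam2fastq.out is [ library, [ fastq files ] ].
    // Unpack it into a flat tuple: [ library, read 1 file, read 2 file ].
    flat_ch = bam2fastq.out.map { library, reads ->
        // Sort by path so that "<library>.1.fastq.gz" comes before "<library>.2.fastq.gz"
        def sorted_reads = reads.sort()
        tuple( library, sorted_reads[0], sorted_reads[1] )
    }

    flat_ch.view()
}
```

The same idea applies to the process in the question: with named emits, bam2fastq.out.fastq1 and bam2fastq.out.fastq2 are channels as they stand, which is why printing them shows a DataflowBroadcast object rather than a file path.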
