Reputation: 1157
I have a tuple channel containing entries like:
SH7794_SA119138_S1_L001, [R1.fq.gz, R2.fq.gz]
And a csv split into 36 entries, each like:
[samplename:SH7794_SA119138_S1, mouseID:1-4, treat:vehicle, dose:NA, time:day18, tgroup:vehicle__day18, fastqsuffix:_L001_R1_001.fastq.gz, bamsuffix:_Filtered.bam, trim:fulgentTrim, species:human, host:mouse, outlier:NA, RIN:6.9]
I was able to combine the tuple channel with the csv entries using the each keyword. This results in a cross-product of all 36 csv rows for each tuple. I then added a when condition to do the filtering:
input:
    tuple sampleid, reads from fq
    each samplemeta from samplelist
...
when:
    sampleid.contains(samplemeta.samplename)
This works, but I'm curious whether it is an appropriate solution. What is the correct way to dynamically join channels using a regular expression, that is, by matching a value from one channel against multiple values from a second channel?
Upvotes: 3
Views: 1022
Reputation: 54502
I tend to avoid using the each qualifier like this because of this recommendation in the docs:
If you need to repeat the execution of a process over n-tuple of elements instead a simple values or files, create a channel combining the input values as needed to trigger the process execution multiple times. In this regard, see the combine, cross and phase operators.
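As a rough, untested sketch of what that recommendation could look like for the channels in the question: combine without a by parameter gives the same cross-product as each, and a filter step plays the role of the when block. The channel names (reads_ch, meta_ch) and the literal values are made-up stand-ins for the real inputs.

```groovy
// Hypothetical stand-ins for the two channels described in the question.
reads_ch = Channel.of(
    ['SH7794_SA119138_S1_L001', ['R1.fq.gz', 'R2.fq.gz']]
)
meta_ch = Channel.of(
    [samplename: 'SH7794_SA119138_S1', treat: 'vehicle', time: 'day18'],
    [samplename: 'SH7795_SA119139_S2', treat: 'drug', time: 'day18']
)

reads_ch
    // Cartesian product: one [sample_id, reads, meta] item per read pair and CSV row.
    .combine(meta_ch)
    // Keep only the pairings whose sample id contains the row's sample name.
    // This is the same test as the `when:` block; a regex match (==~) could be used instead.
    .filter { item ->
        def (sample_id, reads, meta) = item
        sample_id.contains(meta.samplename)
    }
    .view()
```

Doing the matching on the channel means the process only ever receives the pairings it should run on, rather than being evaluated for every combination and skipped by when.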
I don't think there's a way to join channels on a regex directly. What you can do is use the combine operator to produce the Cartesian product of the items emitted by two channels. If you also supply the by parameter, only the items that share a matching key are combined. For example (untested):
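// The glob is written so that the grouping key from fromFilePairs should
// drop the lane suffix and so match the samplename column
params.reads = '/path/to/fastq/*_{,L00?}_R{1,2}.fq.gz'

// Key each CSV row by its sample name: [ samplename, row ]
Channel
    .fromPath('sample_list.csv')
    .splitCsv(header: true)
    .map { row -> tuple( row.samplename, row ) }
    .set { sample_metadata }

// Pair the reads, then attach the metadata row that shares the same key
Channel
    .fromFilePairs( params.reads )
    .combine( sample_metadata, by: 0 )
    .set { test_inputs }

process test {

    input:
    tuple val(sample_id), path(reads), val(metadata) from test_inputs

    script:
    def (fq1, fq2) = reads
    """
    echo "sample_id: ${sample_id}"
    echo "reads: ${fq1}, ${fq2}"
    echo "metadata: ${metadata}"
    """
}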
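If the read ids cannot be shaped by the glob alone, one option is to derive the matching key explicitly with a regex in a map step and then combine on it. The sketch below is untested against a real pipeline and makes some assumptions: the channel names and literal values are hypothetical, and lane suffixes are assumed to always look like _L followed by three digits at the end of the id.

```groovy
// Hypothetical stand-ins for the two channels described in the question.
reads_ch = Channel.of(
    ['SH7794_SA119138_S1_L001', ['R1.fq.gz', 'R2.fq.gz']]
)
meta_ch = Channel.of(
    [samplename: 'SH7794_SA119138_S1', treat: 'vehicle', time: 'day18']
)

// Strip the assumed lane suffix so both channels share a key,
// but keep the original id alongside it: [ key, sample_id, reads ]
keyed_reads = reads_ch.map { sample_id, reads ->
    tuple( sample_id.replaceFirst(/_L\d{3}$/, ''), sample_id, reads )
}

// Key each metadata row by its sample name: [ key, row ]
keyed_meta = meta_ch.map { row -> tuple( row.samplename, row ) }

keyed_reads
    // Only items with an identical key are paired: [ key, sample_id, reads, row ]
    .combine( keyed_meta, by: 0 )
    // Drop the key once the match has been made.
    .map { key, sample_id, reads, row -> tuple( sample_id, reads, row ) }
    .view()
```

This keeps the regex in one place (the key derivation) and avoids building the full cross-product, since combine with by only pairs items whose keys are equal.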
Upvotes: 1