Michelle Lu

Reputation: 13

Concatenate multiple sets of 2 fastq files in BASH

I'm trying to merge multiple sets of 2 fastq files from the same sequencing library. I have a txt file listing all the sample names. The samples were sequenced paired-end, so each sample has both a _1.fastq.gz and a _2.fastq.gz file.

SRR_Acc_list.txt
SRR1
SRR2
SRR3
SRR4
...

The following commands show what I'm trying to achieve: combining SRR1 and SRR2, separately for read 1 and read 2, into one fastq file per read in the output folder combined_fastq.

cat SRA/SRR1_1.fastq.gz SRA/SRR2_1.fastq.gz > combined_fastq/SRR1_1.fastq.gz

cat SRA/SRR1_2.fastq.gz SRA/SRR2_2.fastq.gz > combined_fastq/SRR1_2.fastq.gz

I'm having trouble figuring out how to do this for the rest of the samples in a loop, i.e. combining SRR3 with SRR4, SRR5 with SRR6, and so forth.

Upvotes: 1

Views: 1244

Answers (1)

Mark Setchell

Reputation: 207365

Like most folk on StackOverflow, I have no idea about bioinformatics, fastq or "paired-ends", but I can reproduce the pattern you seem to want:

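# xargs -n2 regroups the list so that each output line holds two sample names
xargs -n2 < SRR_Acc_list.txt |
   while read -r a b ; do
      # c is the read number (1 or 2) of the paired-end files
      for c in 1 2 ; do
         echo "$a, $b, $c"
      done
   done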

Sample Output

SRR1, SRR2, 1
SRR1, SRR2, 2
SRR3, SRR4, 1
SRR3, SRR4, 2
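
To go from printing the pattern to actually merging the files, the echo can be swapped for the cat command from the question. What follows is only a minimal sketch, not a tested pipeline: the function name combine_pairs and its three arguments (list file, input folder, output folder) are made up for illustration, and it assumes the files follow the SRA/<name>_<read>.fastq.gz layout shown in the question. A leftover sample with no partner is skipped with a warning rather than copied, which is also an assumption about what you would want.

```shell
#!/usr/bin/env bash
# Hypothetical helper: concatenate consecutive pairs of samples from a list.
# Usage: combine_pairs LIST_FILE INPUT_DIR OUTPUT_DIR
combine_pairs() {
    local list=$1 indir=$2 outdir=$3
    mkdir -p "$outdir"

    # xargs -n2 prints two sample names per line: "SRR1 SRR2", "SRR3 SRR4", ...
    xargs -n2 < "$list" |
    while read -r a b; do
        # Ignore empty lines (e.g. when the list file itself is empty)
        [ -n "$a" ] || continue
        # An odd number of samples leaves the last one without a partner
        if [ -z "$b" ]; then
            echo "skipping unpaired sample: $a" >&2
            continue
        fi
        # c is the read number; the output keeps the first sample's name
        for c in 1 2; do
            cat "$indir/${a}_${c}.fastq.gz" "$indir/${b}_${c}.fastq.gz" \
                > "$outdir/${a}_${c}.fastq.gz"
        done
    done
}
```

With the layout from the question, this would be called as: combine_pairs SRR_Acc_list.txt SRA combined_fastq. Plain cat is enough here because gzip files joined end to end still form a valid gzip stream.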

Upvotes: 1
