Reputation: 13
I'm trying to merge multiple sets of two fastq files from the same sequencing library. I have a txt file listing all the sample names. The samples were sequenced paired-end, so each sample has both a _1.fastq.gz and a _2.fastq.gz file.
SRR_Acc_list.txt
SRR1
SRR2
SRR3
SRR4
...
The following commands show what I'm trying to achieve: combining SRR1 and SRR2, for both read 1 and read 2, into one fastq file per read in the output folder combined_fastq.
cat SRA/SRR1_1.fastq.gz SRA/SRR2_1.fastq.gz > combined_fastq/SRR1_1.fastq.gz
cat SRA/SRR1_2.fastq.gz SRA/SRR2_2.fastq.gz > combined_fastq/SRR1_2.fastq.gz
I'm having trouble figuring out how to do this for the rest of the samples in a loop, i.e. combining SRR3 and SRR4, SRR5 and SRR6, and so forth.
Upvotes: 1
Views: 1244
Reputation: 207365
Like most folk on StackOverflow, I have no idea about bioinformatics, fastq or "paired-ends", but I can reproduce the pattern you seem to want:
# xargs -n2 regroups the accession names two per line
xargs -n2 < SRR_Acc_list.txt |
while read -r a b ; do
    for c in 1 2 ; do    # read 1 and read 2
        echo "$a, $b, $c"
    done
done
Sample Output
SRR1, SRR2, 1
SRR1, SRR2, 2
SRR3, SRR4, 1
SRR3, SRR4, 2
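To go from printing the pattern to actually merging the files, the echo can be swapped for the cat command from the question. The following is only a minimal sketch using the same xargs -n2 pairing: the function name combine_pairs and its three arguments are my own invention for illustration, and it assumes the files are named <accession>_1.fastq.gz and <accession>_2.fastq.gz as in the question. Plain cat is enough here because concatenated gzip files form a valid multi-member gzip stream.

```shell
# Hypothetical helper (the name and arguments are assumptions, not from the question).
# Usage: combine_pairs LIST_FILE INPUT_DIR OUTPUT_DIR
combine_pairs() {
    list=$1
    indir=$2
    outdir=$3
    mkdir -p "$outdir"

    # xargs -n2 regroups the accession names two per line: "SRR1 SRR2", "SRR3 SRR4", ...
    xargs -n2 < "$list" |
    while read -r a b ; do
        # An odd number of accessions leaves the last one without a partner.
        if [ -z "$b" ]; then
            echo "No partner for $a, skipping" >&2
            continue
        fi
        for c in 1 2 ; do    # read 1 and read 2
            # The output is named after the first accession of the pair, as in the question.
            cat "$indir/${a}_${c}.fastq.gz" "$indir/${b}_${c}.fastq.gz" \
                > "$outdir/${a}_${c}.fastq.gz"
        done
    done
}
```

With the layout from the question it would be called as combine_pairs SRR_Acc_list.txt SRA combined_fastq. It may be worth leaving the echo in place for a first dry run to check that the pairs come out as you expect.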
Upvotes: 1