Reputation: 5759
I have a folder containing a number of files that need to be combined, and I'm looking for a good command to do it. The files look like this:
Concatenate Together:
A1_S1_L001_R1_001.fastq.gz
A1_S1_L002_R1_001.fastq.gz
A1_S1_L003_R1_001.fastq.gz
A1_S1_L004_R1_001.fastq.gz
Concatenate Together:
A1_S1_L001_R2_001.fastq.gz
A1_S1_L002_R2_001.fastq.gz
A1_S1_L003_R2_001.fastq.gz
A1_S1_L004_R2_001.fastq.gz
Concatenate Together:
B1_S1_L001_R1_001.fastq.gz
B1_S1_L002_R1_001.fastq.gz
B1_S1_L003_R1_001.fastq.gz
B1_S1_L004_R1_001.fastq.gz
Concatenate Together:
B1_S1_L001_R2_001.fastq.gz
B1_S1_L002_R2_001.fastq.gz
B1_S1_L003_R2_001.fastq.gz
B1_S1_L004_R2_001.fastq.gz
etc.
So the pattern, across hundreds of files, is that the leading letter and number (A1 or B1 here) identify a group, and each group contains two subgroups (R1 and R2). The main groups are A1-H1, A2-H2, and A3-H3. Within each subgroup there are four files (L001, L002, L003, L004).
Is there a good way to simply combine these with zcat (or really any other way)?
Upvotes: 0
Views: 189
Reputation: 23850
Something like this should do it:
    cd "/path/to/the/directory" || exit 1
    for num in {1..3}; do
        for letter in {A..H}; do
            for subgroup in R1 R2; do
                zcat "$letter$num"_S1_L*_"$subgroup"_001.fastq.gz > "$letter$num-$subgroup"
            done
        done
    done
You may have to adjust the name (and possibly path) of the output files. I used "$letter$num-$subgroup", so e.g. B1-R1.
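Note that zcat writes decompressed data, so the combined files above are plain FASTQ. Since the question allows any other way: gzip streams can be concatenated directly and the result is still a valid gzip file, so plain cat keeps the output compressed. A minimal sketch of that variant, assuming every group has an L001 file and that an output name like A1_S1_R1_001.fastq.gz is acceptable (merge_lanes is a hypothetical helper name, not an existing tool):

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge the lane files of each group/subgroup with cat,
# so the result stays gzip-compressed.
merge_lanes() {
    local dir=$1 first base sample rest
    # Use each L001 file to discover one group/subgroup combination.
    for first in "$dir"/*_L001_R[12]_001.fastq.gz; do
        [ -e "$first" ] || continue      # the glob matched nothing
        base=${first##*/}                # e.g. A1_S1_L001_R1_001.fastq.gz
        sample=${base%%_L001_*}          # e.g. A1_S1
        rest=${base#*_L001_}             # e.g. R1_001.fastq.gz
        # The glob expands in sorted order: L001, L002, L003, L004.
        # Assumed output name: the input name without the lane field.
        cat "$dir/$sample"_L[0-9][0-9][0-9]_"$rest" > "$dir/${sample}_${rest}"
    done
}
```

Call it as merge_lanes "/path/to/the/directory". The output name contains no lane field, so it is not picked up by either glob if the function is run again.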
Upvotes: 2