Reputation: 21
I just started learning to use the command line. Hopefully this is not a dumb question.
I have the following files in my directory:
L001_R1_001.fastq
L002_R2_001.fastq
L004_R1_001.fastq
L005_R2_001.fastq
L001_R2_001.fastq
L003_R1_001.fastq
L004_R2_001.fastq
L006_R1_001.fastq
L002_R1_001.fastq
L003_R2_001.fastq
L005_R1_001.fastq
L006_R2_001.fastq
As the filenames show, the listing is a mix of R1 and R2 files, and the numbers after L00 are not in order.
I want to concatenate files in the order of filename, separately for R1 and R2 files.
If I do it manually, it will look like the following:
# for R1 files
cat L001_R1_001.fastq L002_R1_001.fastq L003_R1_001.fastq L004_R1_001.fastq L005_R1_001.fastq L006_R1_001.fastq > R1.fastq
# for R2 files
cat L001_R2_001.fastq L002_R2_001.fastq L003_R2_001.fastq L004_R2_001.fastq L005_R2_001.fastq L006_R2_001.fastq > R2.fastq
Could you please help me write a script that I can re-use later? Thank you!
Upvotes: 2
Views: 8351
Reputation: 87271
cat `ls -- *_R1_*.fastq | sort` >R1.fastq
cat `ls -- *_R2_*.fastq | sort` >R2.fastq
The | sort is not needed on most systems because ls sorts the files by name.
If the file names contain spaces or tabs, set IFS to a newline first, so that the output of the command substitution is split only at line breaks:
IFS='
'
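As a rough illustration of the IFS approach above, here is a minimal sketch that runs in a scratch directory. The file names 'run A_R1_001.fastq' and 'run B_R1_001.fastq' and the variables demo_dir and old_ifs are made up for the demo. It saves IFS before changing it and restores it afterwards, so the rest of a script is not affected. Names that contain newlines would still break this.

```shell
# Work in a scratch directory so the demo does not touch real data.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Two hypothetical input files whose names contain a space.
printf 'first\n'  > 'run A_R1_001.fastq'
printf 'second\n' > 'run B_R1_001.fastq'

# Remember the current IFS, then split on newlines only.
old_ifs=$IFS
IFS='
'
cat `ls -- *_R1_*.fastq | sort` > R1.fastq

# Restore the previous IFS for whatever runs next.
IFS=$old_ifs
```

With the default IFS, the same cat command would try to open "run" and "A_R1_001.fastq" as separate files.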
Upvotes: 4
Reputation: 11703
Try using the wildcard character *. The shell expands it to the matching file names in alphabetical order.
cat L*_R1_001.fastq > R1.fastq
cat L*_R2_001.fastq > R2.fastq
EDIT:
If the above command doesn't give the desired sorting, try overriding the locale setting using LC_ALL=C, as suggested by Fredrik Pihl:
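# Set LC_ALL inside a subshell. A prefix such as "LC_ALL=C cat ..." only changes
# cat's environment; the shell has already expanded and sorted L* by then.
(LC_ALL=C; export LC_ALL; cat L*_R1_001.fastq) > R1.fastq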
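Since the question asks for something reusable, the wildcard approach could be wrapped in a small function along these lines. This is only a sketch: concat_reads is a hypothetical name, and it assumes the L*_R1_*.fastq / L*_R2_*.fastq naming from the question and that at least one file matches each pattern. The body runs in a subshell, so the cd and the locale override do not leak into the calling shell. The last few lines exercise it on throwaway files in a temporary directory.

```shell
#!/bin/sh
# concat_reads DIR  (hypothetical helper, not a standard command)
# Writes DIR/R1.fastq and DIR/R2.fastq from the lane files found in DIR.
concat_reads() (
    # Subshell body: changes below stay local to this function call.
    cd "$1" || exit 1
    # Byte-wise sorting, so the glob order does not depend on the user's locale.
    LC_ALL=C
    export LC_ALL
    for read_tag in R1 R2; do
        # The glob expands in sorted order: L001, L002, L003, ...
        cat -- L*_"${read_tag}"_*.fastq > "${read_tag}.fastq"
    done
)

# Demo on throwaway files, created in a deliberately shuffled order.
demo_dir=$(mktemp -d)
for lane in 003 001 002; do
    printf 'R1 lane %s\n' "$lane" > "$demo_dir/L${lane}_R1_001.fastq"
    printf 'R2 lane %s\n' "$lane" > "$demo_dir/L${lane}_R2_001.fastq"
done
concat_reads "$demo_dir"
cat "$demo_dir/R1.fastq"
```

For real data, call concat_reads with the directory that holds the .fastq files, for example concat_reads . for the current directory.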
Upvotes: 1