Reputation: 311
I have a directory with many fq.gz files. I want to loop over the filenames and concatenate any files that share the same partial ID. For example, out of the 1000 files in the directory, these files need to be concatenated into a single file (as they share the same ID from "L1" onwards):
141016-FC012-L1-N707-S504--123V_pre--Hs--R1.fq.gz
141031-FC01229-L1-N707-S504--123V_pre--Hs--R1.fq.gz
141020-FC01209-L1-N707-S504--123V_pre--Hs--R1.fq.gz
141027-FC013-L1-N707-S504--123V_pre--Hs--R1.fq.gz
141023-FC01219-L1-N707-S504--123V_pre--Hs--R1.fq.gz
Can anyone help?
Upvotes: 0
Views: 72
Reputation: 7519
Probably not the best way, but this might do what you need:
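# Loop over each distinct ID (fields 3 onward, split on "-"); the -z options need GNU cut and sort
while IFS= read -r -d '' id; do
    # Concatenate every file whose name ends in "-<ID>" into one output outside the current directory
    cat -- *-"$id" > "/some/location/${id%.fq.gz}_grouped.fq.gz"
done < <(printf '%s\0' *.fq.gz | cut -zd- -f3- | sort -uz)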
This will create files with the following format:
<ID>_grouped.fq.gz
L1-N707-S504--123V_pre--Hs--R1_grouped.fq.gz
...
...
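Before writing anything, it may help to check which files would end up in which group. The following is only a rough sketch of such a dry run, not part of the loop above: the function names group_id and list_groups are made up for illustration, and it assumes that every filename starts with exactly two dash-separated fields (date and flowcell) before the shared ID and that no filename contains a newline.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the shared ID of one filename by removing
# its first two dash-separated fields (assumed to be date and flowcell).
group_id() {
    local name=$1
    name=${name#*-}   # drop everything up to and including the first "-"
    name=${name#*-}   # drop everything up to and including the next "-"
    printf '%s\n' "$name"
}

# Hypothetical dry run: print each distinct ID in the current directory,
# preceded by the number of files that share it. Nothing is written.
list_groups() {
    local f
    for f in *.fq.gz; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
        group_id "$f"
    done | sort | uniq -c
}
```

Running list_groups inside the directory should show one line per ID, so a group with an unexpected count stands out before any output is created. Concatenating the .fq.gz files directly with cat is fine, since a gzip file may consist of several compressed members back to back and gunzip/zcat will decompress them as one stream.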
Upvotes: 1