Reputation: 447
I have a problem with my shell script. In my data analysis pipeline, I need to concatenate multiple gzipped files prior to downstream analysis. These gzipped files come in pairs, so I need to concatenate all the pair1 files together and all the pair2 files together. My script looks like this:
# Strip any literal double quotes from each path, then append the decompressed reads.
for f in "${pair1_fqs[@]}"; do
    zcat "${f//\"/}" >> "${sampleID}_cat1.fq"
done
for f in "${pair2_fqs[@]}"; do
    zcat "${f//\"/}" >> "${sampleID}_cat2.fq"
done
The problem is that zcat and cat return different results:
zcat myfile.gz | wc -l
75896232
cat myfile.gz | wc -l
82322094
Does anyone know what could be the reason for this discrepancy?
Upvotes: 1
Views: 6107
Reputation: 39507
zcat decompresses the file first, so the wc -l it is piped into counts the lines of the original text. cat passes the compressed bytes through unchanged, so wc -l only counts how many bytes in the compressed stream happen to equal the newline character. That is why you see different results. Try cat on the compressed file and you will see gibberish; try zcat on it and you will see your data.
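To see the difference on a small scale, here is a minimal sketch. It assumes bash and gzip are available; the file names (reads.txt, reads.gz, joined.gz) and the seq-generated data are made up for illustration and stand in for your FASTQ files. It uses gzip -dc, which does the same job as zcat on GNU systems. The last step relies on the documented gzip behaviour that concatenated .gz files decompress to the concatenation of their contents, which may be useful if you want the merged output to stay compressed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory (hypothetical file names throughout).
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Stand-in for a FASTQ file: 1000 lines of text, then a gzipped copy.
seq 1 1000 > "$tmpdir/reads.txt"
gzip -c "$tmpdir/reads.txt" > "$tmpdir/reads.gz"

# Lines in the decompressed text (what zcat | wc -l reports).
text_lines=$(gzip -dc "$tmpdir/reads.gz" | wc -l)

# Newline bytes in the compressed stream (what cat | wc -l reports).
# This number has no relation to the number of text lines.
raw_newlines=$(wc -l < "$tmpdir/reads.gz")

echo "decompressed lines:               $text_lines"
echo "newline bytes in compressed file: $raw_newlines"

# Concatenating the .gz files directly yields a valid multi-member gzip
# file; decompressing it gives both inputs back to back.
cat "$tmpdir/reads.gz" "$tmpdir/reads.gz" > "$tmpdir/joined.gz"
joined_lines=$(gzip -dc "$tmpdir/joined.gz" | wc -l)
echo "lines after decompressing joined.gz: $joined_lines"
```

So for counting reads, always count on the decompressed stream; the count from plain cat on a .gz file is meaningless.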
Upvotes: 1