Reputation: 41
I am still having issues with my fastq script. This time the issue is with how the script handles its input.
This is part of my script:
n=1
while read Sequence_Name && read Sequence && read Quality_name && read Quality_sequence
do
    if [[ ${#Sequence} != ${#Quality_sequence} ]] ; then
        echo "Length of Sequence $n different from Length of Quality Sequence"
        echo ${#Sequence} # line added to see the output
        echo $Sequence # line added to see the output
        echo ${#Quality_sequence} # line added to see the output
        echo $Quality_sequence # line added to see the output
    fi
    n=$((n+1))
done <$1
The issue is that the output does not match the input.
This is what is in the input file:
@SRR1350630.196.1 HWUSI-EAS753_0012:8:1:7018:1029 length=24
TGTAAACATCCTACACTCTCAGCT
+SRR1350630.196.1 HWUSI-EAS753_0012:8:1:7018:1029 length=24
`__^\\aa_Z_ccccacc[a\cYc
@SRR1350630.197.1 HWUSI-EAS753_0012:8:1:8338:1032 length=24
TTTGGCAATGGTAGAACTCACACC
+SRR1350630.197.1 HWUSI-EAS753_0012:8:1:8338:1032 length=24
acaa^acc^ac[aacY^\`_cccc
But this is the output the script gave me:
Length of Sequence 196 different from Length of Quality Sequence
24 # this is the echo ${#Sequence}
TGTAAACATCCTACACTCTCAGCT # this is the echo $Sequence
22 # this is the echo ${#Quality_sequence}
`__^\aa_Z_ccccacc[acYc #this is the echo $Quality_sequence
Length of Sequence 197 different from Length of Quality Sequence
24
TTTGGCAATGGTAGAACTCACACC
23
acaa^acc^ac[aacY^`_cccc
You may have noticed that the script removes every \ from the input, except that \\ is reduced to a single \. This shortens the quality sequence, so its length no longer matches the length of the sequence.
Thanks again for your help.
Upvotes: 1
Views: 40
Reputation: 80931
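You are missing the -r argument to read. You basically always want it. It prevents read from "interpreting" backslash escape sequences: without -r, a backslash is treated as an escape character and removed, so \\ becomes \ and \c becomes c.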
See http://mywiki.wooledge.org/BashFAQ/001 for a discussion of how to read files correctly by line or field (it also covers -r).
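As a rough sketch of what that change might look like, here is the loop from the question with -r added to each read. The function name check_fastq is made up for illustration, and the IFS= prefix is an extra precaution (it keeps leading and trailing whitespace) that the question did not ask about. The last few lines compare read with and without -r on one quality line taken from the question's input.

```shell
#!/usr/bin/env bash

# check_fastq is a hypothetical name for the loop from the question.
# It reads a FASTQ file (path in $1) four lines at a time and reports
# records whose sequence and quality strings differ in length.
check_fastq() {
    local n=1 Sequence_Name Sequence Quality_name Quality_sequence
    # -r keeps backslashes literal; IFS= preserves leading/trailing whitespace.
    while IFS= read -r Sequence_Name && IFS= read -r Sequence &&
          IFS= read -r Quality_name && IFS= read -r Quality_sequence
    do
        if [[ ${#Sequence} -ne ${#Quality_sequence} ]]; then
            echo "Length of Sequence $n different from Length of Quality Sequence"
        fi
        n=$((n + 1))
    done < "$1"
}

# Compare read with and without -r on one quality line from the question.
# Single quotes keep the backtick and backslashes literal.
quality='`__^\\aa_Z_ccccacc[a\cYc'
read plain <<< "$quality"          # backslashes are consumed as escapes
IFS= read -r raw <<< "$quality"    # backslashes are kept
echo "without -r: ${#plain}"       # prints "without -r: 22"
echo "with -r:    ${#raw}"         # prints "with -r:    24"
```

Called as check_fastq "$1" in place of the original loop, this should print nothing for the two records shown in the question, since both quality lines are 24 characters long once the backslashes are preserved.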
Upvotes: 1