Reputation: 1
My bash script is the following:
#!/bin/bash
if [ ! -f "$1" ]; then
exit
fi
while read line;do
str1="[GAC]*T"
num=$"(echo $line | tr -d -c 'T' | wc -m)"
for((i=0;i<$num;i++))do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1
The script works as intended when it prints to the terminal: it takes a filename as input and, for each line, prints the line up to the first letter T, then up to the next T, and so on.
Input:
GATTT
ATCGT
Output:
GAT
GATT
GATTT
AT
ATCGT
When I run the script with | tee outputfile, the output file is correct, but when I run it with > outputfile, the terminal hangs and the script never finishes. Moreover, it works with bash -x scriptname inputfile > outputfile but hangs with bash scriptname inputfile > outputfile.
Upvotes: 0
Views: 373
Reputation: 7627
I made some modifications to your original script. Please try:
if [ ! -f "$1" ]; then
exit
fi
while IFS='' read -r line || [[ -n "$line" ]];do
str1="[GAC]*T"
num=$(echo $line | tr -d -c 'T' | wc -m)
for((i=0;i<$num;i++));do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1"
For input:
GATTT
ATCGT
This script outputs:
GAT
GATT
GATTT
AT
ATCGT
Modifications made to your original script were:
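The line while read line;do
was changed to while IFS='' read -r line || [[ -n "$line" ]];do
so that leading and trailing whitespace and backslashes are preserved, and a last line without a trailing newline is still processed. Why I did this is explained here: Read a file line by line assigning the value to a variable
The line num=$"(echo $line | tr -d -c 'T' | wc -m)"
was changed to num=$(echo $line | tr -d -c 'T' | wc -m)
because $"..." is a locale-translated double-quoted string, not a command substitution, so the pipeline was never run.
The line for((i=0;i<$num;i++))do
was changed to for((i=0;i<$num;i++));do
The line done < "$1
was changed to done < "$1"
to close the unterminated quote.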
Now you can do: ./scriptname inputfile > outputfile
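To illustrate the change to the num= line, here is a minimal sketch showing the difference between $"..." and $(...) in bash. The variable names wrong and right are only for illustration and are not part of the script above; the exact padding of the wc -m output may differ between systems.

```bash
#!/bin/bash
line="GATTT"

# $"..." is a double-quoted string (subject to locale translation):
# $line is expanded, but the pipeline inside is never executed.
wrong=$"(echo $line | tr -d -c 'T' | wc -m)"
printf 'with $"...": %s\n' "$wrong"

# $(...) is command substitution: the pipeline runs and its output
# (the number of T characters in $line) is captured.
right=$(printf '%s' "$line" | tr -d -c 'T' | wc -m)
printf 'with $(...): %s\n' "$right"
```

With the first form, num holds text rather than a number, so the for loop condition i<$num cannot work as intended.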
Upvotes: 1
Reputation: 36250
Try:
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt > outputfile
instead of your script.
The pattern matches zero or more non-T characters followed by one or more Ts, and inserts a newline after each match.
cat gatc.txt
GATGATTGATTTATATCGT
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt
GAT
GATT
GATTT
AT
AT
CGT
With multi-line input, the substitution leaves an empty line after each input line; a second sed deletes them:
echo "GATTT
ATCGT" | sed -r 's/([^T]*T+)/\1\n/g;' | sed '/^$/d'
GATTT
AT
CGT
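As a possible alternative to the two sed calls, grep -o prints each match on its own line, so no empty lines are produced in the first place. This is only a sketch: split_on_t is a hypothetical helper name, and unlike the sed version it drops any trailing characters that are not followed by a T.

```bash
#!/bin/bash
# split_on_t is a hypothetical helper: it reads lines on stdin and prints
# every match of "non-Ts followed by one or more Ts" on a separate line.
split_on_t() {
    grep -oE '[^T]*T+'
}

# Same sample input as in the example above.
printf 'GATTT\nATCGT\n' | split_on_t
```

To process a file instead, redirect it into the function, for example split_on_t < gatc.txt > outputfile.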
Upvotes: 0