Reputation: 1418
I am trying to loop through my file and grab the lines in groups of 2. Every data entry in the file contains a header line and then the following line has the data.
The goal is to loop through the file, take the lines two at a time, and manipulate each pair. My current problem is that I can't echo the next line from inside the loop: every time I hit a header row, I want to print the data line (the next line) along with it.
out="$(cat $1)" #file
file=${out}
iter=0
for line in $file;
do
    if [ $((iter%2)) -eq 0 ];
    then
        #this will be true when it hits a header
        echo $line
        # I need to echo the next line here
    fi
    echo "space"
    iter=$((iter+1))
done
Here is an example of a possible input file:
>fc11ba964421kjniwefkniojhsdeddb4_runid=65bedc43sdfsdfsdfsd76b7303_read=42_ch=459_start_time=2017-11-01T21:10:05Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC
>fd38df1sd6sdf9867345uh43tr8199_runid=65be1fasdfsdfgdsfg4376b7303_read=60_ch=424_start_time=2017-11-01T21:10:06Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC
>1d03jknsdfnjhdsf78sd89ds89cc17d_runid=65bedsdfsdfsdf03_read=24_ch=439_start_time=201711-01T21:09:43Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC
Header lines start with >, and the data lines are the sequences of letters such as TGACATC.
For those asking about the output, based on the original question, I am trying to access the header and data together. Each header and matching data will be processed 6 times. The end goal is to have each header and data pair:
>fc11ba964421kjniwe (original header)
GATATCTAGCTACTACTAT (original data)
translate to:
>F1_fc11ba964421kjniwe
ASNASDKLNASDHGASKNHDLK
>F2_fc11ba964421kjniwe
ASHGASKNHDLKNASDKLNASD
>F3_fc11ba964421kjniwe
KNHDLKNASDKLNASDASHGAS
>R1_fc11ba964421kjniwe
ASHGLKNASDKLNASDASKNHD
>R2_fc11ba964421kjniwe
AKNASDKLNASDSHGASKNHDL
>R3_fc11ba964421kjniwe
SKNHDLKNASDKASHGALNASD
The next header and data entry would then generate another six header/data pairs.
Upvotes: 0
Views: 355
Reputation: 46856
Your for line in $file notation will not read the file line by line; in bash, the text after in is a series of values that is split on whitespace, not an input file. What you're probably looking for is a while read loop that takes the file as standard input. Something like this:
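# Assumes $file holds the path to the input file (for example, file=$1).
while IFS= read -r header; do
    # We should be starting with a header.
    # The ">" must be quoted, or [[ ]] parses it as a comparison operator.
    if [[ $header != ">"* ]]; then
        echo "ERROR: corrupt header: $header" >&2
        break
    fi
    # Read the next line, which should be the data for this header.
    if ! IFS= read -r data; then
        echo "ERROR: missing data for header: $header" >&2
        break
    fi
    printf '%s\n' "$data" >> data.out
done < "$file"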
I don't know what output you're looking for, so I just made something up. This loop enforces header position with the if statement, and prints data lines to an output file.
Of course, if you don't want this enforcement, you could simply run:
grep -v '^>' "$file"
to return lines which are not headers.
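To make the difference between the for loop and the while read loop concrete, here is a small sketch. The function names count_for_items and count_read_lines and the sample text are made up for illustration; they are not part of the question. An unquoted expansion after in is split on any whitespace, so a header that contains a space turns into two iterations, while read consumes exactly one line per call.

```shell
# Hypothetical helpers, for illustration only.

# Count the iterations of a for loop over an unquoted expansion.
# The shell splits the text on spaces, tabs and newlines (word splitting).
count_for_items() {
    n=0
    for item in $1; do      # intentionally unquoted to show the splitting
        n=$((n + 1))
    done
    echo "$n"
}

# Count the iterations of a while-read loop over the same text.
# Each call to read consumes one whole line.
count_read_lines() {
    printf '%s\n' "$1" | {
        n=0
        while IFS= read -r line; do
            n=$((n + 1))
        done
        echo "$n"
    }
}

# Made-up sample: a header containing a space, followed by one data line.
sample=$(printf '>id_1 extra\nACGT')
count_for_items "$sample"     # prints 3: ">id_1", "extra", "ACGT"
count_read_lines "$sample"    # prints 2: one per line
```

With headers that contain no whitespace the two counts happen to agree, which is why the for loop can appear to work on some inputs.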
Upvotes: 0
Reputation: 531205
If you know your records each consist of exactly 2 lines, use the read command twice on each iteration of the while loop.
while IFS= read -r line1; IFS= read -r line2; do
...
done < "$1"
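Building on the two-read loop, a minimal sketch of the header relabelling described in the question might look like the following. The function name label_frames is hypothetical, and the data line is passed through unchanged as a placeholder, because the question does not specify how each of the six versions of the data is computed.

```shell
# label_frames is a hypothetical helper: it takes the path to the input file
# as its first argument and writes the relabelled records to standard output.
label_frames() {
    local header data id frame
    # Read one header line and one data line per iteration.
    while IFS= read -r header && IFS= read -r data; do
        id=${header#">"}                 # strip the leading ">" from the header
        for frame in F1 F2 F3 R1 R2 R3; do
            # Placeholder: the data is printed as-is; the real per-frame
            # processing would replace "$data" here.
            printf '>%s_%s\n%s\n' "$frame" "$id" "$data"
        done
    done < "$1"
}

# Example call (the file names are only examples):
#   label_frames input.txt > labelled.txt
```

Joining the two read calls with && means the loop stops as soon as either read fails, so an unpaired trailing header is skipped rather than processed with empty data. A final line that lacks a trailing newline is skipped for the same reason.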
Upvotes: 1