Sam Munroe

Reputation: 1418

Access next item in for loop bash

I am trying to loop through my file and grab the lines in groups of 2. Every data entry in the file consists of a header line followed by a line containing the data.

I want to loop through the file, grab every two lines, and manipulate them. My current problem is echoing the next line inside the loop: every time I hit a header row, I want to print the data line (the next line) along with it.

out="$(cat $1)" #file
file=${out}

iter=0
for line in $file;
do
    if [ $((iter%2)) -eq 0 ];
    then
            #this will be true when it hits a header
            echo $line
            # I need to echo the next line here
    fi
    echo "space"
    iter=$((iter+1))

done

Here is an example of a possible input file:

>fc11ba964421kjniwefkniojhsdeddb4_runid=65bedc43sdfsdfsdfsd76b7303_read=42_ch=459_start_time=2017-11-01T21:10:05Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC
>fd38df1sd6sdf9867345uh43tr8199_runid=65be1fasdfsdfgdsfg4376b7303_read=60_ch=424_start_time=2017-11-01T21:10:06Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC
>1d03jknsdfnjhdsf78sd89ds89cc17d_runid=65bedsdfsdfsdf03_read=24_ch=439_start_time=201711-01T21:09:43Z
TGAGCTATTATTATCGGCGACTATCTATCTACGACGACTCTAGCTACGACTATCGACTCGACTACSAGCTACTACGTACCGATC

Header lines start with > and data lines are the ones containing sequence letters such as TGACATC.

EDIT:

For those asking about the output, based on the original question, I am trying to access the header and data together. Each header and matching data will be processed 6 times. The end goal is to have each header and data pair:

>fc11ba964421kjniwe (original header)
GATATCTAGCTACTACTAT (original data)

translate to:

>F1_fc11ba964421kjniwe
ASNASDKLNASDHGASKNHDLK
>F2_fc11ba964421kjniwe
ASHGASKNHDLKNASDKLNASD
>F3_fc11ba964421kjniwe
KNHDLKNASDKLNASDASHGAS
>R1_fc11ba964421kjniwe
ASHGLKNASDKLNASDASKNHD
>R2_fc11ba964421kjniwe
AKNASDKLNASDSHGASKNHDL
>R3_fc11ba964421kjniwe
SKNHDLKNASDKASHGALNASD

The next header and data entry would then generate another 6 header/data pairs.

Upvotes: 0

Views: 355

Answers (2)

ghoti

Reputation: 46856

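Your for line in $file notation does not read a file line by line; in bash, the text after in is a list of words, so an unquoted $file is split on whitespace rather than on line boundaries. What you're probably looking for is a while read loop that takes the file as standard input. Something like this, assuming $file holds the path to the input file (your $1) rather than its contents: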

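while IFS= read -r header; do

  # We should be starting with a header. The > is quoted because a bare >
  # inside [[ ]] is parsed as the string-comparison operator.
  if [[ $header != ">"* ]]; then
    echo "ERROR: corrupt header: $header" >&2
    break
  fi

  # Read the next line, which holds the data for this header.
  IFS= read -r data

  printf '%s\n' "$data" >> data.out

done < "$file"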

I don't know what output you're looking for, so I just made something up. This loop enforces header position with the if statement, and prints data lines to an output file.

Of course, if you don't want this enforcement, you could simply run:

grep -v '^>' "$file"

to return lines which are not headers.
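
To illustrate why a for loop over file contents is not the same as reading lines, here is a minimal sketch. The function names count_for_items and count_read_lines are made up for this example. The first function mimics the loop from the question, and the second uses while read, so the two counts differ whenever a line contains a space or tab.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original answer.

# Count the items a for loop sees when iterating over unquoted file contents.
count_for_items() {
    local contents item n=0
    contents=$(cat "$1")
    # Unquoted on purpose: the shell splits on spaces, tabs and newlines
    # (and would also expand glob characters such as *).
    for item in $contents; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Count the lines a while-read loop sees for the same file.
count_read_lines() {
    local line n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done < "$1"
    printf '%s\n' "$n"
}
```

Calling both functions on the same file shows whether the for loop would have broken any line into several pieces.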

Upvotes: 0

chepner

Reputation: 531205

If you know your records each consist of exactly 2 lines, use the read command twice on each iteration of the while loop.

while IFS= read -r line1; IFS= read -r line2; do
    ...
done < "$1"
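
As a rough sketch of how the two-read loop could feed the six-way output described in the question's edit: label_records is a hypothetical function name, and the data line is printed unchanged as a placeholder, because the actual per-frame translation is not shown in the question. This variant joins the two reads with && so that a trailing header without a data line ends the loop instead of being processed with empty data.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit six labelled copies of each header/data pair.
label_records() {
    local header data frame
    # Both reads must succeed, so an unpaired final line is skipped.
    while IFS= read -r header && IFS= read -r data; do
        for frame in F1 F2 F3 R1 R2 R3; do
            # Strip the leading > from the header, then re-add it before the label.
            # The data line is a placeholder for the real translation step.
            printf '>%s_%s\n%s\n' "$frame" "${header#\>}" "$data"
        done
    done < "$1"
}
```

It would be called as label_records "$1" from a script that receives the input file as its first argument.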

Upvotes: 1
