Reputation: 103
I am trying to extract the first complete number on each line from a text file like this:
8 gcaggcaaactgcgataataaaaggctgtttcaacagcggagtggattgt 1.5307684822361e-176
11 tttacccagtgagtttgaagcaaggatcttttagtttaccgaaaaatgag 3.22210306380202e-293
14 agcaatagcgcgaacagacaacctcatcagtctaccgcgcaccctttccc 1.32107737963584e-52
20 agtgacagggaaaggcgatcgcggctttacgatcagagatcggtgtcggt 0.942504155078175
30 tccggagactttcgattgcatgcaattcaccatcataccctcttgccctc 0
45 actgagcccctgacgctggccagtgtagcgctgtgaagtcccctctcagg 9.49147409471272e-307
53 gaaccgagcgatcgctgctgccattgtctcgccttctgccgaggaatgcc 2.15850303270505e-28
using the regex in the following code:
my $id = undef;
while (my $line = <INFILE>){
    chomp $line;
    if ($line =~ /\A([0-9]+)/){
        $id = $1;
    }
    print OUTFILE "$id\n";
    $line = <INFILE>;
    chomp $line;
}
The output I'm getting only includes every other line:
8
14
30
53
I've tried printing out every line without doing the match, and everything is there. Once I add the regex, it skips every other line. Any ideas why it's doing this?
Upvotes: 3
Views: 1023
Reputation: 21351
You are skipping file lines
while (my $line = <INFILE>) {   # first read: this line gets processed
    chomp $line;
    if ($line =~ /\A([0-9]+)/){
        $id = $1;
    }
    print OUTFILE "$id\n";
    $line = <INFILE>;           # second read: this line is consumed and discarded
}
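because you read from INFILE twice per iteration: once in the while condition and again at the bottom of the loop. The second read consumes the next line without ever processing it, so every other line is lost. You do not need the second $line = <INFILE>; in your code.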
Upvotes: 1
Reputation: 27589
You're reading from the INFILE handle twice: once in the while condition, and once at the end of the loop.
Remove the final read:
my $id = undef;
while (my $line = <INFILE>){
    chomp $line;
    if ($line =~ /\A([0-9]+)/){
        $id = $1;
    }
    print OUTFILE "$id\n";
}
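If you want to check the single-read loop in isolation, here is a minimal sketch. It makes a few assumptions that are not in your code: the helper name leading_ids is made up for illustration, the input comes from an in-memory filehandle instead of INFILE, and the sample rows are shortened stand-ins for the ones in the question. It also skips lines with no leading number rather than printing the previous $id again, which is a design choice and may not be what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the leading integer of every line that has one.
sub leading_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {    # the only read per iteration
        chomp $line;
        # Skip non-matching lines instead of reusing a stale value.
        next unless $line =~ /\A([0-9]+)/;
        push @ids, $1;
    }
    return @ids;
}

# Shortened stand-ins for the rows in the question (assumed sample data).
my $sample = "8 gcaggcaa 1.5e-176\n11 tttaccca 3.2e-293\n14 agcaatag 1.3e-52\n";

# Open a read-only filehandle on the string so no real file is needed.
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
my @ids = leading_ids($in);
close $in;

print "$_\n" for @ids;
```

With one read per iteration, every row is visited, so this prints 8, 11 and 14 on separate lines.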
Upvotes: 4