Michelangelo

Reputation: 27

Split file Perl

I want to split parts of a file. Here is what the start of the file looks like (it continues in the same way):

Location    Strand  Length    PID      Gene 
1..822        +      273    292571599  CDS001
906..1298     +      130    292571600   trxA

I want to split the Location column on "..", subtract the start from the end (822-1), do the same for every row, and add the results together. For these two rows the value would be (822-1) + (1298-906) = 1213. How can I do this? Here is my code right now (I don't get any output at all in the terminal; it just continues to process forever):

use warnings;
use strict;


my $infile = $ARGV[0];             # Reading infile argument
open my $IN, '<', $infile or die "Could not open $infile: $!, $?";

my $line2 = <$IN>;


my $coding = 0;                   # Initialize coding variable
while(my $line = $line2){          # reading the file line by line
    # TODO Use split and do the calculations
     my @row = split(/\.\./, $line);
     my @row2 = split(/\D/, $row[1]);

     $coding += $row2[0]- $row[0];

}

print "total amount of protein coding DNA: $coding\n";

So what I get from my code if I put:

print "$coding \n";

at the end of the while loop just to test is:

821 
1642

And so the first number is correct (822-1) but the next number doesn't make any sense to me, it should be (1298-906). What I want in the end outside the loop:

print "total amount of protein coding DNA: $coding\n";

is the sum of the subtractions for every line, i.e. 1213. But I don't get anything, just a terminal that runs forever.

Upvotes: 1

Views: 75

Answers (2)

Dave Cross

Reputation: 69314

Explicitly opening the file makes your code more complicated than it needs to be. Perl will automatically open any files passed on the command line and allow you to read from them using the empty file input operator, <>. So your code becomes as simple as this:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

my $total = 0;   # start at 0 so an input with no matching lines still prints a number

while (<>) {
  my ($min, $max) = /(\d+)\.\.(\d+)/;

  next unless defined $min and defined $max;   # skip lines without a start..end pair

  $total += $max - $min;
}

say $total;

If this code is in a file called adder and your input data is in add.dat, then you run it like this:

$ adder add.dat
1213

Update: And, to explain where you were going wrong...

You only ever read a single line from your file:

my $line2 = <$IN>;

And then you continually assign that same value to another variable:

while(my $line = $line2){          # reading the file line by line

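The comment on this line is wrong: nothing is read from the file here. Because $line2 never changes and always holds a true value, the loop never ends and processes the same first line on every pass, which is why your second test value is 1642 (821 + 821). I'm not sure where you got that line from.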

To fix your code, just remove the my $line2 = <$IN> line and replace your loop with:

while (my $line = <$IN>) {
  # your code here
}
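For completeness, here is a minimal sketch of what the whole script might look like with that change applied. It keeps the two split calls from the question; the sub name sum_coding and the check that skips the header line are my own additions (assumptions, not part of the original code), so adjust them to your real file.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): sums end - start for
# every line of the filehandle that has the form "start..end ...".
sub sum_coding {
    my ($fh) = @_;
    my $coding = 0;
    while (my $line = <$fh>) {                 # reads a NEW line on every pass
        my @row = split /\.\./, $line;         # "1..822  +  ..." -> ("1", "822  +  ...")
        # Skip the header and any other line without a numeric start
        next unless @row == 2 && $row[0] =~ /^\d+$/;
        my @row2 = split /\D/, $row[1];        # first field is the end position
        $coding += $row2[0] - $row[0];
    }
    return $coding;
}

# Only touch the file system when a file name was actually given
if (@ARGV) {
    my $infile = $ARGV[0];
    open my $IN, '<', $infile or die "Could not open $infile: $!";
    print "total amount of protein coding DNA: ", sum_coding($IN), "\n";
    close $IN;
}
```

With the two data rows from the question this should add 821 and 392. The loop now stops on its own, because <$fh> returns undef at end of file.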

Upvotes: 0

Shawn

Reputation: 52644

As a one-liner:

perl -nE '$c += $2 - $1 if /^(\d+)\.\.(\d+)/; END { say $c }' input.txt

(Extracting the important part of that and putting it into your actual script should be easy to figure out).
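In case it helps: -n wraps the code in a while (<>) loop and -E enables say, so the important part is just the regex and the addition. A rough sketch of that part as a sub follows; the name span_total and the @ARGV guard are assumptions made for illustration, not something the one-liner has.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: applies the one-liner's regex to each line passed in
# and returns the sum of (end - start).
sub span_total {
    my $c = 0;
    for (@_) {
        # Only lines that START with digits..digits contribute
        $c += $2 - $1 if /^(\d+)\.\.(\d+)/;
    }
    return $c;
}

# Read every line of the files named on the command line, if any were given
say span_total(<>) if @ARGV;
```

Because the pattern is anchored with ^, the header line is ignored without any extra check.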

Upvotes: 2
