charles hendry

Reputation: 1740

Extracting specific data from text file in Perl

I am new to Perl and am trying to extract specific data from a file, which looks like this:

 Print of   9 heaviest strained elements:    


   Element no   Max strain 
      20004         9.6 % 
      20013         0.5 % 
      11189         0.1 % 
      20207         0.1 % 
      11157         0.1 % 
      11183         0.0 % 
      10665         0.0 % 
      20182         0.0 % 
      11160         0.0 % 


 ==================================================

I would like to extract only the element numbers (20004, 20013, etc.) and write them to a new file. Reading should stop as soon as the separator line (=========) is reached, because more element numbers with the same heading appear later in the file. Any advice is much appreciated!

I now have this code, which gives me a list of the numbers, maximum 10 in a row:

my $StrainOut = "PFP_elem"."_$loadComb"."_"."$i";
open DATAOUT, ">$StrainOut" or die "can't open $StrainOut";  # Open the file for writing.

open my $in, '<', "$POSTout" or die "Unable to open file: $!\n";
my $count = 0;

 while(my $line = <$in>) {
  last if $line =~ / ={10}\s*/;
  if ($line =~ /% *$/) {
    my @columns = split "         ", $line;
    $count++;
    if($count % 10 == 0) {
      print DATAOUT "$columns[1]\n";
    }
    else {
      print DATAOUT "$columns[1] ";
    }      
  }
}
close (DATAOUT);
close $in;

What needs changing is the "my @columns = split..." line. At the moment it splits $line on every run of nine spaces. Because the number of digits in the element numbers can vary, this is a fragile way of extracting the data. Is it possible to read each line from left to right, skipping the leading spaces and capturing digits only until they are followed by more spaces, so that the percentage value is ignored?

Upvotes: 1

Views: 19630

Answers (5)

Birei

Reputation: 36282

A one-liner using flip-flop:

perl -ne '
  if ( m/\A\s*(?i)element\s+no/ .. ($end = /\A\s*=+\s*\Z/) ) {
    printf qq[$1\n] if m/\A\s*(\d+)/;
    exit 0 if $end
  }
' infile

Result:

20004
20013
11189
20207
11157
11183
10665
20182
11160
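
For readers unfamiliar with the range (flip-flop) operator, the same idea can be sketched as a small script. This is only an illustration, not part of the original one-liner: the subroutine name extract_elements and the inline sample data are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the element numbers from the first block that
# starts at an "Element no" header and ends at a line made only of "=" signs.
sub extract_elements {
    my ($fh) = @_;
    my @elements;
    while (my $line = <$fh>) {
        # In scalar context ".." is false until the left pattern matches,
        # then stays true up to and including the line matching the right one.
        if (my $state = ($line =~ /^\s*element\s+no/i) .. ($line =~ /^\s*=+\s*$/)) {
            push @elements, $1 if $line =~ /^\s*(\d+)/;
            last if $state =~ /E0$/;    # "E0" is appended on the range's last line
        }
    }
    return @elements;
}

# Small inline sample in the question's format (made up for this sketch).
my $sample = join "\n",
    ' Print of   2 heaviest strained elements:',
    '',
    '   Element no   Max strain ',
    '      20004         9.6 % ',
    '      20013         0.5 % ',
    '',
    ' ==================================================',
    '      99999         1.0 % ',
    '';
open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
print "$_\n" for extract_elements($in);    # prints 20004 and 20013 only
close $in;
```

The line after the separator (99999) is never read, because the loop exits on the line that closes the range.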

Upvotes: 1

eyevan

Reputation: 1473

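You could do it by running this one-liner in a command shell. Note that it prints the element number from every matching line in the file; it does not stop at the ===== line.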

On *nix:

cat in_file.txt | perl -ne 'print "$1\n" if ( m/\s*(\d+)\s*\d+\.\d+/ )' > out_file.txt

On Windows:

type in_file.txt | perl -ne "print qq{$1\n} if ( m/\s*(\d+)\s*\d+\.\d+/ )" > out_file.txt

Upvotes: 0

flesk

Reputation: 7589

#!/usr/bin/perl
use strict;
use warnings;

open my $rh, '<', 'input.txt' or die "Unable to open file: $!\n";
open my $wh, '>', 'output.txt' or die "Unable to open file: $!\n";

while (my $line = <$rh>) {        
    last if $line =~ /^ ={50}/;
    next unless $line =~ /^ {6}(\d+)/;
    print $wh "$1\n";
}

close $wh;

Upvotes: 0

Alien Life Form

Reputation: 1944

#!/usr/bin/perl
use strict;
use warnings;

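while (my $f = shift) {
    # Warn and move on to the next file if this one cannot be opened.
    open(my $fh, '<', $f) or (warn("While opening $f: $!\n"), next);
    my $foundstart = 0;
    while (<$fh>) {
        ($foundstart++, next) if /^\s*Element/;       # header line starts the block
        last if /^\s*=+\s*$/;                         # separator line ends it
        print "$1\n" if $foundstart && /^\s*(\d+)/;   # print the element number only
    }
    close($fh);
}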

Upvotes: 0

choroba

Reputation: 242383

#!/usr/bin/perl
use strict;
use warnings;

while (<>) {                        # read the file line by line
    if (/% *$/) {                   # if the line ends in a percent sign
        my @columns = split;        # create columns
        print $columns[0], "\n";    # print the first one
    }
    last if /={10}/;                # end of processing
}
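
The bare split above is what removes the dependence on the "9 spaces" from the question: it discards leading whitespace and splits on any run of whitespace. A minimal sketch of that behaviour follows; the helper name first_field and the second sample line are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first whitespace-separated field of a line.
sub first_field {
    my ($line) = @_;
    # split ' ' (the same mode a bare split uses on $_) discards leading
    # whitespace and splits on any run of whitespace characters.
    my @columns = split ' ', $line;
    return $columns[0];
}

# Lines with different indentation and different numbers of digits.
for my $line ("      20004         9.6 % \n", "   157   12.5 % \n") {
    print first_field($line), "\n";    # prints 20004, then 157
}
```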

Upvotes: 1
