diplodocuscoffeespot

Reputation: 99

How to use a keyword from the current line of text to search previously read lines in Perl?

I am new to Perl. I am parsing some text line by line in a while loop, and I could not find help on this particular problem. I would like to use information from previously read lines when processing the current line.

My code is as follows:

while(<data>){

    my $message = substr $_, 0, 1;

    if ($message eq 'A'){

        my $order_ref  = substr $_,  1, 9;
        my $order_book = substr $_, 20, 6;

        push @add_orders, $_;
        print add_order_file "$order_ref,$order_book\n";
    }
    if ($message eq 'X'){

        my $order_ref = substr $_, 1, 9;
        #now I would like to use order_ref to look up order_book from a previous line of text 
        # where the message is equal to A, 
        my $order_book = LOOKED UP VALUE FROM PREVIOUS TEXT;

        push @add_orders, $_;
        print add_order_file "$order_ref,$order_book\n";
    }
}

"A" messages always precede "X" messages, so I know for sure that if I see an X entry with an order_ref number, I can scroll back and find the associated A message, from which I can pull out the order_book variable. I realize this will involve regexps of some sort, but I have no idea how to make Perl search previous lines only. Thanks!

EDIT: I should be clearer on this. "A" messages precede "X" messages, but they can all have different order_refs, so the data looks like this:

A order_ref1,order_book1
A order_ref2,order_book2
A order_ref3,order_book1
X order_ref2 
X order_ref1

For the X orders I want to look up the order_book using order_ref2 and order_ref1.

Upvotes: 1

Views: 107

Answers (2)

TLP

Reputation: 67910

With your re-definition of your entire question, a new answer is required.

You need to store each order_book in a hash, keyed by its order_ref, for later lookup. The hash needs to be declared outside the while loop so that it keeps its contents from one line to the next.

Note that I have changed the numbers in your substr calls to match your sample input. If you share some information on how the input lines are constructed, there may be a better way to extract the different values. Using substr assumes fixed-width data.

use strict;
use warnings;

my %order_book;  # your lookup hash
my @add_orders;
while (<DATA>) {
    chomp;
    my $message = substr $_, 0, 1;

    if ($message eq 'A' or $message eq 'X') {

        my $order_ref = substr $_, 2, 10;
        if ($message eq 'A') {

            $order_book{$order_ref} = substr $_, 13;
        }
        push @add_orders, $_;
        print "$order_ref,$order_book{$order_ref}\n";
    }
}

__DATA__
A order_ref1,order_book1
A order_ref2,order_book2
A order_ref3,order_book1
X order_ref2 
X order_ref1
X order_ref3

Output:

order_ref1,order_book1
order_ref2,order_book2
order_ref3,order_book1
order_ref2,order_book2
order_ref1,order_book1
order_ref3,order_book1
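
If the fields turn out to be delimited rather than fixed width, a regex can replace the substr offsets. The following is only a sketch under that assumption: the helper name resolve_orders, the "UNKNOWN" placeholder, and the tolerance for spaces around the comma are my own choices, not something taken from your data format.

```perl
use strict;
use warnings;

# Hypothetical helper: parse lines such as "A order_ref1,order_book1"
# and "X order_ref1", returning one "ref,book" string per A or X line.
sub resolve_orders {
    my @lines = @_;
    my %order_book;    # order_ref => order_book, filled in by A lines
    my @resolved;

    for my $line (@lines) {
        # Capture the message type, the order_ref and an optional order_book.
        next unless $line =~ /^([AX])\s+([^,\s]+)\s*(?:,\s*(\S+))?\s*$/;
        my ($message, $order_ref, $book) = ($1, $2, $3);

        if ($message eq 'A') {
            next unless defined $book;    # malformed A line: skip it
            $order_book{$order_ref} = $book;
        }

        # An X line with no earlier A line gets a visible placeholder
        # (assumption: you would rather see this than an undef warning).
        my $found = exists $order_book{$order_ref}
                  ? $order_book{$order_ref}
                  : 'UNKNOWN';
        push @resolved, "$order_ref,$found";
    }
    return @resolved;
}

print "$_\n" for resolve_orders(
    'A order_ref1, order_book1',
    'A order_ref2,order_book2',
    'X order_ref2 ',
    'X order_ref9',
);
```

The exists check is the part that matters here: an X line whose order_ref was never added (order_ref9 above) prints order_ref9,UNKNOWN instead of triggering an "uninitialized value" warning.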

Upvotes: 5

simbabque

Reputation: 54373

TLP's answer is already correct. Here are some more suggestions for your code:

use strict; use warnings;
my @add_orders;
my $last_order_book;
while (my $line = <DATA>) {
  my $message = substr $line, 0, 1;

  if ( $message eq "A" ) {
    my $order_ref  = substr $line, 1,  9;
    my $order_book = $last_order_book = substr $line, 20, 6;

    push( @add_orders, $line );
    print "$order_ref,$order_book\n";
  }
  elsif ( $message eq "Q" ) {
    # Stuff happening ...
  }
  elsif ( $message eq "X" ) {
    my $order_ref = substr $line, 1, 9;

    my $order_book = $last_order_book;

    push( @add_orders, $line );
    print "$order_ref,$order_book\n";
  }
}

__DATA__
A123456789012345678901234567890
XLine XLine XLine XLine XABCDEF

I've changed a couple of things in the code.

First of all, to answer your question: you can add a variable that is scoped outside of the loop block to store your $order_book, so that its value survives from one iteration to the next. I named it $last_order_book. It remembers the value most recently seen on an "A" line, so it only gives the right result when each "X" line refers to the most recent "A" line; for lookups by order_ref, use a hash as in TLP's answer. Note that you can assign one value to multiple variables by chaining the assignments, like my $foo = my $bar = "baz".

Now to my suggestions:

  • Always use strict and use warnings. I don't know if you did, but I'll say it just in case.
  • You are using $_ a lot. I believe that if you have to refer to it explicitly very often, you should give it a name and use that instead. It will save you trouble understanding what is going on later.
  • Each line can only ever have one kind of $message, so it does not make sense to have multiple separate if {} constructs. Instead, use if {} elsif {} and order the branches by how often each kind of line occurs, most frequent first. That saves time because Perl stops evaluating the rest of the chain once one condition has matched. This is useful if you are dealing with a lot of data, but it does not hurt to always do it this way. To make this clearer, I added a $message eq "Q" case.
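
To show how these suggestions fit together with the hash lookup from TLP's answer, here is a minimal sketch. The helper name process_lines, the sample data, the "UNKNOWN" placeholder, and the guess that "A" lines are the most frequent are all assumptions of mine for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input in the question's "A ref,book" / "X ref" layout.
my @sample = (
    "A order_ref1,order_book1\n",
    "A order_ref2,order_book2\n",
    "X order_ref2\n",
    "X order_ref1\n",
);

# Hypothetical helper: named line variable, if/elsif chain, lookup hash.
sub process_lines {
    my @lines = @_;
    my %book_for;    # order_ref => order_book, declared outside the loop
    my @output;

    for my $line (@lines) {          # a named variable instead of $_
        $line =~ s/\s+\z//;          # drop the newline and trailing spaces
        my ($message, $rest) = split ' ', $line, 2;
        next unless defined $rest;   # nothing after the message type
        my ($order_ref, $order_book) = split /\s*,\s*/, $rest;

        # Assumption: "A" lines are the most common, so they are tested first.
        if ($message eq 'A' and defined $order_book) {
            $book_for{$order_ref} = $order_book;
            push @output, "$order_ref,$order_book";
        }
        elsif ($message eq 'X') {
            my $seen_book = exists $book_for{$order_ref}
                          ? $book_for{$order_ref}
                          : 'UNKNOWN';
            push @output, "$order_ref,$seen_book";
        }
        # Any other message type falls through and is ignored.
    }
    return @output;
}

print "$_\n" for process_lines(@sample);
```

Because the branches are mutually exclusive, a line that matches the "A" branch never reaches the "X" comparison at all.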

Upvotes: 0
