Ion

Reputation: 11

Using Regex/Grep to grab lines from an array using an array of patterns

I am new to Perl and, after searching the web for a solution without success, I have decided to post here.

What I am looking to do is loop through array1, which contains the required matches. (They will be different each time and could include lots of patterns, or rather strings that need to be matched; I am using this small example so I can understand the problem.) Each element is then used in a grep against array2, which contains some strings, and the lines that grep finds to match are printed out.

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw( strftime );
my (@regressions,@current_test_summary_file,@regression_links);

@regressions = ("test","table");
@current_test_summary_file = ("this is the line for test \n","this is the line for    table \n","this is the line for to\n");
foreach (@regressions)
{
print $_ . "\n";    
@regression_links = grep(/$_/, @current_test_summary_file);
}

foreach(@regression_links)
{
print $_ . "\n";
}

I would like to pick up only the first two elements, instead of all three as happens now.

Hopefully I've explained my problem properly. I've tried quite a few things (using qq, for example) but have only really used grep for this, and I'm unsure how I could approach it differently. If someone can point me in the right direction (and tell me whether I should be using grep at all for this problem), I would be very grateful. I just tried the code below instead, but it returns only the second element. Any ideas? (Sorry, I can't reply to your comment for some reason, but just so you know, Axeman, your second idea worked.)

foreach my $regression (@regressions)
{
print $regression . "\n";   
@regression_links = grep(/$regression/, @current_test_summary_file);
}

Upvotes: 1

Views: 6473

Answers (2)

Axeman

Reputation: 29854

Inside grep, $_ refers to the list element being tested. Also, /abc/ is short for $_ =~ /abc/, so you're effectively testing $_ =~ /$_/. Guess what the answer is likely to be (with no metacharacters)?

So you're passing all values into @regression_links.
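
To see why every line gets through, here is a minimal sketch of the same situation. The names @patterns, @lines and @matched are illustrative and not from the question:

```perl
use strict;
use warnings;

# Illustrative data, modelled on the question's arrays.
my @patterns = ("test", "table");
my @lines    = ("line for test", "line for table", "line for to");

my @matched;
foreach (@patterns) {
    # Inside grep, $_ is aliased to each element of @lines in turn,
    # hiding the foreach's $_. The test is therefore $_ =~ /$_/:
    # each line matched against itself, which succeeds for plain text.
    @matched = grep(/$_/, @lines);
}

print scalar(@matched), " of ", scalar(@lines), " lines matched\n";
```

With this data it reports that all 3 of 3 lines matched, regardless of what @patterns contains.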

What you need to do is save the value of $_. But since you're not relying on the bare print statement (which prints $_ by default), you could just as easily use a named loop variable and reserve $_ for the grep, like so:

foreach my $reg ( @regressions ) {
    print "$reg\n";
    @regression_links = grep(/$reg/, @current_test_summary_file );
}

However, you're resetting @regression_links with each loop, and a push would work better:

 push @regression_links, grep(/$reg/, @current_test_summary_file );
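
Putting the named loop variable and the push together, a self-contained sketch using the question's data might look like this. Printing the lines without an extra "\n" is my own choice here, since they already end in a newline:

```perl
use strict;
use warnings;

my @regressions = ("test", "table");
my @current_test_summary_file = (
    "this is the line for test \n",
    "this is the line for    table \n",
    "this is the line for to\n",
);

my @regression_links;
foreach my $reg (@regressions) {
    # $reg holds the pattern; $_ inside grep is the line being tested.
    # push appends, so matches from earlier patterns are kept.
    push @regression_links, grep { /$reg/ } @current_test_summary_file;
}

# The lines already end in "\n", so print them as they are.
print for @regression_links;
```

Only the "test" and "table" lines end up in @regression_links; the "to" line matches neither pattern.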

However, a for loop is a bad choice for this anyway, because a line that matches more than one pattern is pushed more than once, and you might not want duplicates. Since you're matching by regex, one alternative with multiple criteria is to build a regex alternation. Perl tries the alternatives from left to right, so to get a proper alternation we sort by string length, descending, and then alphabetically (cmp); that way a longer string is tried before any shorter string that is its prefix.

# create the alternation expression
my $filter 
     = join( '|'
     , sort { length( $b ) <=> length( $a ) 
            || $a cmp $b 
            } 
       @regressions 
     );
@regression_links = grep( /$filter/, @current_test_summary_file );
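
If the entries in @regressions are meant as literal strings rather than regexes (the question suggests they are), one possible variant escapes each entry with quotemeta and compiles the alternation once with qr//. This is a sketch under that assumption; the data, including the "a.b" entry, is illustrative:

```perl
use strict;
use warnings;

# Illustrative data; "a.b" contains a regex metacharacter on purpose.
my @regressions = ("test", "table", "a.b");
my @lines       = ("line for test", "line for table", "line for to", "axb", "a.b");

# Escape metacharacters so each entry is matched literally,
# longest first so a longer string is tried before its prefix.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b } @regressions;

# Compile once instead of re-interpolating the pattern for every line.
my $filter = qr/$alternation/;

my @regression_links = grep { $_ =~ $filter } @lines;
print "$_\n" for @regression_links;
```

Here "axb" is rejected because the dot in "a.b" is escaped; without quotemeta it would match as "any character".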

Or, rather than concatenating a regex, if you want to test the patterns separately, a better way is something like List::MoreUtils::any:

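use List::MoreUtils qw( any );

@regression_links
    = grep {
          my $line = $_;                       # save grep's $_ (the line)
          # Inside any, $_ is a pattern from @regressions. No "return" here:
          # a grep block is not a sub; its last expression is the result.
          any { $line =~ /$_/ } @regressions;
      }  @current_test_summary_file
    ;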

Upvotes: 4

CoffeeMonster

Reputation: 2190

Axeman is correct: replacing $_ with a named loop variable such as $reg will solve your problem. As for pulling out the matches, I would naively push all of them onto @regression_links, producing a list that (probably) contains duplicates. You can then use List::MoreUtils::uniq to trim down the list. If you don't have List::MoreUtils installed, you can just copy the function (it's 2 lines of code).

# Axeman's changes
foreach my $reg (@regressions) {
    print "regression: $reg\n";
    # Push all matches.
    push @regression_links,  grep(/$reg/, @current_test_summary_file);
}
# Trim down the list once matching is done.
use List::MoreUtils qw/uniq/;
foreach ( uniq(@regression_links) ) {
   print "$_\n";
}
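
If List::MoreUtils is not available, the de-duplication step can be sketched by hand with a %seen hash. The name my_uniq is hypothetical and this is not the module's actual source, just the common idiom; the input list is illustrative:

```perl
use strict;
use warnings;

# Hypothetical stand-in for List::MoreUtils::uniq: keeps the first
# occurrence of each value and preserves the original order.
sub my_uniq {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Illustrative input with a duplicate, as the push loop could produce
# when one line matches more than one pattern.
my @regression_links = ("line for test", "line for table", "line for test");
my @unique = my_uniq(@regression_links);
print "$_\n" for @unique;
```

The post-increment means $seen{$_} is false only the first time a value is seen, so later copies are filtered out.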

Upvotes: 0
