flashbang
flashbang

Reputation: 172

Searching Perl array with regex and return single capturing group only

I have a Perl script in which I perform web service calls in a loop. The server returns a multivalued HTTP header that I need to parse after each call with information that I will need to make the next call (if it doesn't return the header, I want to exit the loop).

I only care about one of the values in the header, and I need to get the information out of it with a regular expression. Let's say the header is like this, and I only care about the "foo" value:

X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

I can get the header values like this: @values = $response->header( 'X-Header' );. But how do I quickly check if

  1. There is a foo value, and
  2. Parse and save the foo value for the next iteration?

Ideally, I'd like to do something like this:

my $value = 'default';

do {
  # (do HTTP request; use $value)
  @values = $response->header( 'X-Header' );
} while( $value = first { /(?:test-)([^;]+)(?:; blah=foo)/ } @values );

But grep, first (from List::Util), etc. all return the entire match and not just the single capturing group I want. I want to avoid cluttering up my code by looping over the array and matching/parsing inside the loop body.

Is what I want possible? What would be the most compact way to write it? So far, all I can come up with is using lookarounds and \K to discard the stuff I don't care about, but this isn't super readable and makes the regex engine perform a lot of unnecessary steps.

Upvotes: 3

Views: 125

Answers (3)

zdim
zdim

Reputation: 66891

So it seems that you want to catch the first element with a certain pattern, but acquire only the pattern. And you want it done nicely. Indeed, first and grep only pass the element itself.

However, List::MoreUtils::first_result does support processing of its match

use List::MoreUtils 0.406 qw(first_result);

my @w = qw(a bit c dIT);  # get first "it" case-insensitive

my $res = first_result { ( /(it)/i )[0] } @w;

say $res // 'undef';  #--> it

That ( ... )[0] is needed to put the regex in the list context so that it returns the actual capture. Another way would be firstres { my ($r) = /(it)/i; $r }. Pick your choice


For the data in the question

use warnings;
use strict;
use feature 'say';

use List::MoreUtils 0.406 qw(firstres);

my @data = ( 
    'X-Header: test-abc12345; blah=foo',
    'X-Header: test-fgasjhgakg; blah=bar'
);

if (my $r = firstres { ( /test-([^;]+);\s+blah=foo/ )[0] } @data) {
    say $r
}

Prints abc12345, clarified in a comment to be the sought result.


Module versions prior to 0.406 (of 2015-03-03) didn't have firstres (alias first_result)

Upvotes: 3

ikegami
ikegami

Reputation: 385917

first { ... } @values returns one the values (or undef).

You could use either of these:

my ($value) = map { /...(...).../ } @values;

my $value = ( map { /...(...).../ } @values ) ? $1 : undef;

my $value = ( map { /...(...).../ } @values )[0];

Using first, it would look like the following, which is rather silly:

my $value = first { 1 } map { /...(...).../ } @values;

However, assuming the capture can't be an empty string or the string 0, List::MoreUtils's first_result could be used to avoid the unnecessary matches:

my $value = first_result { /...(...).../ ? $1 : undef } @values;

my $value = first_result { ( /...(...).../ )[0] } @values;

If the returned value can be false (e.g. an empty string or a 0) you can use something like

my $value = first_result { /...(...).../ ? \$1 : undef } @values;
$value = $$value if $value;

The first_result approach isn't necessarily faster in practice.

Upvotes: 2

Polar Bear
Polar Bear

Reputation: 6798

Following code snippet is looking for foo stored in a variable $find, the found values is stored in variable $found.


my $find = 'foo';
my $found;

while( $response->header( 'X-Header' ) ) {
    if( /X-Header: .*?blah=($find)/ ) {
        $found = $1;
        last;
    }
}

say $found if $found;

Sample demo code

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $find = 'foo';
my $found;
my @header = <DATA>;

chomp(@header);

for ( @header ) {
    $found = $1 if /X-Header: .*?blah=($find)/;
    last if $found;
}

say Dumper(\@header);
say "Found: $found" if $found;

__DATA__
X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

Output

$VAR1 = [
          'X-Header: test-abc12345; blah=foo',
          'X-Header: test-fgasjhgakg; blah=bar'
        ];

Found: foo

Upvotes: 1

Related Questions