F. Carbon

Reputation: 75

Is there a way to compile the return value of some function into a regex in a single statement?

I need to build a regex which matches any key in a given hash. This can easily be achieved via

my $string = join('|', keys %hash);
my $regex = qr/$string/;

where %hash is the hash in question. Out of curiosity: Is there a way to circumvent introducing the dummy variable $string and directly compile the return value of an arbitrary function into a regex (other than defining a custom sub or prototype)? Formulated differently: Does qr also have a functional form (like readpipe for qx)?

Thanks in advance!

Upvotes: 3

Views: 137

Answers (4)

zdim

Reputation: 66883

With some caveats, this can be done using the (??{ code }) extended pattern

use warnings;
use strict;
use feature 'say';

sub form_patt { return '[0-9]+' }

my $re = qr/ [a-z] (??{ form_patt() }) /x; 
#say $re;

my $v = q(z23 b71);
my @m = $v =~ /($re)/g; 
say "@m";

The construct allows any Perl code to run inside of a pattern, and the return value of that code

is treated as a pattern, compiled if it's a string (or used as-is if it's a qr// object), then matched as if it were inserted instead of this construct.

So we get to generate a (sub)pattern by running code inside the regex itself (needn't be in qr).

The downside is that the code may well be evaluated every time the pattern is used (even if the pattern is stored in a variable as a qr object). Also, this is a complex feature; please do see the documentation.


The regex in the question needs adjustments, as discussed: at the very least, escape the ASCII non-"word" characters with quotemeta. Most likely the pattern should be anchored as well.

If the keys come in a longer string (a broader requirement than the one in ysth's answer) then anchor the pattern with the word boundary \b, so that searching for key does not match keys:

'\b(?:' . join('|', map { quotemeta } keys %hash) . ')\b'

This is one possibility, depending on the exact requirements which haven't been given.
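As a rough sketch of how that word-boundary pattern behaves, assuming a small made-up %hash and sample text (both are illustrative only and not from the question):

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only
my %hash = (key => 1, lock => 2);

# Escape metacharacters in each key, join into an alternation,
# and wrap the group in \b so only whole words match
my $patt = '\b(?:' . join('|', map { quotemeta } sort keys %hash) . ')\b';
my $re   = qr/$patt/;

my $text  = 'key keys lock locksmith';
my @found = $text =~ /($re)/g;

print "@found\n";   # prints "key lock"
```

Note that \b only makes sense when the keys begin and end with word characters; keys with punctuation at the edges would need a different kind of anchoring.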

Upvotes: 2

ikegami

Reputation: 385764

If the keys are strings to match literally rather than regex patterns, you should be using the following to fix a code injection bug:

join '|', map quotemeta, keys %hash

On to your question, you can use either of the following:

my ($re) = map qr/$_/, EXPR;

my $re = ( map qr/$_/, EXPR )[0];

So, you'd use

my ($re) = map qr/$_/, join '|', map quotemeta, keys %hash;

my $re = ( map qr/$_/, join '|', map quotemeta, keys %hash )[0];
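A minimal sketch of the first form in use, assuming an illustrative %hash whose keys (made up here) include a regex metacharacter:

```perl
use strict;
use warnings;

# Hypothetical keys; 'a.b' contains a metacharacter on purpose
my %hash = ('a.b' => 1, 'cd' => 2);

# map yields a one-element list; the list assignment extracts that element
my ($re) = map qr/$_/, join '|', map quotemeta, sort keys %hash;

# The dot is matched literally thanks to quotemeta
print "literal match\n" if     'xa.by' =~ $re;
print "no match\n"      unless 'aXb'   =~ $re;
```

The pattern is left unanchored here, as in the statements above, so it matches any string that contains one of the keys.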

Upvotes: 2

brian d foy

Reputation: 132802

You probably don't want to form the pattern by yourself. Remember that a long list of alternations can make a very inefficient regex, especially if the keys have common prefixes.

Regexp::Assemble can take a list of patterns and make an efficient alternation.

use Regexp::Assemble;

my %hash = map { $_, 1 } qw(cat concat bird dog doge . [ | );

my $ra = Regexp::Assemble->new;
$ra->add( map { quotemeta } keys %hash );

print $ra;

And here's its regex:

(?^:(?:[.[]|(?:con)?cat|doge?|bird|\|))

Upvotes: 1

ysth

Reputation: 98398

You can interpolate arbitrary code into any double-quotish construct (qq, qr, qx). $foo interpolates a scalar variable. ${ code-returning-a-scalar-reference } interpolates the result of arbitrary code. So to avoid the dummy variable, you do:

my $regex = qr/${ \join('|', keys %hash) }/;

But as mentioned in a comment, if you want to match only strings equal to one of the keys, you need to add \A (or ^) and \z anchors. Otherwise you will match any string containing one of the keys.

If your keys may contain regex metacharacters, you also want to quote them, like:

my $regex = qr/\A(?:${ \join('|', map quotemeta, sort keys %hash) })\z/;
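To see the effect of the anchors and the quoting together, here is a small sketch; the %hash contents and the candidate strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical keys; 'a+b' contains a metacharacter on purpose
my %hash = (foo => 1, 'a+b' => 2);

# Same statement as above: the ${ \... } block interpolates the joined keys
my $regex = qr/\A(?:${ \join('|', map quotemeta, sort keys %hash) })\z/;

# Only strings equal to a key should match
for my $candidate ('foo', 'a+b', 'foobar', 'aab') {
    printf "%-6s %s\n", $candidate,
        $candidate =~ $regex ? 'matches' : 'does not match';
}
```

Here foo and a+b match, while foobar is rejected by the anchors and aab is rejected because the + is quoted.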

Upvotes: 5
