Reputation: 75
I need to build a regex which matches any key in a given hash. This can easily be achieved via
my $string = join('|', keys %hash);
my $regex = qr/$string/;
where %hash is the hash in question. Out of curiosity: Is there a way to circumvent introducing the dummy variable $string and directly compile the return value of an arbitrary function into a regex (other than defining a custom sub or prototype)? Formulated differently: Does qr also have a functional form (like readpipe for qx)?
Thanks in advance!
Upvotes: 3
Views: 137
Reputation: 66883
With some caveats, one can do that using the (??{ code }) extended pattern:
use warnings;
use strict;
use feature 'say';
sub form_patt { return '[0-9]+' }
my $re = qr/ [a-z] (??{ form_patt() }) /x;
#say $re;
my $v = q(z23 b71);
my @m = $v =~ /($re)/g;
say "@m";
The construct allows any Perl code to run inside of a pattern, and the return value of that code is treated as a pattern, compiled if it's a string (or used as-is if it's a qr// object), then matched as if it were inserted instead of this construct.
So we get to generate a (sub)pattern by running code inside the regex itself (it needn't be in qr).
The downside is that the code may well be evaluated every time the pattern is used (even if it is stored in a variable as a qr object). Also, this is a complex feature; please do see the documentation (perlre).
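To make the re-evaluation concern concrete, here is a minimal sketch that counts how often the embedded code runs. The `$calls` counter is an addition for illustration only and is not part of the original example; the exact count may depend on the input and on the Perl version.

```perl
use strict;
use warnings;

# Hypothetical counter, added only to observe how often the code block runs.
my $calls = 0;
sub form_patt { $calls++; return '[0-9]+' }

# Same pattern as in the example above.
my $re = qr/ [a-z] (??{ form_patt() }) /x;

my @m = 'z23 b71' =~ /($re)/g;

print "matches: @m\n";
print "form_patt() was called $calls time(s)\n";
```

The sub runs whenever the regex engine reaches the construct, not once at the time qr is compiled, so a pattern reused in a loop pays that cost repeatedly.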
That regex needs adjustments, as discussed; at the least, escape the ASCII non-"word" characters with quotemeta. Most likely the patterns should be anchored as well.
If the keys come in a longer string (a broader requirement than the one in ysth's answer), then anchor the pattern with the word boundary \b, so that searching for key does not match keys:
'\b(?:' . join('|', map { quotemeta } keys %hash) . ')\b'
This is one possibility, depending on the exact requirements which haven't been given.
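A minimal sketch of the word-boundary version, assuming a small sample hash (the keys and the test string are made up for illustration, since the question gives no data):

```perl
use strict;
use warnings;

# Hypothetical sample data; the real hash is not shown in the question.
my %hash = (key => 1, 'a.b' => 2);

# Escape each key so it matches literally, then wrap the alternation
# in word boundaries so that "key" is not found inside "keys".
my $alternation = join '|', map { quotemeta } sort keys %hash;
my $re = qr/\b(?:$alternation)\b/;

my @found = 'key keys a.b axb' =~ /($re)/g;
print "@found\n";
```

Here "keys" is skipped because there is no word boundary between "key" and the trailing "s", and "axb" is skipped because the dot in "a.b" was escaped by quotemeta.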
Upvotes: 2
Reputation: 385764
If the keys are strings to match literally rather than regex patterns, you should be using the following to fix a code injection bug:
join '|', map quotemeta, keys %hash
On to your question, you can use either of the following:
my ($re) = map qr/$_/, EXPR;
my $re = ( map qr/$_/, EXPR )[0];
So, you'd use
my ($re) = map qr/$_/, join '|', map quotemeta, keys %hash;
my $re = ( map qr/$_/, join '|', map quotemeta, keys %hash )[0];
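As a rough illustration of the first form, assuming a made-up hash (the keys and test strings below are not from the question):

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my %hash = (cat => 1, 'a+b' => 2);

# map over a one-element list: the joined, escaped alternation.
# The list assignment takes the single compiled regex it returns.
my ($re) = map qr/$_/, join '|', map quotemeta, sort keys %hash;

for my $s ('a+b', 'aab', 'concat') {
    printf "%-6s %s\n", $s, ($s =~ $re ? 'match' : 'no match');
}
```

Note that "concat" matches because this pattern is not anchored, while "aab" does not because quotemeta made the "+" literal.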
Upvotes: 2
Reputation: 132802
You probably don't want to build the pattern yourself. Remember that a long line of alternations can make a very inefficient regex, especially if the keys have common prefixes.
Regexp::Assemble can take a list of patterns and make an efficient alternation.
use Regexp::Assemble;
my %hash = map { $_, 1 } qw(cat concat bird dog doge . [ | );
my $ra = Regexp::Assemble->new;
$ra->add( map { quotemeta } keys %hash );
print $ra;
And here's its regex:
(?^:(?:[.[]|(?:con)?cat|doge?|bird|\|))
Upvotes: 1
Reputation: 98398
You can interpolate arbitrary code into any double-quotish construct (qq, qr, qx).
$foo interpolates a scalar variable. ${ code-returning-a-scalar-reference } interpolates arbitrary code. So to avoid the dummy variable, you do:
my $regex = qr/${ \join('|', keys %hash) }/;
But as mentioned in a comment, if you want to match only strings equal to one of the keys, you want to add \A (or ^) and \z anchors. Otherwise you will match any string containing one of the keys.
If your keys may contain regex metacharacters, you also want to quote them, like:
my $regex = qr/\A(?:${ \join('|', map quotemeta, sort keys %hash) })\z/;
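A short usage sketch of the anchored, quoted version, assuming a made-up hash and a few made-up candidate strings:

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my %hash = (cat => 1, 'a.b' => 2);

# Interpolate the result of join() through a scalar reference,
# so no intermediate variable is needed.
my $regex = qr/\A(?:${ \join('|', map quotemeta, sort keys %hash) })\z/;

for my $candidate ('cat', 'a.b', 'axb', 'concat') {
    printf "%-6s %s\n", $candidate, ($candidate =~ $regex ? 'is a key' : 'is not a key');
}
```

With the anchors in place, "concat" is rejected even though it contains "cat", and "axb" is rejected because the dot is matched literally.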
Upvotes: 5