Perl provides all sorts of built-in variables to get bits of a string which matched a regular expression (e.g. $MATCH, $&, or ${^MATCH} for the part of the string that matched the regex, $PREMATCH, $`, and ${^PREMATCH} for the part of the string before the part that matched, etc).
Is there any way to get the portion of the regular expression which actually was used to match $MATCH?
For example, say I have
my $string = "gC rL Ht Ns B lR cG sN tH";
my $re = qr/\b(a|b|c)\b/i;
$string =~ $re;
print "${^PREMATCH}\n";
print "$&\n";
print "${^POSTMATCH}\n";
The output will be
gC rL Ht Ns
B
lR cG sN tH
The part of the regex (/\b(a|b|c)\b/i) which matched the string was b, or perhaps more properly \bb\b, with the case-insensitive switch i. How can I get b (ideally) or \bb\b? I can't find any built-in variable which stores any part of the regex that matched, only parts of the string.
Thanks to the great hint in choroba's answer, it seems that using named capture groups and the %+ built-in variable will work:
$ perl -MData::Dumper -e '
"gC rL Ht Ns B lR cG sN tH" =~ /\b((?<a>a)|(?<b>b)|(?<c>c))\b/i;
print Dumper keys %+;'
$VAR1 = 'b';
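For reference, here is the same idea as a small script rather than a one-liner. This is only a sketch: the helper name matched_alternatives is made up for illustration, and it relies on %+ holding only the named groups that actually captured something. Note that the key is the group name (b), while the captured text itself is B.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the names of
# the named groups that took part in the match, or an empty list when
# the pattern does not match at all.
sub matched_alternatives {
    my ($string) = @_;
    return unless $string =~ /\b(?:(?<a>a)|(?<b>b)|(?<c>c))\b/i;
    # %+ only contains named groups with a defined value.
    my @names = sort keys %+;
    return @names;
}

print join(",", matched_alternatives("gC rL Ht Ns B lR cG sN tH")), "\n";
```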
Upvotes: 2
Views: 188
Reputation: 242443
In general this is not possible, because regular expressions can be arbitrarily complex. For example, the string bydgijjj matches (?:ax|by)[cd]*(ef|g[hi](?:j{2,}|klm)). What would you want it to return in that case? Several branches and quantifiers all took part in the match.
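To make the ambiguity concrete, here is a sketch that rewrites that pattern with a named group around each branch. The group names (ax, by, ef, g, j, klm) and the helper branches_taken are my own additions for illustration. It shows that more than one branch contributes to a single match, so there is no single "part of the regex" to report.

```perl
use strict;
use warnings;

# Same pattern as above, with each branch wrapped in a named group.
# The names are assumptions added purely for illustration.
my $re = qr/
    (?: (?<ax>ax) | (?<by>by) )
    [cd]*
    (?: (?<ef>ef) | (?<g>g[hi] (?: (?<j>j{2,}) | (?<klm>klm) ) ) )
/x;

# Hypothetical helper: names of all branches that took part in the match.
sub branches_taken {
    my ($string) = @_;
    return unless $string =~ $re;
    my @names = sort keys %+;
    return @names;
}

print join(" ", branches_taken("bydgijjj")), "\n";
```

For bydgijjj this reports three branches (by, g and j), each of which is a legitimate answer to "which part of the regex matched".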
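You have to construct the regular expression so that it tells you which alternative matched, for example by giving each alternative its own capture group:
"gC rL Ht Ns B lR cG sN tH" =~ /\b((a)|(b)|(c))\b/i;
# Only the group of the alternative that matched is defined (here $3);
# $2 and $4 are undef and will warn under "use warnings".
print "a:$2\nb:$3\nc:$4\n";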
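If the alternatives come from a list, the same pattern can be built programmatically. The following is a minimal sketch under that assumption. The helper name which_alternative is hypothetical, and each alternative is treated as a literal string (hence quotemeta). It returns the alternative as written in the pattern, not the matched text, so a case-insensitive match of B reports b.

```perl
use strict;
use warnings;

# Hypothetical helper: report which literal alternative from @alts matched.
sub which_alternative {
    my ($string, @alts) = @_;

    # Wrap every alternative in its own capture group: (a)|(b)|(c)
    my $body = join '|', map { '(' . quotemeta($_) . ')' } @alts;
    my $re   = qr/\b(?:$body)\b/i;

    # In list context a successful match returns ($1, $2, ...), with
    # undef for groups that did not take part; a failed match returns ().
    my @groups = $string =~ $re;
    return unless @groups;

    for my $i (0 .. $#groups) {
        return $alts[$i] if defined $groups[$i];
    }
    return;
}

print which_alternative("gC rL Ht Ns B lR cG sN tH", qw(a b c)), "\n";
```

Because the check uses defined, this version avoids the "uninitialized value" warnings that printing every group would trigger.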
Upvotes: 2