Eugene Barsky

Reputation: 5992

Perl 6 regex variable and capturing groups

When I make a regex variable with capturing groups, the whole match is OK, but capturing groups are Nil.

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $rgx / ;
say ~$/;  # 12abc34
say $0;   # Nil
say $1;   # Nil

If I modify the program to avoid $rgx, everything works as expected:

my $str = 'nn12abc34efg';

my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / ($atom) \w+ ($atom) /;
say ~$/;  # 12abc34
say $0;   # 「12」
say $1;   # 「34」

Upvotes: 4

Views: 519

Answers (2)

perlpilot

Reputation: 101

The key observation is that $str ~~ / $rgx /; is a "regex inside of a regex". $rgx matched as it should and set $0 and $1 within its own Match object, but there was nowhere within the surrounding Match object to store that information, so you couldn't see it. An example may make this clearer. Try this:

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $0=$rgx /;
say $/;

Note the contents of $0. Or as another example, let's give it a proper name:

my $str = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx = / ($atom) \w+ ($atom) /;

$str ~~ / $<bits-n-pieces>=$rgx /;
say $/;
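
Once the inner match has somewhere to live, its groups can be reached through that capture. A minimal sketch, assuming the nested Match object keeps its positional captures as described above (the variable $inner is only an illustrative name):

```raku
my $str  = 'nn12abc34efg';
my $atom = / \d ** 2 /;
my $rgx  = / ($atom) \w+ ($atom) /;

if $str ~~ / $<bits-n-pieces>=$rgx / {
    my $inner = $<bits-n-pieces>;   # Match object produced by $rgx
    say ~$inner;                    # whole text matched by $rgx
    say ~$inner[0];                 # first group of $rgx
    say ~$inner[1];                 # second group of $rgx
}
```

The same indexing should work with the positional form, i.e. $0[0] and $0[1] after matching / $0=$rgx /.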

Upvotes: 4

piojo

Reputation: 6723

With your code, the compiler gives the following warning:

Regex object coerced to string (please use .gist or .perl to do that)

That tells us something is wrong: regexes shouldn't be treated as strings. There are two more proper ways to nest regexes. First, you can include sub-regexes within assertions (<>):

my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx = / (<$atom>) \w+ (<$atom>) /;
$str ~~ $rgx;

Note that I'm not matching / $rgx /. That is putting one regex inside another. Just match $rgx.
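
For completeness, a short sketch of reading the groups after matching $rgx directly. It repeats the definitions above so it can be run on its own:

```raku
my $str = 'nn12abc34efg';
my Regex $atom = / \d ** 2 /;
my Regex $rgx  = / (<$atom>) \w+ (<$atom>) /;

if $str ~~ $rgx {
    say ~$/;   # 12abc34
    say ~$0;   # 12
    say ~$1;   # 34
}
```

Because $rgx is the top-level regex here, its groups are stored in $/ itself rather than in a nested Match object.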

The nicer way is to use named regexes. Defining atom and the regex as follows will let you access the match groups as $<atom>[0] and $<atom>[1]:

my $str = 'nn12abc34efg';
my regex atom { \d ** 2 };
my $rgx = / <atom> \w+ <atom> /;
$str ~~ $rgx;
say $<atom>[0];   # 「12」
say $<atom>[1];   # 「34」

Upvotes: 5
