Reputation: 29076
In this regular expression:
/\b([aeiouy])\w{2}\1\w+\b/g
the capture group is only used by the back reference \1.
Is it possible to declare a group that exists only for back references?
The only way I know to exclude ([aeiouy]) from the matches is (?:), but in that case I cannot use my back reference anymore.
For example in Perl:
#!/usr/bin/perl
use 5.010;
$_ = 'accordion accalmie diacritic ettercap';
say join ' ', /\b(([aeiouy])\w{2})\2(\w+)\b/g;
Where I want to display this:
acc lmie ett rcap
not this:
acc a lmie ett e rcap
Another solution would involve named groups, since %+ only holds the named captures:
#!/usr/bin/perl
use Data::Dumper;
$_ = 'accordion accalmie diacritic ettercap';
print Dumper \%+ while /\b(?<pre>([aeiouy])\w{2})\2(?<post>\w+)\b/g;
This prints:
$VAR1 = {
'post' => 'lmie',
'pre' => 'acc'
};
$VAR1 = {
'post' => 'rcap',
'pre' => 'ett'
};
EDIT
Another example that might be better than the above one is this case:
m/(?<=<(name)>)\w+(?=<\/\1>)/g
Where I want to match foo and bar in:
<item>
<name>foo</name>
<id>23</id>
</item>
<item>
<name>bar</name>
<id>42</id>
</item>
The group (name) lets me avoid repeating myself, and the lookarounds ensure that only foo and bar are matched. However, this solution is less clean than
m/(?<=<name>)\w+(?=<\/name>)/g
which does not return any irrelevant capture groups. In my original question, I am trying to find a way to back-reference a capture group without having it returned as a capture outside the regex.
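To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the sample above in list context. The variable names ($xml, @with_group, @without_group) are only for illustration. It relies on the documented behaviour that a list-context /g match returns the captures when the pattern contains capture groups, and the whole matches otherwise.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Sample input copied from the question.
my $xml = <<'XML';
<item>
<name>foo</name>
<id>23</id>
</item>
<item>
<name>bar</name>
<id>42</id>
</item>
XML

# The pattern contains a capture group, so list-context /g returns
# the captures (here: the tag name), not the matched text.
my @with_group = $xml =~ m/(?<=<(name)>)\w+(?=<\/\1>)/g;

# No capture group, so the same call returns the whole matches.
my @without_group = $xml =~ m/(?<=<name>)\w+(?=<\/name>)/g;

say "with group:    @with_group";     # name name
say "without group: @without_group";  # foo bar
```

So the back reference saves typing the tag name twice, but it is the capture, not foo and bar, that comes back from the match.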
Upvotes: 1
Views: 413
Reputation: 5395
This is not a direct answer to the question, but I think this kind of match could be achieved with a regex like:
(?=\b([aeiouy])\w{2}\1\w+\b)\w{3}|(?<=(?!\A)\G[aeiouy])\w+\b
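which should match acc and lmie as separate matches. The first alternative uses a lookahead to check the whole word but consumes only its first three characters; the second uses \G to resume right after the repeated vowel, where the previous match ended, and matches the rest of the word.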
Upvotes: 1
Reputation: 507
Strictly speaking, this is not an answer to your question but I cannot comment here on Stack Overflow yet.
Why not take $1 and $3 directly, avoiding what you don't want ($2)?
#!/usr/bin/perl
use 5.010;
$_ = 'accordion accalmie diacritic ettercap';
my @parts;
push @parts, $1, $3 while /\b(([aeiouy])\w{2})\2(\w+)\b/g;
say join ' ', @parts;
# prints "acc lmie ett rcap\n"
Upvotes: 1