nowox

Reputation: 29076

Using a named group only for a back reference

In this regular expression:

/\b([aeiouy])\w{2}\1\w+\b/g

The capture group is only used by the back reference \1.

Is it possible to declare a group that exists only for the back reference, without it being returned as a capture?

The only way I know to exclude ([aeiouy]) from the matches is to make it non-capturing with (?:), but then I can no longer use the back reference.

For example in Perl:

#!/usr/bin/perl
use 5.010;
$_ = 'accordion accalmie diacritic ettercap';
say join ' ', /\b(([aeiouy])\w{2})\2(\w+)\b/g;

Where I want to display this:

acc lmie ett rcap

not this:

acc a lmie ett e rcap

Another solution would involve named groups, since %+ only contains the named captures:

#!/usr/bin/perl
use Data::Dumper;
$_ = 'accordion accalmie diacritic ettercap';
print Dumper \%+ while /\b(?<pre>([aeiouy])\w{2})\2(?<post>\w+)\b/g;

$VAR1 = {
          'post' => 'lmie',
          'pre' => 'acc'
        };
$VAR1 = {
          'post' => 'rcap',
          'pre' => 'ett'
        };
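
Building on the named-group version above, a hash slice of %+ can produce the flat list asked for at the top of the question. This is only a sketch: the helper name pre_post is hypothetical, and the regex is the one from the question, unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper (the name is an assumption, not from the question):
# returns only the 'pre' and 'post' captures of every match.
sub pre_post {
    my ($text) = @_;
    my @parts;
    # @+{qw(pre post)} is a hash slice of %+, so the unnamed inner
    # group that feeds \2 is never added to the result list.
    push @parts, @+{qw(pre post)}
        while $text =~ /\b(?<pre>([aeiouy])\w{2})\2(?<post>\w+)\b/g;
    return @parts;
}

say join ' ', pre_post('accordion accalmie diacritic ettercap');
# prints "acc lmie ett rcap"
```

The inner group still captures; it is simply left out when the result is assembled, so this works around the problem rather than answering the original question.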

EDIT

Another example that might be better than the above one is this case:

m/(?<=<(name)>)\w+(?=<\/\1>)/g

where I want to match foo and bar in:

<item>
   <name>foo</name>
   <id>23</id>
</item>
<item>
   <name>bar</name>
   <id>42</id>
</item>

The group (name) lets me avoid repeating myself, and the lookarounds ensure that only foo and bar are matched. However, this solution is less clean than

m/(?<=<name>)\w+(?=<\/name>)/g

which does not return any irrelevant capture groups. In my original question I am trying to find a way to refer to a capture group inside the regex without it being exposed outside the regex.
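
For the lookaround example above, one possible workaround is to iterate in scalar context and collect $& (the whole match), so that the (name) group is used by \1 but never returned. This is a minimal sketch; the helper name name_values and the sample string are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper (the name is an assumption): collects the text
# between <name> and </name> without returning the (name) capture.
sub name_values {
    my ($xml) = @_;
    my @values;
    # In scalar context, /g advances one match per iteration;
    # $& holds the whole match, which here is only the \w+ part.
    push @values, $& while $xml =~ /(?<=<(name)>)\w+(?=<\/\1>)/g;
    return @values;
}

my $xml = '<item><name>foo</name><id>23</id></item>'
        . '<item><name>bar</name><id>42</id></item>';
say join ' ', name_values($xml);
# prints "foo bar"
```

The capture group still exists in $1 during each iteration; the loop just never reads it.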

Upvotes: 1

Views: 413

Answers (2)

m.cekiera

Reputation: 5395

This is not a direct answer to the question, but I think this kind of match could be achieved with a regex like:

(?=\b([aeiouy])\w{2}\1\w+\b)\w{3}|(?<=(?!\A)\G[aeiouy])\w+\b

which should match acc and lmie, as separate matches.

Upvotes: 1

polettix

Reputation: 507

Strictly speaking, this is not an answer to your question but I cannot comment here on Stack Overflow yet.

Why not take $1 and $3 directly, avoiding what you don't want ($2)?

#!/usr/bin/perl
use 5.010;
$_ = 'accordion accalmie diacritic ettercap';
my @parts;
push @parts, $1, $3 while /\b(([aeiouy])\w{2})\2(\w+)\b/g;
say join ' ', @parts;
# prints "acc lmie ett rcap\n"

Upvotes: 1
