Safwan Hak

Reputation: 45

Use named regex groups to shorten the pattern

I'm trying to use named groups in regex to reduce the size of the regex. However, even in its simplest form and after clearing all the 'noise', I cannot get it to work.

The example from the regex101 documentation works when I try it:

/(?<first>a+) and again (\k<first>)/

Here is my simplest example, which does not work:

(?<months>December)|^(next)\s(\k<months>)

Here is the input; "next" is not matched:

next December
December 2020  

I would expect the first row to match "next December", but instead only "December" matches in both rows.

You can find the regex here: https://regex101.com/r/aEgfe8/3/

The documentation only mentions that the named group has to be on the left.

What am I missing?

Upvotes: 1

Views: 1452

Answers (1)

Wiktor Stribiżew

Reputation: 627468

You should use (?&months) or (?P>months) instead of \k<months>:

(?<months>December)|^(next)\s((?&months))
                              ^^^^^^^^^^ 
(?<months>December)|^(next)\s((?P>months))
                              ^^^^^^^^^^^ 

See the regex demo

\k<months> is a named backreference: it only matches the text that the group has already captured. Since (?<months>December) is in the other branch of the alternation and did not take part in the match, the group has captured nothing and the backreference has no text to match.

To reuse part of a pattern elsewhere in the same regex, you need a named subroutine call; the syntax is (?&NAME).
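The difference can be checked with a small C# program. This is only a sketch: the class name BackreferenceDemo and the helper FirstMatch are made up for illustration, and it runs on the .NET engine rather than the PCRE flavor used on regex101, although both treat a backreference to a group that did not participate as a failed match. With the pattern from the question, the first match on "next December" should be just "December"; when the group and the backreference sit in the same branch, the backreference repeats the captured text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // Hypothetical helper: returns the text of the first match, or null if there is none.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.Multiline);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Pattern from the question: the backreference is in a different branch
        // than the group it refers to, so the group is unset when \k<months> runs.
        string crossBranch = @"(?<months>December)|^(next)\s(\k<months>)";
        Console.WriteLine(FirstMatch(crossBranch, "next December"));

        // Same branch: the group captures first, so \k<months> has text to repeat.
        string sameBranch = @"(?<months>December) and \k<months>";
        Console.WriteLine(FirstMatch(sameBranch, "December and December"));
    }
}
```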

See more details at Regular Expression Subroutines:

Perl uses (?1) to call a numbered group, (?+1) to call the next group, (?-1) to call the preceding group, and (?&name) to call a named group. You can use all of these to reference the same group. (?+1)(?'name'[abc])(?1)(?-1)(?&name) matches a string that is five letters long and consists only of the first three letters of the alphabet.

In C#, regex subroutines are not supported. Just repeat the subpattern:

var rep = "December";
var ThePattern = $@"(?<months>{rep})|^(next)\s({rep})";

The pattern will look like (?<months>December)|^(next)\s(December).
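As a rough, self-contained sketch of this workaround, the program below builds the pattern from one shared fragment and runs it over the two input lines from the question. The names RepeatSubpatternDemo, BuildPattern and AllMatches are hypothetical, and RegexOptions.Multiline is assumed so that ^ matches at the start of each line, as it does with the m flag on regex101.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RepeatSubpatternDemo
{
    // Hypothetical helper: interpolates the shared fragment into both places
    // where a subroutine call would otherwise be used.
    public static string BuildPattern(string months) =>
        $@"(?<months>{months})|^(next)\s({months})";

    // Hypothetical helper: collects the text of every match in the input.
    public static List<string> AllMatches(string pattern, string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Multiline))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string pattern = BuildPattern("December");
        Console.WriteLine(pattern);

        // Input from the question: two lines.
        foreach (string s in AllMatches(pattern, "next December\nDecember 2020"))
        {
            Console.WriteLine(s);
        }
    }
}
```

Because the fragment is interpolated inside parentheses in both places, it can itself contain an alternation such as December|January without changing how the rest of the pattern groups.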


Upvotes: 3
