Patrick Thorpe

Reputation: 370

Python regex error: "cannot refer to an open group"

I am writing rules for Reddit's AutoModerator. It reads its rules from a YAML config file, and the regexes are interpreted as Python regular expressions.

I am trying to make the following regular expression work:

(https?://[\\w\\d:#@%/;$()~_?+-=\\.&]+\\.\\w{2,6})([\\S\\s]*\\1)

When I test it on https://pythex.org/ it does exactly what I want.

Unfortunately, when I copy the same regex into the config file, the group reference at the end of the expression causes an error:

Generated an invalid regex for body (regex): cannot refer to an open group

I have also tried this version with every character escaped, just to make sure that none of them was interfering:

(https?://[\\w\\d\\:\\#\\@\\%\\/\\;\\$\\(\\)\\~\\_\\?\\+\\-\\=\\.&]+\\.\\w{2,6})([\\S\\s]*\\1)

But I still get the same error. Does anyone know what I'm doing wrong here?

Upvotes: 3

Views: 2805

Answers (1)

Patrick Thorpe

Reputation: 370

I managed to fix the problem by changing the group reference from \1 to \2.

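It turned out that either YAML or AutoModerator was automatically wrapping the whole expression in parentheses. The wrapper becomes group 1, and it is still open at the point where \1 appears inside it, which is exactly what the error message complains about. Every group reference inside the expression therefore has to be one higher than you would otherwise expect.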
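The effect can be reproduced in plain Python. This is only a sketch under a few assumptions: the pattern is written with single backslashes, as it would be typed into a tester such as pythex, and `wrap` is a hypothetical stand-in for the wrapping step, not AutoModerator's actual code.

```python
import re

# The question's pattern, written with single backslashes (assumption:
# this is the form that was tested on pythex).
PATTERN = r"(https?://[\w\d:#@%/;$()~_?+-=\.&]+\.\w{2,6})([\S\s]*\1)"


def wrap(pattern):
    """Hypothetical stand-in for the wrapping step; not AutoModerator's real code."""
    return "(" + pattern + ")"


def compile_error(pattern):
    """Return the re.error message for a pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# On its own the pattern compiles: group 1 is already closed when \1 is reached.
standalone_error = compile_error(PATTERN)

# Once wrapped, the outer parentheses become group 1, and \1 now points at a
# group that has not been closed yet.
wrapped_error = compile_error(wrap(PATTERN))

# Shifting the reference by one makes it point at the URL group again.
shifted = wrap(PATTERN.replace(r"\1", r"\2"))
shifted_error = compile_error(shifted)

match = re.search(shifted, "first https://example.com then https://example.com again")

print(standalone_error)
print(wrapped_error)
print(shifted_error)
print(match.group(2))
```

In the wrapped pattern the URL capture is group 2 rather than group 1, which is why the reference has to be shifted.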

I had suspected this at the start and tried the fix described above, but because of a separate issue with the AutoModerator code, it did not appear to work at the time. All resolved now though; thanks for your patience and help.

Upvotes: 6
