Reputation: 2517
I am trying to create a regex (Perl-compatible, but not Perl itself) that matches the following criteria:
The regex I have come up with so far is:
^(.(?!\b(?:r)\d*\b))*$
Below is a table of example input strings with the desired and actual results. Some work and some fail.
+-------------------------------+---------------+--------------+
| Input string | Desired Match | Actual Match |
+-------------------------------+---------------+--------------+
| Some text | yes | yes |
| Some textr1 | yes | yes |
| Some text default(r3) | yes | NO |
| Some text default(abc r3) | yes | NO |
| Some text default(r3 xyz) | yes | NO |
| Some text default(abc r3 xyz) | yes | NO |
| Some text r12 default(r3) | no | no |
| Some text r1 | no | no |
| Some r1 text | no | no |
| \sR12 Some text | no | no |
| Some text r1 somethingElse | no | no |
| R1 | no | YES |
| \s\sR2 | no | no |
| R3\s\s | no | YES |
| \tr4 | no | no |
| \t\sR5\t | no | no |
+-------------------------------+---------------+--------------+
Can anyone provide a working regex?
Mike V.
Upvotes: 2
Views: 145
Reputation: 89639
You can use this pattern:
(?i)^(?>[^r(]++|(?<!\\[ts])\Br|r(?![0-9])|(\((?>[^()]++|(?1))*\))|\()++$
Pattern details:
(?i) # modifier: case insensitive
^ # anchor: beginning of the string
(?> # open an atomic group
[^r(]++ # all characters except r and opening parenthesis
| # OR
(?<!\\[ts])\Br # r without word boundary and not preceded by \t or \s
| # OR
r(?![0-9]) # r (with word boundary or preceded by \t or \s) not followed by a digit
| # OR
( # (nested or non-nested parentheses): open the capture group n°1
\( # literal: (
(?> # open an atomic group
[^()]++ # all characters except parentheses
| # OR
(?1) # (recursion): repeat the subpattern of the capture group n°1
)* # repeat the atomic group (the last) zero or more times
\) # literal: )
) # close the first capturing group
| # OR
\( # for possible isolated opening parenthesis
)++ # repeat the first atomic group one or more times
$ # anchor: end of the string
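If you want to check the idea quickly outside a PCRE engine, here is a rough sketch in Python. Python's built-in re module has no (?1) recursion, so this is NOT the pattern above: it is a simplified variant that only handles one level of balanced parentheses and drops the possessive quantifiers and atomic groups. It also drops the (?<!\\[ts]) lookbehind, on the assumption that \t and \s in the table stand for real whitespace. The names SIMPLIFIED, CASES and allowed are made up for the illustration.

```python
import re

# Simplified, non-recursive variant (an assumption, not the PCRE pattern above):
# it handles a single level of balanced parentheses only.
SIMPLIFIED = re.compile(
    r"""
    ^(?:
        [^r(]            # any character except r/R and an opening parenthesis
      | \Br              # r preceded by a word character, as in "textr1"
      | r(?![0-9])       # r not followed by a digit
      | \([^()]*\)       # one balanced, non-nested parenthesised group
      | \(               # an isolated opening parenthesis
    )+$
    """,
    re.IGNORECASE | re.VERBOSE,
)

# A subset of the question's table, with \t and \s written as real whitespace.
CASES = {
    "Some text": True,
    "Some textr1": True,
    "Some text default(r3)": True,
    "Some text default(abc r3 xyz)": True,
    "Some text r12 default(r3)": False,
    "Some r1 text": False,
    "R1": False,
    "R3  ": False,
    "\tr4": False,
}


def allowed(text: str) -> bool:
    """Return True when the simplified pattern accepts the whole string."""
    return SIMPLIFIED.match(text) is not None


for text, desired in CASES.items():
    print(f"{text!r:35} desired={desired} actual={allowed(text)}")
```

The sketch diverges from the recursive pattern on nested parentheses: an input such as "a ((x) r3)" is accepted by the recursive version but rejected here, because the outer parenthesis is never paired with its closing one.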
Note: if \t and \s in your post are not literals, you can remove (?<!\\[ts]).
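As for why the pattern in the question accepts R1 and R3 followed by spaces: the negative lookahead is only tested after each character has been consumed, so it is never tested at the very start of the string. The sketch below shows this with Python's re module, which supports this particular pattern. Case-insensitive matching is an assumption inferred from the table, and REORDERED is only a hypothetical variant that moves the lookahead in front of the dot. It fixes the leading case but still rejects an r-number inside parentheses, which is why a different approach is needed.

```python
import re

# The pattern from the question; IGNORECASE is an assumption inferred from the table.
ORIGINAL = re.compile(r"^(.(?!\b(?:r)\d*\b))*$", re.IGNORECASE)

# Hypothetical reordering: test the lookahead BEFORE consuming each character,
# so that position 0 is checked as well.
REORDERED = re.compile(r"^(?:(?!\br\d*\b).)*$", re.IGNORECASE)

for text in ["R1", "R3  ", "Some r1 text", "Some text default(r3)"]:
    original = ORIGINAL.match(text) is not None
    reordered = REORDERED.match(text) is not None
    print(f"{text!r:28} original={original} reordered={reordered}")
```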
Upvotes: 4