Reputation: 2049
I'm trying to write a semi-advanced RegExp pattern to parse out some "macros" in some text. The pattern uses Named Groups and Conditional Statements.
A basic example of using both of them together would be something like:
(?<test>a)?b(?(test)c|d)
The first part (before the b) matches the letter a, assigning it to the named-group test if it is successfully matched.
The second part (after the b) is the conditional statement, which basically reads: If test was matched, then look for c; otherwise, look for d.
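For anyone who wants to try the basic conditional outside Regex101, here is a minimal sketch using Python's re module. It assumes Python's spelling of named groups, (?P<test>...), because re does not accept (?<test>...). The helper name matches_basic and the sample strings are made up for illustration.

```python
import re

# Same pattern as above, written with Python's (?P<name>...) group syntax.
BASIC = re.compile(r"(?P<test>a)?b(?(test)c|d)")


def matches_basic(text: str) -> bool:
    """Return True when the whole string fits the conditional pattern."""
    return BASIC.fullmatch(text) is not None


if __name__ == "__main__":
    # With a leading "a" the pattern requires "c"; without it, "d".
    for sample in ("abc", "bd", "abd", "bc"):
        print(sample, matches_basic(sample))
```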
My question is - Is it possible to have an OR in that condition at the end?
Here's an example pattern I wrote up to demonstrate what I'm trying to do. The pattern below looks for one of two named-groups, then has a conditional, matching for another character, if the first named-group was successfully matched:
(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1)(?P<last>c)?)
And just to clarify what that's doing:
1. Open up a non-capturing group, with two patterns:
1.1. Match for the character a, assigning it to the named-group case1 if it is successfully matched
1.2. Match for the character b, assigning it to the named-group case2 if it is successfully matched
2. A conditional statement at the end, which reads: If case1 was successfully matched, then match for the character c, assigning it to the named-group last if it is successfully matched
What I want is to change it in such a way that step 2 would instead read: If case1 OR case2 was successfully matched, then match for the character c, assigning it to the named-group last if it is successfully matched.
I have tried all of the following:
(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1|case2)(?P<last>c)?)
(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(?:(case1)|(case2))(?P<last>c)?)
(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1,case2)(?P<last>c)?)
# Error (for all 3 above): Invalid group structure, unmatched parenthesis
(?:(?P<case1>a)?|(?P<case2>b)?)\|(?:(?(case1)(?P<last>c)?)|(?(case2)(?P<last>c)?))
# Error: Subpattern name declared more than once
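As a cross-check, the same four attempts can be fed to Python's re module, which uses the same conditional syntax. This is only a sketch: the error messages quoted above come from the Regex101 engine, and Python's wording differs, so the hypothetical helper compiles below only reports whether each pattern compiles at all.

```python
import re

# The four attempted patterns, copied verbatim from the question.
ATTEMPTS = [
    r"(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1|case2)(?P<last>c)?)",
    r"(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(?:(case1)|(case2))(?P<last>c)?)",
    r"(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1,case2)(?P<last>c)?)",
    r"(?:(?P<case1>a)?|(?P<case2>b)?)\|(?:(?(case1)(?P<last>c)?)|(?(case2)(?P<last>c)?))",
]


def compiles(pattern: str) -> bool:
    """Return True if Python's re accepts the pattern, False on re.error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


if __name__ == "__main__":
    for pattern in ATTEMPTS:
        print(compiles(pattern), pattern)
```

The condition slot only takes a single group name or number, and a group name can only be defined once, so none of these forms is accepted there either.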
So I'm kinda lost as to what else to do. I created a Regex101.com instance with an example. You can see there are two lines in the Text String, and the pattern pulls out case1 and last from the first line, then just case2 from the 2nd line. The goal is to capture last in both lines.
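The behaviour described here can also be reproduced with Python's re module, which accepts the same (?P<name>...) and (?(name)...) syntax. This is a sketch under an assumption: the two input lines a|c and b|c are stand-ins for the Regex101 text string, which is not shown in the post. The helper name capture_groups is made up.

```python
import re

# The original pattern: the conditional only checks case1.
ORIGINAL = re.compile(r"(?:(?P<case1>a)?|(?P<case2>b)?)\|(?(case1)(?P<last>c)?)")


def capture_groups(line: str) -> dict:
    """Return the named groups for a line, or an empty dict if nothing matches."""
    m = ORIGINAL.match(line)
    return m.groupdict() if m else {}


if __name__ == "__main__":
    # Assumed sample lines; "last" is only filled in when case1 matched.
    for line in ("a|c", "b|c"):
        print(line, capture_groups(line))
```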
Thanks!
Upvotes: 2
Views: 1745
Reputation:
Edit: Updated for case3.
No workaround necessary.
(Note: Conditionals don't require workarounds; they work one way. There is no need to kludge other parts of the pattern to use them. Learning how to use them is the best option.)
I think this is what you're trying to do:
(?:(?P<case1>a)?|(?P<case2>b)?|(?P<case3>c)?)\|(?P<last>(?(case1)z?|(?(case2)z?)))
https://regex101.com/r/tH6pU0/6
Explained
(?:
     (?P<case1> a )?       # (1), Optional a
  |  (?P<case2> b )?       # (2), Optional b
  |  (?P<case3> c )?       # (3), Optional c
)
\|                         # Required |
(?P<last>                  # (4 start)
     (?(case1)             # Did case1 match
          z?               # yes, get optional z
       |                   # or
          (?(case2)        # Did case2 match
               z?          # yes, get optional z
          )
     )
)                          # (4 end)
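A quick way to check the nested conditional is to run it through Python's re module, which accepts this pattern as written. The sketch below is illustrative only: the input lines and the helper name last_group are assumptions, not part of the Regex101 example.

```python
import re

# The nested-conditional pattern from this answer, split over two string literals.
NESTED = re.compile(
    r"(?:(?P<case1>a)?|(?P<case2>b)?|(?P<case3>c)?)\|"
    r"(?P<last>(?(case1)z?|(?(case2)z?)))"
)


def last_group(line: str):
    """Return what the group 'last' captured, or None if the line does not match."""
    m = NESTED.match(line)
    return m.group("last") if m else None


if __name__ == "__main__":
    # "last" picks up z after case1 or case2, and stays empty after case3.
    for line in ("a|z", "b|z", "c|z"):
        print(line, repr(last_group(line)))
```

Note that last always takes part in the match here, so when neither case1 nor case2 matched it holds an empty string rather than being unset.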
Upvotes: 1
Reputation: 43196
Regex doesn't have such a feature, no. But there are a few tricks/workarounds that can be used depending on the situation.
Workaround 1: If the two conditions are right next to each other, enclose them in another group:
(?P<case1_or_2>(?P<case1>a)|(?P<case2>b))
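As a sketch of how Workaround 1 could be applied to the pattern from the question, the wrapper group can be made optional and then used as the single condition. The way the pieces are combined here, the helper name groups_for, and the sample lines are assumptions for illustration; Python's re module is used to run it.

```python
import re

# case1_or_2 is set whenever either inner group matched, so one condition is enough.
WRAPPED = re.compile(
    r"(?P<case1_or_2>(?P<case1>a)|(?P<case2>b))?\|(?(case1_or_2)(?P<last>c)?)"
)


def groups_for(line: str) -> dict:
    """Return the named groups for a line, or an empty dict if nothing matches."""
    m = WRAPPED.match(line)
    return m.groupdict() if m else {}


if __name__ == "__main__":
    # Assumed sample lines: "last" is captured after either a or b, but not after neither.
    for line in ("a|c", "b|c", "|c"):
        print(line, groups_for(line))
```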
Workaround 2: Duplicate the then-pattern and else-pattern:
(?:(?(case1)c|d)|(?(case2)c|d))
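Workaround 3: Make each group capture an empty string, then replace the conditional with backreferences and negative lookaheads:
(?:(?:(?P=case1)|(?P=case2))c|(?!(?P=case1))(?!(?P=case2))d)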
Workaround 3 in more detail:
(?:
    (?P<case1>)a      # if "a" is matched, case1 captures an empty string
  |
    (?P<case2>)b      # if "b" is matched, case2 captures an empty string
)?                    # if neither a nor b is matched, neither case matches at all
\|
(?:                   # if either case matched, match "c":
    (?:
        (?P=case1)    # match either case1
      |
        (?P=case2)    # or case2
    )
    c                 # followed by "c"
  |                   # if neither case matched, match "d":
    (?!               # assert case1 didn't match
        (?P=case1)
    )
    (?!               # assert case2 didn't match either
        (?P=case2)
    )
    d                 # match "d"
)
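The detailed pattern above can be tried in Python's re module with the re.VERBOSE flag. This sketch relies on the engine treating a backreference to a group that did not take part in the match as a failure, which Python does; engines that treat it as an empty match would behave differently. The helper name accepts and the sample lines are made up.

```python
import re

EMPTY_GROUPS = re.compile(
    r"""
    (?: (?P<case1>) a                          # case1 captures an empty string before a
      | (?P<case2>) b                          # case2 captures an empty string before b
    )?
    \|
    (?: (?: (?P=case1) | (?P=case2) ) c        # either group is set: require c
      | (?! (?P=case1) ) (?! (?P=case2) ) d    # neither group is set: require d
    )
    """,
    re.VERBOSE,
)


def accepts(line: str) -> bool:
    """Return True when the whole line matches the backreference-based pattern."""
    return EMPTY_GROUPS.fullmatch(line) is not None


if __name__ == "__main__":
    for line in ("a|c", "b|c", "|d", "a|d", "|c"):
        print(line, accepts(line))
```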
Upvotes: 2