Reputation: 3587
String:
a = 10
b = 50
c = a + b
Regex:
([a-z]) = (\d+)|([a-z]) = ([a-z]) \+ ([a-z])
I want to reuse the first group's pattern for the last three groups, rather than its matched value, to avoid repeating the pattern all over.
Something like
([a-z]) = (\d+)|\1 = \1 \+ \1
But instead of \1 evaluating to 'a', I want it to check for the same pattern.
Upvotes: 2
Views: 589
Reputation: 22478
If your GREP dialect supports it, use a named conditional construction:
(?(<name>)then|else)
where name is the name of a capturing group, and then and else are any valid regexes (http://www.regular-expressions.info/refadv.html).
The following regex first matches a lowercase letter and " = ", then either a lowercase letter or a run of digits. That match is stored in capturing group #2 (lowercase) or #3 (digits). Then the conditional (?(2)...), which here refers to the group by number rather than by name, tests whether group #2 matched anything. If so, the first alternative of the rest of the regex is tried; if not, the second is.
\l = ((\l)|(\d+))(?(2) \+ \l| \+ \d+)
On a short test list
a = 10 + 15
b = 50 + b
c = a + b
this will match the first and third line but not the second.
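For engines without the \l shorthand, the same conditional can be sketched in Python, whose re module supports (?(id)yes|no). This is only an illustration under assumptions: [a-z] stands in for \l, and the names CONDITIONAL and matched are made up for the example.

```python
import re

# [a-z] replaces the GREP-dialect shorthand \l (an assumption for this sketch).
# Group 2 captures a letter operand, group 3 a numeric operand.
# (?(2)yes|no) picks the rest of the pattern based on whether group 2 matched.
CONDITIONAL = re.compile(r"[a-z] = (([a-z])|(\d+))(?(2) \+ [a-z]| \+ \d+)")

lines = ["a = 10 + 15", "b = 50 + b", "c = a + b"]

# Keep only the lines that the pattern matches in full.
matched = [line for line in lines if CONDITIONAL.fullmatch(line)]
print(matched)
```

The second line is rejected because a numeric first operand forces the digits-only branch of the conditional.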
Upvotes: 1
Reputation: 336328
Some regex engines (for example PHP's PCRE engine, Perl and Ruby) support subroutines:
preg_match('/([a-z]) = (\d+)|((?1)) = ((?1)) \+ ((?1))/', $subject)
Note that in order to keep capturing the contents of those subroutines, you need an extra set of parentheses. So (?1) acts as a "placeholder" for [a-z], and ((?1)) captures that in a new capturing group.
If your language's regex engine doesn't support subroutines, you may still be able to use string manipulation to build the regex from subpatterns. For example, in Python:
>>> import re
>>> letter = r"[a-z]"  # no parentheses here, so each use below captures only once
>>> regex = re.compile(r"({0}) = (\d+)|({0}) = ({0}) \+ ({0})".format(letter))
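As a rough sketch of how such a composed pattern might be used on the three lines from the question: the helper name parse_line and the tuple shapes it returns are assumptions made for illustration, not part of the original answer.

```python
import re

LETTER = r"[a-z]"  # the shared subpattern, written once

# Substitute the subpattern wherever a variable name is expected.
PATTERN = re.compile(r"({0}) = (\d+)|({0}) = ({0}) \+ ({0})".format(LETTER))

def parse_line(line):
    """Return (name, value) for an assignment, (name, lhs, rhs) for a sum, else None."""
    m = PATTERN.fullmatch(line)
    if m is None:
        return None
    if m.group(1) is not None:
        # First alternative matched: letter = digits
        return (m.group(1), int(m.group(2)))
    # Second alternative matched: letter = letter + letter
    return (m.group(3), m.group(4), m.group(5))

text = "a = 10\nb = 50\nc = a + b"
results = [parse_line(line) for line in text.splitlines()]
print(results)
```

Because the two alternatives use different group numbers, checking whether group 1 is set is enough to tell which form of line was matched.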
Upvotes: 1