Regex match pattern not value

Question

String:

a = 10
b = 50
c = a + b

Regex:

([a-z]) = (\d+)|([a-z]) = ([a-z]) \+ ([a-z])

I want the last three groups to reuse the first group's pattern rather than its matched value, so that I don't have to repeat the pattern everywhere.

Something like

([a-z]) = (\d+)|\1 = \1 \+ \1

But instead of \1 evaluating to 'a', I want it to match the same pattern.

Tim Pietzcker · Accepted Answer

Some regex engines (for example PHP's PCRE engine and Perl) support subroutine calls with the (?1) syntax; Ruby supports them too, written as \g<1>. In PHP:

preg_match('/([a-z]) = (\d+)|((?1)) = ((?1)) \+ ((?1))/', $subject)

Note that in order to keep capturing the contents of those subroutines, you need an extra set of parentheses. So (?1) acts as a "placeholder" for [a-z], and ((?1)) captures that in a new capturing group.

If your language's regex engine doesn't support subroutines, you can still build the pattern from a reusable fragment with string manipulation. For example, in Python:

>>> import re
>>> letter = "[a-z]"  # shared fragment, without its own capturing group
>>> regex = re.compile(r"({0}) = (\d+)|({0}) = ({0}) \+ ({0})".format(letter))
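
To see what the string-formatting approach captures, here is a minimal sketch run against the sample string from the question. The names LETTER, PATTERN and SAMPLE are made up for illustration, and the shared fragment is kept without its own parentheses so that each use gets exactly one capturing group.

```python
import re

# Hypothetical names for illustration only.
LETTER = "[a-z]"  # the sub-pattern to reuse, with no capturing group of its own

# Each ({0}) wraps the shared sub-pattern in a fresh capturing group.
PATTERN = re.compile(r"({0}) = (\d+)|({0}) = ({0}) \+ ({0})".format(LETTER))

# The sample input from the question.
SAMPLE = "a = 10\nb = 50\nc = a + b"

# Collect the groups of every match. Groups belonging to the alternative
# that did not take part in a match are None.
results = [m.groups() for m in PATTERN.finditer(SAMPLE)]

for groups in results:
    print(groups)
```

The first two lines should match the left alternative (groups 1 and 2) and the third line the right alternative (groups 3 to 5). Unlike a subroutine call, this only reuses the pattern text when the regex is built, so the fragment must not contain literal braces that str.format would try to interpret.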
