Reputation: 1178
I'm interested how can be implemented recursive regexp matching in Python (I've not found any examples :( ). For example how would one write expression which matches "bracket balanced" string like "foo(bar(bar(foo)))(foo1)bar1"
Upvotes: 15
Views: 9398
Reputation: 627469
With PyPi regex, that you can easily install using pip install regex
, you may use
import regex
pattern = r'[^()]*+(\((?>[^()]|(?1))*+\)[^()]*+)++'
text = 'foo(bar(bar(foo)))(foo1)bar1'
print(bool(regex.fullmatch(pattern, text)))
# => True
See the Python demo, see the regex pattern demo (note the \A
and \z
anchors are added in the demo because regex.fullmatch
requires a full string match).
Pattern details
\A
- implicit in regex.fullmatch
- start of string[^()]*+
- 0 or more chars other than (
and )
(possessive match, no backtracking into the pattern allowed)(\((?>[^()]|(?1))*+\)[^()]*+)++
- 1+ occurrences of Group 1 pattern:
\(
- a (
char(?>[^()]|(?1))*+
- 1+ repetitions (possessive match) of
[^()]
- any char but (
and )
|
- or(?1)
- a regex subroutine that recurses Group 1 pattern\)
- a )
char[^()]*+
- 0 or more chars other than (
and )
(possessive match)\z
- implicit in regex.fullmatch
- end of string.See the pattern and more information on regex subroutines at regular-expressions.info.
Upvotes: 2
Reputation: 880767
You could use pyparsing
#!/usr/bin/env python
from pyparsing import nestedExpr
import sys
astring=sys.argv[1]
if not astring.startswith('('):
astring='('+astring+')'
expr = nestedExpr('(', ')')
result=expr.parseString(astring).asList()[0]
print(result)
Running it yields:
% test.py "foo(bar(bar(foo)))(foo1)bar1"
['foo', ['bar', ['bar', ['foo']]], ['foo1'], 'bar1']
Upvotes: 15
Reputation: 304463
You can't do it with a regexp. Python doesn't support recursive regexp
Upvotes: 4