giolekva
giolekva

Reputation: 1178

How can a recursive regexp be implemented in python?

I'm interested how can be implemented recursive regexp matching in Python (I've not found any examples :( ). For example how would one write expression which matches "bracket balanced" string like "foo(bar(bar(foo)))(foo1)bar1"

Upvotes: 15

Views: 9398

Answers (3)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 627469

With PyPi regex, that you can easily install using pip install regex, you may use

import regex

pattern = r'[^()]*+(\((?>[^()]|(?1))*+\)[^()]*+)++'
text = 'foo(bar(bar(foo)))(foo1)bar1'
print(bool(regex.fullmatch(pattern, text)))
# => True

See the Python demo, see the regex pattern demo (note the \A and \z anchors are added in the demo because regex.fullmatch requires a full string match).

Pattern details

  • \A - implicit in regex.fullmatch - start of string
  • [^()]*+ - 0 or more chars other than ( and ) (possessive match, no backtracking into the pattern allowed)
  • (\((?>[^()]|(?1))*+\)[^()]*+)++ - 1+ occurrences of Group 1 pattern:
    • \( - a ( char
    • (?>[^()]|(?1))*+ - 1+ repetitions (possessive match) of
      • [^()] - any char but ( and )
      • | - or
      • (?1) - a regex subroutine that recurses Group 1 pattern
    • \) - a ) char
    • [^()]*+ - 0 or more chars other than ( and ) (possessive match)
  • \z - implicit in regex.fullmatch - end of string.

See the pattern and more information on regex subroutines at regular-expressions.info.

Upvotes: 2

unutbu
unutbu

Reputation: 880767

You could use pyparsing

#!/usr/bin/env python
from pyparsing import nestedExpr
import sys
astring=sys.argv[1]
if not astring.startswith('('):
    astring='('+astring+')'

expr = nestedExpr('(', ')')
result=expr.parseString(astring).asList()[0]
print(result)

Running it yields:

% test.py "foo(bar(bar(foo)))(foo1)bar1"
['foo', ['bar', ['bar', ['foo']]], ['foo1'], 'bar1']

Upvotes: 15

John La Rooy
John La Rooy

Reputation: 304463

You can't do it with a regexp. Python doesn't support recursive regexp

Upvotes: 4

Related Questions