Salamandre

Reputation: 3

python regex unexpected end of pattern

My problem is quite simple. I want to parse a string like this one:

string = 'SENT (ADVWH Pourquoi) (NP (DET ce) (NC theme)) (PONCT ?)'

I want to use regex (I am not an expert; I have only used it a few times before). I want to extract the first level of brackets, i.e. I want the result to be:

(ADVWH Pourquoi)
(NP (DET ce) (NC theme))
(PONCT ?)

I used this regex, which I tested successfully on regex101, but Python refuses to even compile it:

re.compile(r"\(([^()]|(?R))*\)")

I also tried these variants, which also work on regex101:

re.compile(r"\(([^\(\)]|(?R))*\)")
re.compile(r"\((([^\(\)]|(?R))*)\)")

I always get the same error from Python: unexpected end of pattern.

I really don't see what the problem is here, or why it works on regex101 but not in Python.

Thanks a lot in advance!

Upvotes: 0

Views: 78

Answers (1)

ACascarino

Reputation: 5080

Python's built-in re module does not support recursion (the (?R) construct). You need to use the third-party regex package from PyPI instead.
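
To illustrate, here is a minimal sketch using only the standard library. It first checks that re rejects the recursive pattern (the exact error message may differ between Python versions), then extracts the top-level groups with a simple depth counter instead of recursion. The names PATTERN, re_accepts_recursion and top_level_groups are made up for this example, and the depth counter is a swapped-in alternative, not what the regex package does internally.

```python
import re

# Same idea as the pattern in the question, with a non-capturing group.
PATTERN = r"\((?:[^()]|(?R))*\)"

# The standard-library re module does not know the (?R) construct,
# so compiling the pattern raises re.error.
try:
    re.compile(PATTERN)
    re_accepts_recursion = True
except re.error:
    re_accepts_recursion = False


def top_level_groups(text):
    """Return each outermost (...) group by tracking the nesting depth."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # an outermost group opens here
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])  # outermost group closed
    return groups


string = 'SENT (ADVWH Pourquoi) (NP (DET ce) (NC theme)) (PONCT ?)'
for group in top_level_groups(string):
    print(group)
```

If the regex package is installed (pip install regex), then import regex followed by regex.findall(PATTERN, string) should give the same three groups. Note that with the capturing group from the question, findall would return the contents of the group rather than the whole match, which is why the sketch uses (?:...).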

Upvotes: 1
