Shashank Mistry

Reputation: 78

How to match the innermost parenthesis set using Python regex?

I'm trying to extract the innermost parenthesised substring from the string below using Python:

x = "a (b) (c (d) e"

This is the substring I want as output:

(d)

This is what I have tried so far:

re.findall(r"\(.*?\)", x)
re.findall(r"\(.*\)", x)

Both give me the outer strings, which is not useful. I want to match only the innermost string enclosed in ( ). This example is a simplified part of a more complex string, but it shows my issue well. I need a regex solution, and the match should include the parentheses.

Upvotes: 2

Views: 1228

Answers (1)

Thomas Kimber

Reputation: 11067

The regex I use for this purpose is:

(\([^\(]*?\))

And here it is demonstrated at regex101.

i.e.

groups = [m.group(1) for m in re.finditer(r"(\([^\(]*?\))", text)]

This returns all deepest-level bracketed groups in a string.

For example, given the string:

"(Mary ( had a (little) ) lamb)"

the regex returns "(little)".

In strings that contain several separately bracketed sections, the regex returns every group that is the most deeply nested within its own section.

e.g.

"(its) fleece (as (white) ) as snow."

would return two groups, "(its)" and "(white)".
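To make the examples above easy to check, here is a small sketch that runs the same regex over both strings and over the string from the question. The helper name innermost_groups is just something I made up for illustration.

```python
import re

# Same pattern as above: "(", then any run of characters that are not "(",
# then the first ")" that follows.
INNERMOST = re.compile(r"(\([^\(]*?\))")

def innermost_groups(text):
    """Return every bracketed group that contains no further opening bracket."""
    return [m.group(1) for m in INNERMOST.finditer(text)]

print(innermost_groups("(Mary ( had a (little) ) lamb)"))
# ['(little)']
print(innermost_groups("(its) fleece (as (white) ) as snow."))
# ['(its)', '(white)']
print(innermost_groups("a (b) (c (d) e"))
# ['(b)', '(d)']
```

Note that for the string in the question this gives both "(b)" and "(d)", because "(b)" is the innermost group of its own section.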

I use this for tokenising bracketed logic statements. By tokenising only the most deeply nested clauses on each pass and replacing them with flattened tokens, I can parse an entire logic statement iteratively, repeating until no brackets remain.
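A minimal sketch of that iterative approach might look like the following. This is only an illustration, not my actual parser: the flatten function and the "<0>", "<1>", ... placeholder format are assumptions made up for the example, and it assumes the input does not already contain text that looks like those placeholders.

```python
import re

INNERMOST = re.compile(r"(\([^\(]*?\))")

def flatten(text):
    """Repeatedly replace innermost bracketed groups with placeholder tokens.

    Returns the flattened text and a dict mapping each placeholder to the
    bracketed clause it replaced. Placeholder names are hypothetical.
    """
    tokens = {}

    def stash(match):
        name = f"<{len(tokens)}>"       # next unused placeholder
        tokens[name] = match.group(1)   # remember the clause it stands for
        return name

    while True:
        # Replace every innermost group found in this pass.
        text, count = INNERMOST.subn(stash, text)
        if count == 0:                  # nothing left to flatten
            break
    return text, tokens

flat, tokens = flatten("(Mary ( had a (little) ) lamb)")
print(flat)
# <2>
print(tokens)
# {'<0>': '(little)', '<1>': '( had a <0> )', '<2>': '(Mary <1> lamb)'}
```

Each token can then be parsed on its own, since by construction it contains no nested brackets, only references to earlier tokens.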

It's worth making sure the statement being parsed has all its brackets matched. For example, the string in your question, x = "a (b) (c (d) e", is missing a closing bracket and should end with a final ), as in x = "a (b) (c (d) e)".
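One simple way to check this before parsing is to count bracket depth while scanning the string. The sketch below is an assumption on my part about how you might do it (brackets_balanced is a made-up name), and it only looks at round brackets.

```python
def brackets_balanced(text):
    """Return True if every "(" has a matching ")" in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:      # a ")" appeared before its "("
                return False
    return depth == 0          # anything left open means unbalanced

print(brackets_balanced("a (b) (c (d) e"))
# False
print(brackets_balanced("a (b) (c (d) e)"))
# True
```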

Upvotes: 2
