Geesh_SO

Reputation: 2205

Searching for outermost parentheses using Python regex

Apologies for the ambiguous title, but I don't know how to word my problem in such a way that makes sense in a single sentence.

So I have some simple regex code to extract code between brackets.

^.*\((.*)\).*

This successfully works in Python with the following code.

import re

m = re.search(r"^.*\((.*)\).*", input)
if m:
    print(m.group(1))

My problem occurs when a closing bracket ) may be inside the outermost brackets. For example, my current code when given

nsfnje (19(33)22) sfssf

as an input would return

19(33

but I would like it to return

19(33)22

I'm not sure how to fix this, so any help would be appreciated!

Upvotes: 5

Views: 5210

Answers (2)

MikeM

Reputation: 13641

Your code does not give 19(33, it gives 33)22.

The problem is that the ^.* at the start of your regex matches all the way up to the last ( in the string, whereas you actually want to match from the first ( in the string.
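
To see this for yourself, here is a small sketch that runs the original pattern and checks where the capture group begins. The names text, captured and group_start are just mine for illustration.

```python
import re

text = "nsfnje (19(33)22) sfssf"

# The original pattern from the question: the leading .* is greedy.
m = re.search(r"^.*\((.*)\).*", text)

captured = m.group(1)
group_start = m.start(1)

print(captured)  # 33)22

# The group starts right after the LAST "(" in the string, not the first.
print(group_start == text.rfind("(") + 1)  # True
```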

If you just want what is within the outermost brackets, then remove the .* at the start of your regex, and you may as well remove the ending .* also as it similarly serves no purpose.

"\((.*)\)"

If you want the match of the whole line/string as well as what is within the brackets, then make the first * match lazily by adding a ? after it:

"^.*?\((.*)\).*"

or better, use

"^[^(]*\((.*)\).*"
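
As a quick check, the sketch below runs all three patterns against the sample input from the question. The dictionary keys (bare, lazy, negated) are labels I made up for the comparison. As far as I can tell the last form is preferable mainly because [^(]* can never run past the first ( and so does not rely on backtracking to find it.

```python
import re

text = "nsfnje (19(33)22) sfssf"

# Labels are illustrative only; the patterns are the three given above.
patterns = {
    "bare": r"\((.*)\)",
    "lazy": r"^.*?\((.*)\).*",
    "negated": r"^[^(]*\((.*)\).*",
}

matches = {name: re.search(p, text) for name, p in patterns.items()}
results = {name: m.group(1) for name, m in matches.items()}

for name, value in results.items():
    print(name, "->", value)  # each one prints 19(33)22

# Only the anchored patterns match the whole line as group 0.
print(matches["bare"].group(0))     # (19(33)22)
print(matches["negated"].group(0))  # nsfnje (19(33)22) sfssf
```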

Upvotes: 0

NPE

Reputation: 500963

>>> import re
>>> input = "nsfnje (19(33)22) sfssf"
>>> re.search(r"\((.*)\)", input).group(1)
'19(33)22'

Note that this searches for outermost parentheses, even if they are unbalanced (e.g. "(1(2)))))"). It is not possible to search for balanced parentheses using a single standard regular expression. For more information, see this answer.
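
If you do need balanced matching, one common workaround is to drop the regex and scan the string with a depth counter. The following is only a minimal sketch of that idea, not something from the standard library: first_balanced_group is a hypothetical helper name, and it assumes you want the contents of the first outermost balanced pair, ignoring any stray closing parentheses.

```python
def first_balanced_group(text):
    """Return the text inside the first balanced outermost (...) pair, or None.

    Hypothetical helper for illustration; not part of the re module.
    """
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1  # contents begin just after the opening paren
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                return text[start:i]
    return None  # no pair was ever closed


print(first_balanced_group("nsfnje (19(33)22) sfssf"))  # 19(33)22
print(first_balanced_group("(1(2)))))"))                # 1(2)
print(first_balanced_group("(unclosed"))                # None
```

Unlike the regex, this stops at the parenthesis that actually closes the first opening one, so the extra closing parentheses in "(1(2)))))" are left out.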

Upvotes: 9
