user1357159

Reputation: 319

Regex, find first - Python

i="<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x2887828> >]], [[[41, 183], 'Button', <wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x28879d0> >]]]"

m = re.findall("<wx.(.*)> >", i)

will give me

["<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x2887828> >]], [[[41, 183], 'Button', <wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x28879d0> >"]

However, I want it to give me:

["<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x2887828> >","<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x28879d0> >"]

The regex matches all the way to the end of the string. I would like to extract each part that matches the pattern separately. Does anyone know a solution to this?

Upvotes: 4

Views: 5013

Answers (1)

Niklas B.

Reputation: 95298

The * operator is greedy by default. You can make it lazy by adding a ? after it. Also remember to escape the literal dot.

I also made the group non-capturing. Otherwise you wouldn't get the desired output, because findall returns only the group's contents when the pattern contains a capturing group (this seems to be a problem with your original code as well):

re.findall(r"<wx\.(?:.*?)> >", i)
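To make the difference concrete, here is a small sketch that runs the greedy, lazy-capturing, and lazy non-capturing variants side by side. The sample string `s` is a shortened stand-in modelled on the one in the question, and the variable names are only for illustration:

```python
import re

# Shortened sample modelled on the string from the question (an assumption,
# not the asker's full data).
s = ("<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x2887828> >]], "
     "[[[41, 183], 'Button', "
     "<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x28879d0> >]]]")

# Greedy: .* runs up to the last "> >", so everything is one long match.
greedy = re.findall(r"<wx\.(?:.*)> >", s)

# Lazy, but with a capturing group: findall returns only the group contents,
# so the leading "<wx." and trailing "> >" are missing from each item.
captured = re.findall(r"<wx\.(.*?)> >", s)

# Lazy with a non-capturing group: findall returns the full matched text.
lazy = re.findall(r"<wx\.(?:.*?)> >", s)

print(len(greedy), len(captured), len(lazy))
for item in lazy:
    print(item)
```

Since the group is not needed for anything else here, writing the pattern without parentheses at all, as `r"<wx\..*?> >"`, should behave the same way.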

Another possibility would be the following, which uses a negated character class instead of the lazy * operator and is faster (it assumes that exactly one < character comes before the first >):

re.findall(r"<wx\.[^<]*<[^<]*> >", i)
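As a rough check, under the same assumption about the input, the following sketch compares this pattern with the lazy one on a shortened stand-in for the question's string. The sample `s` and the variable names are illustrative only, and no timing is measured here:

```python
import re

# Shortened sample modelled on the string from the question (an assumption).
s = ("<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x2887828> >]], "
     "[[[41, 183], 'Button', "
     "<wx._controls.Button; proxy of <Swig Object of type 'wxButton *' at 0x28879d0> >]]]")

# Lazy quantifier version.
lazy = re.findall(r"<wx\.(?:.*?)> >", s)

# Negated character class version: [^<]* can never run past the next "<",
# so a match cannot spill over into the following "<wx." item.
negated = re.findall(r"<wx\.[^<]*<[^<]*> >", s)

print(negated == lazy)
print(len(negated))
```

If the input could contain more than one nested < before the closing "> >", the negated-class pattern would no longer apply and the lazy version is the safer choice.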

Upvotes: 7
