pablo07

Reputation: 71

Python regex search

My previous example was not clear, so here is another one:

a = '123 - 48 <!-- 456 - 251 - --> 452 - 348'

If I do something like:

[el for el in re.split(r' - ',a)]

I get:

['123', '48 <!-- 456', '251', '--> 452', '348']

But I want this:

['123', '48 <!-- 456 - 251 - --> 452', '348']

Thanks...

Upvotes: 1

Views: 498

Answers (3)

Lorenz Lo Sauer

Reputation: 24740

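The result you posted is what re.findall(r'(\d+)', a) returns. To skip the numbers inside the comment:

re.findall(r'(?:<!--.+\d+.+-->)|(\d+)', a)

['123', '48', '', '452', '348']

The empty string is where the comment matched without capturing anything. Filter it out (the list() call is needed on Python 3, where filter returns an iterator):

list(filter(None, re.findall(r'(?:<!--.+\d+.+-->)|(\d+)', a)))

['123', '48', '452', '348']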

Upvotes: -1

Qtax

Reputation: 33928

If you want one regex you could use something like:

(\d+)(?!(?:[^<]+|<(?!!--))*-->)

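It matches numbers that are not followed by a --> without a <!-- in between, i.e. numbers that are not inside a comment.

This works as long as there are no "invalid" --> in the input (for example, a --> with no opening <!--).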
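As a quick check, here is a minimal sketch that runs this pattern against the string from the question. The variable names are just for illustration; only the pattern itself comes from the answer.

```python
import re

a = '123 - 48 <!-- 456 - 251 - --> 452 - 348'

# A number is rejected if a "-->" can be reached from it
# without first crossing a "<!--", i.e. if it sits inside a comment.
outside_comment = re.compile(r'(\d+)(?!(?:[^<]+|<(?!!--))*-->)')

numbers = outside_comment.findall(a)
print(numbers)  # ['123', '48', '452', '348']
```

Note that this extracts the numbers only; it does not produce the split the question asks for.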

Upvotes: 0

fardjad

Reputation: 20424

First remove the comments using something like this:

re.sub("<!--.*?-->", "", your_string)

Then use your regex to extract the numbers.

You can also use a negative lookahead assertion, (?!...), but that won't be as simple.
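Put together, the two steps might look like the sketch below. It assumes the string from the question and uses r'\d+' as a stand-in for "your regex"; adjust to taste.

```python
import re

a = '123 - 48 <!-- 456 - 251 - --> 452 - 348'

# Step 1: drop every <!-- ... --> comment. The non-greedy .*? makes
# each comment end at the first --> that follows its <!--.
without_comments = re.sub(r'<!--.*?-->', '', a)

# Step 2: extract the numbers that are left.
numbers = re.findall(r'\d+', without_comments)
print(numbers)  # ['123', '48', '452', '348']
```

If comments can span several lines, pass flags=re.DOTALL to re.sub so that . also matches newlines.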
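For what it's worth, a lookahead version that keeps the comment and produces the exact list asked for in the question could look roughly like this. It is only a sketch: it assumes every --> in the input closes a real comment, and the nested quantifiers in the lookahead may backtrack badly on long inputs.

```python
import re

a = '123 - 48 <!-- 456 - 251 - --> 452 - 348'

# Split on ' - ' only when no "-->" can be reached from the separator
# without first crossing a "<!--", i.e. when it is outside a comment.
separator = re.compile(r' - (?!(?:[^<]+|<(?!!--))*-->)')

parts = separator.split(a)
print(parts)  # ['123', '48 <!-- 456 - 251 - --> 452', '348']
```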

Upvotes: 5
