Reputation: 205
I have an HTML output that contains this:
<span class="value">
Price:<br>
<span style="color:white">23,07€ </span>
</span>
I tried to extract the prices using:
prices = re.findall(r'<span class="value">.*?(\d{1,3}\.?\d{1,2}).*?</span>',search_result)
Sometimes the decimals are replaced with -- when they are 00. Also, I need the two numbers extracted by the expression (23 and 07) joined into 2307.
Thank you for your time.
Upvotes: 0
Views: 1249
Reputation: 46841
Use the following pattern and read the price from capture group 1:
(?<=>)(\d[^€]*)
Or capture the two numbers separately in groups 1 and 2:
(?<=>)(\d+)\D(\d+)\D
If you are only interested in <span> tags, try this regex instead:
<span [^>]*>(\d+)\D(\d+)\D[^<]*
Sample code:
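import re
# Plain raw string: the ur'' prefix is Python 2 only and a syntax error in Python 3
p = re.compile(r'<span [^>]*>(\d+)\D(\d+)\D[^<]*')
test_str = u"..."
# findall returns a list of (integer part, decimal part) tuples
print(p.findall(test_str))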
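The pattern above needs digits on both sides of the separator, so it would skip a price written as 23,--. A minimal sketch of one way to cover that case and join the two parts follows. The sample HTML, the second price (5,--€), and the helper name extract_prices are made up for illustration, and it assumes -- always stands for 00, as the question describes.

```python
import re

# Hypothetical sample input modelled on the question's HTML;
# the second price (5,--) is invented to show the "--" case.
html = '''<span class="value">
Price:<br>
<span style="color:white">23,07€ </span>
</span>
<span class="value">
Price:<br>
<span style="color:white">5,--€ </span>
</span>'''

# Integer part, one separator character, then either two digits or "--"
pattern = re.compile(r'<span [^>]*>(\d+)\D(\d{2}|--)')

def extract_prices(text):
    """Return each price as a joined digit string, e.g. '23,07' -> '2307'."""
    prices = []
    for whole, cents in pattern.findall(text):
        # Assumption: "--" stands for zero cents
        if cents == '--':
            cents = '00'
        prices.append(whole + cents)
    return prices

print(extract_prices(html))  # ['2307', '500']
```

The results stay as strings so that leading zeros in the decimal part survive. Call int() on each one if a number is needed.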
Upvotes: 1