Reputation: 739
I need to extract the number 5 in the brackets in this HTML code:
<td class="th-clr-cel th-clr-td th-clr-pad th-clr-cel-dis" style="width:226px; text-align:left; ">
<span class="th-tx th-tx-value th-tx-nowrap" style="width:100%; " title="Social Insurance Number (SIN)" id="C29_W120_V121_builidnumber_table[5].type_text" f2="C;40">
Social Insurance Number (SIN)
</span>
This is only an excerpt; the real page has much more markup before and after it. One thing is certain, though: the word "Insurance" appears only in this excerpt.
I managed to match whatever is between the 2 instances of "Social Insurance Number" with this regex:
((?<=Social Insurance Number)(.*)(?=Social Insurance Number))
Now I need to combine that and extract the number 5 within the square brackets.
Please note: the content of the brackets could be multiple characters (e.g. 15), but it will always be numeric.
Thank you
EDIT: I want to use regex to parse HTML because this is part of a JMeter script that runs mass performance tests on a system with hundreds of concurrent users. Performance is a major factor here, and an XML parser will consume more resources than regex.
Upvotes: 0
Views: 1627
Reputation: 589
This will capture exactly the digits inside the square brackets, provided the brackets sit between two occurrences of the word "Insurance":
Insurance(?:[\s\S]*)\[(\d+)\](?:[\s\S]*)Insurance
Demo: https://regex101.com/r/hwFB0Y/3
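A quick way to check the pattern outside regex101 is a minimal sketch like the one below. It uses Python's `re` module purely for illustration (not JMeter), and the variable names `html`, `pattern` and `index` are made up for this example; the HTML is the excerpt from the question.

```python
import re

# Excerpt copied from the question; the real page has more markup around it.
html = '''<td class="th-clr-cel th-clr-td th-clr-pad th-clr-cel-dis" style="width:226px; text-align:left; ">
<span class="th-tx th-tx-value th-tx-nowrap" style="width:100%; " title="Social Insurance Number (SIN)" id="C29_W120_V121_builidnumber_table[5].type_text" f2="C;40">
Social Insurance Number (SIN)
</span>'''

# The pattern from this answer, unchanged. [\s\S] matches any character,
# newlines included, so no DOTALL-style flag is needed.
pattern = re.compile(r'Insurance(?:[\s\S]*)\[(\d+)\](?:[\s\S]*)Insurance')

match = pattern.search(html)
index = match.group(1) if match else None  # capture group 1 holds the digits
print(index)
```

The value is in capture group 1, so whatever tool runs the regex needs to be told to return group 1 rather than the whole match.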
Upvotes: 2
Reputation: 10719
Try this:
Insurance.*\[(\d+)\]
Or, if you want to match it between the two occurrences of the word "Insurance":
Insurance.*\[(\d+)\][\s\S]+?Insurance
Where
Insurance - Match the starting word "Insurance"
.* - Match any character
\[ - Match the opening bracket
(\d+) - Capture the numerical value inside brackets
\] - Match the closing bracket
[\s\S]+? - Match any character (including newlines) in a non-greedy way so that it wouldn't span across multiple "Insurance" words
Insurance - Match the ending word "Insurance"
Upvotes: 1
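One detail worth checking for the two patterns in this answer: `.` does not match a newline by default, so they rely on "Insurance" and the bracket being on the same line, as they are in the question's excerpt. The sketch below illustrates this in Python's `re` module (used only for illustration, not JMeter). The helper `extract_index` and the two shortened snippets are hypothetical, not taken from the real page.

```python
import re

# Hypothetical, shortened snippets modelled on the question's HTML.
same_line = ('title="Social Insurance Number (SIN)" '
             'id="C29_W120_V121_builidnumber_table[15].type_text">\n'
             'Social Insurance Number (SIN)')
split_lines = ('title="Social Insurance Number (SIN)"\n'
               'id="C29_W120_V121_builidnumber_table[15].type_text">\n'
               'Social Insurance Number (SIN)')

SHORT = r'Insurance.*\[(\d+)\]'
BETWEEN = r'Insurance.*\[(\d+)\][\s\S]+?Insurance'

def extract_index(pattern, text, flags=0):
    """Hypothetical helper: return capture group 1, or None if nothing matches."""
    match = re.search(pattern, text, flags)
    return match.group(1) if match else None

short_same = extract_index(SHORT, same_line)
between_same = extract_index(BETWEEN, same_line)
short_split = extract_index(SHORT, split_lines)                 # '.' stops at the newline
short_split_dotall = extract_index(SHORT, split_lines, re.DOTALL)

print(short_same, between_same, short_split, short_split_dotall)
```

If the attribute and the bracket can end up on different lines, either enable the engine's "dot matches newline" mode or replace `.*` with `[\s\S]*`.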
Reputation: 317
Is that what you're looking for?
((?<=Social Insurance Number.*\[)(\d+)(?=\].*Social Insurance Number))
Upvotes: 1