Reputation: 5217
I want to match the following pattern: all words made up only of uppercase letters that are in brackets and inside <b></b>
tags.
Example:
(ABC) 'must extract none
<b>(ABC) 'must extract none
<b>(ABC)(CDE)(EFG)</b> 'must extract ABC, CDE and EFG
<b> shr (ABC) апаd (CDE) lgsgs </b> 'must extract ABC and CDE
<b>A</b>(ABCA)<b>(ABCB)</b> 'must extract only ABCB
<b>A</b>(ABCA)<b>dada(ABCB)wsg</b> 'must extract only ABCB
<b>AB</b>(ABCA)<b>BC</b>(ABCB) 'must extract none
I tried the following pattern, but it matches only the first occurrence:
"(<b>(?:(?!<\/?b>).)*?\()([A-Z]+)(\)(?:(?!<\/?b>).)*<\/b>)"
Upvotes: 0
Views: 75
Reputation: 174696
You could try the regex below.
(?:[A-Z]+(?=\)))(?=(?:(?!<\/?b>).)*<\/b>)
(?:[A-Z]+(?=\)))
It matches one or more uppercase letters only if they are followed by a closing )
bracket.
(?=(?:(?!<\/?b>).)*<\/b>)
The match must also be followed by zero or more characters that do not start an opening <b>
or closing </b> tag, and then by a closing </b>
tag.
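As a quick check, here is a minimal sketch that runs this regex over some of the examples from the question. It assumes Python's re module (the question does not say which engine is used), and the extract helper name is only for illustration.

```python
import re

# Pattern from the answer: a run of uppercase letters that is followed by ")"
# and then, without crossing another <b> or </b>, by a closing </b>.
PATTERN = re.compile(r"(?:[A-Z]+(?=\)))(?=(?:(?!<\/?b>).)*<\/b>)")


def extract(text):
    """Return every match of PATTERN in text (hypothetical helper)."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return PATTERN.findall(text)


samples = [
    "(ABC)",
    "<b>(ABC)",
    "<b>(ABC)(CDE)(EFG)</b>",
    "<b>A</b>(ABCA)<b>(ABCB)</b>",
    "<b>A</b>(ABCA)<b>dada(ABCB)wsg</b>",
    "<b>AB</b>(ABCA)<b>BC</b>(ABCB)",
]

for sample in samples:
    print(sample, "->", extract(sample))
```

Note that the pattern only looks ahead: it checks what follows the letters, not that an opening ( or an opening <b> comes before them. If the engine supports lookbehind, putting (?<=\() in front of [A-Z]+ is one possible way to require the opening bracket as well.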
Or, more simply, if no other tags can appear between the match and the closing </b> tag:
(?:[A-Z]+(?=\)))(?=[^<>]*<\/b>)
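The two regexes are not fully interchangeable, which the following sketch tries to show. It assumes Python's re module; the function names and the nested <i> input are made up for illustration and do not come from the question.

```python
import re

# First regex: skips anything up to </b> as long as no <b> or </b> starts.
TEMPERED = re.compile(r"(?:[A-Z]+(?=\)))(?=(?:(?!<\/?b>).)*<\/b>)")
# Shorter regex: allows no "<" or ">" at all before the closing </b>.
SIMPLE = re.compile(r"(?:[A-Z]+(?=\)))(?=[^<>]*<\/b>)")


def tempered(text):
    return TEMPERED.findall(text)


def simple(text):
    return SIMPLE.findall(text)


plain = "<b>A</b>(ABCA)<b>dada(ABCB)wsg</b>"  # example from the question
nested = "<b>(ABC) <i>x</i></b>"              # hypothetical input with another tag

print(tempered(plain), simple(plain))
print(tempered(nested), simple(nested))
```

On the question's examples both give the same result. They differ only when some other tag sits between the bracketed word and </b>: the shorter regex then finds nothing, because [^<>]* cannot get past the extra tag.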
Upvotes: 2