Reputation: 11
I'm stuck on a stubborn problem I can't seem to solve.
I'm trying to find a specific character only when it is inside an HTML tag (not in the text between tags).
To test this I have 2 test strings:
this is <a href="www.somesite.com">sentence</a>
I'd like to find all the period characters within < > HTML tags, so the match should be the 2 periods within www.somesite.com. I cannot get the match to work correctly. Can someone please take a look at my regex and see what I am missing?
(<[^>]*>?(\.))>?
Upvotes: 1
Views: 203
Reputation: 3299
Try this:
$re = "/>[^<]*<(*SKIP)(*F)|searchText/mi"; // The part before | skips the text between tags; the part after | then matches searchText only in what remains, i.e. inside the tags.
$str = "<div><a href=\"www.searchText.com\">This is <a href=\"www.searchText.com\">sentence</a> I want to test.</a></div>";
preg_match_all($re, $str, $matches);
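To target the periods from the question rather than `searchText`, the same SKIP/FAIL idea can be adapted with an escaped dot. The following is a minimal sketch, not a definitive solution. Note the assumptions: the `\A` and `\z` anchors are an addition that is not in the pattern above, included so that text before the first tag and after the last tag is skipped as well, and the variable names are illustrative only.

```php
<?php
// Skip every run of text outside a tag: it starts at the beginning of the
// string or at '>', and ends at the next '<' or at the end of the string.
// (*SKIP)(*F) discards that run; the second alternative then matches a
// literal period in whatever is left, i.e. inside the tags.
// Assumption: the \A and \z anchors are added here to cover leading and
// trailing text; they are not part of the original pattern.
$re = '/(?:\A|>)[^<]*(?:<|\z)(*SKIP)(*F)|\./';
$str = 'This is <a href="www.somesite.com">sentence</a> I want to test.';

// PREG_OFFSET_CAPTURE records where each match starts in $str.
preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$text, $offset]) {
    echo "Found '$text' at offset $offset\n";
}
```

With this sample string, the two periods in the `href` value are matched and the period that ends the sentence is not. Like the original pattern, this treats every `<` and `>` as a tag delimiter, so a literal `>` inside an attribute value would confuse it.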
Upvotes: 1
Reputation: 34406
Given the string "This is <a href="www.somesite.com">sentence</a> I want to test." the regex:
\.(?=\w)
will match the periods in the URL but not the one at the end of the sentence. Note that the regex is not URL-specific: it uses a positive lookahead to find any period followed immediately by a word character, so it would also match such a period in plain text outside a tag.
Having said that, you should really be parsing HTML with a proper parser such as PHP's DOMDocument.
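As a rough illustration of the parser route, the sketch below loads the sample string with DOMDocument and counts the periods found in attribute values. It assumes that "inside a tag" means "inside an attribute value", which is one reading of the question, and the variable names are hypothetical.

```php
<?php
$html = 'This is <a href="www.somesite.com">sentence</a> I want to test.';

// Collect libxml parse warnings internally instead of printing them;
// HTML fragments like this one often trigger them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk every element and count the periods in each attribute value.
// Assumption: only attribute values count as "inside a tag" here.
$periods = 0;
foreach ($doc->getElementsByTagName('*') as $element) {
    foreach ($element->attributes as $attribute) {
        $periods += substr_count($attribute->value, '.');
    }
}

echo $periods, "\n";
```

Because the parser separates markup from text, the period that ends the sentence is never examined, and no pattern is needed to tell the two apart.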
Upvotes: 0