tahoetomahawk

Reputation: 11

PHP regex: find a character inside an HTML tag

I'm stuck on a stubborn problem I can't seem to solve.

I'm trying to find a specific character only when it appears inside an HTML tag (between < and >), not in the text between tags.

To test this I have 2 test strings:

  1. a string with NO HTML. this is sentence 2.
  2. a string with some HTML. this is <a href="www.somesite.com">sentence</a>

I'd like to find all the period characters inside the < > of HTML tags, so for string 2 the match should be the 2 periods in www.somesite.com. I cannot get the match to work correctly. Can someone please take a look at my regex and tell me what I am missing?

(<[^>]*>?(\.))>?

Upvotes: 1

Views: 203

Answers (2)

Ahosan Karim Asik

Reputation: 3299

Try this:

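// Left of "|": text between ">" and "<" (the tag's inner text) is consumed and discarded via (*SKIP)(*F).
// Right of "|": searchText is then matched only in what remains, i.e. inside the tags.
$re = "/>[^<]*<(*SKIP)(*F)|searchText/mi";
$str = "<div><a href=\"www.searchText.com\">This is <a href=\"www.searchText.com\">sentence</a> I want to test.</a></div>";

// $matches[0] holds every occurrence found inside a tag.
preg_match_all($re, $str, $matches);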
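To apply the same idea to the periods from the question, something like the sketch below might work. Note that it is an adaptation, not the pattern above verbatim: the left branch is anchored with `^` and `$` as well, on the assumption that the input may start or end with plain text outside any tag (as both test strings do). The helper name `countPeriodsInsideTags` is made up for illustration, and a stray `<` or `>` in the text would confuse it.

```php
<?php
// Hypothetical helper: counts periods that sit between "<" and ">".
function countPeriodsInsideTags(string $text): int
{
    // Left branch: consume text outside tags and discard it with (*SKIP)(*F).
    //   (?:^|>)  start of the string, or the end of a tag
    //   [^<]*    plain text up to the next tag
    //   (?:<|$)  start of the next tag, or end of the string
    // Right branch: whatever is left is inside a tag, so match the period.
    $pattern = '/(?:^|>)[^<]*(?:<|$)(*SKIP)(*F)|\./';

    // preg_match_all returns the number of matches (or false on error).
    return (int) preg_match_all($pattern, $text);
}

$plain = 'a string with NO HTML. this is sentence 2.';
$html  = 'a string with some HTML. this is <a href="www.somesite.com">sentence</a>';

echo countPeriodsInsideTags($plain), "\n"; // 0
echo countPeriodsInsideTags($html), "\n";  // 2
```

Without the `^` anchor, the period after "some HTML" would also be reported, because no `>` precedes that leading text.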


Upvotes: 1

Jay Blanchard

Reputation: 34406

Given the string "This is <a href="www.somesite.com">sentence</a> I want to test.", the regex:

\.(?=\w)

will match the periods in the URL but not the one at the end of the sentence. Note that the regex is not URL-specific; it just uses a positive lookahead to find a period followed immediately by a word character.
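A quick way to see both the behaviour and the limitation is a small sketch like this one. The helper name `countPeriodsBeforeWordChar` and the second sample string are made up for illustration.

```php
<?php
// Hypothetical helper: counts periods immediately followed by a word character.
function countPeriodsBeforeWordChar(string $text): int
{
    // \.      a literal period
    // (?=\w)  lookahead: the next character must be a letter, digit or underscore
    return (int) preg_match_all('/\.(?=\w)/', $text);
}

$withTag = 'This is <a href="www.somesite.com">sentence</a> I want to test.';
$noTag   = 'Version 3.14 is out.';

echo countPeriodsBeforeWordChar($withTag), "\n"; // 2 (both periods in the URL)
echo countPeriodsBeforeWordChar($noTag), "\n";   // 1 (the period in 3.14, no tag involved)
```

The second call shows why this only approximates "inside a tag": any period followed by a word character matches, wherever it appears.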

Having said that, you should really be parsing HTML with something like PHP's DOMDocument.
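A minimal sketch of the parser route, assuming the periods of interest live in attribute values (such as the href) and that the DOM extension is enabled. The helper name `countPeriodsInAttributes` is hypothetical.

```php
<?php
// Hypothetical helper: counts periods in the attribute values of every element.
function countPeriodsInAttributes(string $html): int
{
    $doc = new DOMDocument();

    // Fragments are not full documents, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $count = 0;
    // '*' selects every element, including the html/body wrappers libxml adds.
    foreach ($doc->getElementsByTagName('*') as $element) {
        foreach ($element->attributes as $attribute) {
            $count += substr_count($attribute->value, '.');
        }
    }
    return $count;
}

echo countPeriodsInAttributes('a string with some HTML. this is <a href="www.somesite.com">sentence</a>'), "\n"; // 2
```

Because the parser separates attributes from text nodes, periods in the visible text are never counted, and stray angle brackets in the text are less of a concern than with a regex.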

Upvotes: 0
