Reputation: 5155
I've already found a lot of Stack Overflow questions about this topic, but none of them solves my problem.
I have the following HTML:
<p><a name="first-title"></a></p>
<h3>First Title</h3>
<h2><a href='#second'>Second Title</a></h2>
<h3>Third Title</h3>
I want to find the <h3> that is immediately preceded by </a></p>. In this case, the output should be:
<h3>First Title</h3>
So I wrote the following regular expression:
preg_match_all('/(?<=<\/a><\/p>)<h3>(.+?)<\/h3>/s',$html,$data);
The regular expression above does not match anything in this HTML. But if I remove the newlines from the HTML, it correctly returns the result I want.
I would rather not remove the newlines from the HTML if possible. How should I write the regular expression so that it ignores the newlines in the source string?
Please help me.
Upvotes: 0
Views: 2676
Reputation: 174696
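This is a case for \K, since you can't use an unbounded quantifier such as \s* inside a lookbehind assertion. \K resets the start of the reported match, so the text matched before it is required but not returned: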
preg_match_all('/<\/a><\/p>\s*\K<h3>(.+?)<\/h3>/s',$html,$data);
Or just put a \n character inside the lookbehind (this only works when the lines end with a bare \n, not \r\n):
preg_match_all('/(?<=<\/a><\/p>\n)<h3>(.+?)<\/h3>/s',$html,$data);
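To compare the two patterns side by side, here is a minimal sketch that runs both against the HTML from the question. The variable names ($withK, $withLookbehind, $crlf) are only illustrative, and the input is assumed to use "\n" line endings unless stated otherwise.

```php
<?php
// Sample input from the question, joined with "\n" line endings.
$html = implode("\n", [
    '<p><a name="first-title"></a></p>',
    '<h3>First Title</h3>',
    "<h2><a href='#second'>Second Title</a></h2>",
    '<h3>Third Title</h3>',
]);

// \K variant: \s* consumes any whitespace after </a></p>, then \K drops
// everything matched so far from the reported match.
preg_match_all('/<\/a><\/p>\s*\K<h3>(.+?)<\/h3>/s', $html, $withK);

// Lookbehind variant: fixed length, so it accepts exactly one "\n".
preg_match_all('/(?<=<\/a><\/p>\n)<h3>(.+?)<\/h3>/s', $html, $withLookbehind);

print_r($withK[0]);          // full matches
print_r($withK[1]);          // captured heading text
print_r($withLookbehind[0]); // full matches

// Same input with Windows line endings, to show where the variants differ.
$crlf = str_replace("\n", "\r\n", $html);
$kCount = preg_match_all('/<\/a><\/p>\s*\K<h3>(.+?)<\/h3>/s', $crlf);
$lookbehindCount = preg_match_all('/(?<=<\/a><\/p>\n)<h3>(.+?)<\/h3>/s', $crlf);
echo "CRLF input: \\K matches $kCount, lookbehind matches $lookbehindCount\n";
```

On the "\n" input both variants report only <h3>First Title</h3>; the third heading is skipped because it follows </a></h2> rather than </a></p>. If the HTML may contain "\r\n" line endings or indentation, the \K version is the more tolerant choice.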
Upvotes: 4