Steve.NayLinAung

Reputation: 5155

PHP: How to ignore newlines in a regex

I've found a lot of Stack Overflow questions on this topic, but none of them solves my problem.

I have the following html:

<p><a name="first-title"></a></p>
<h3>First Title</h3>
<h2><a href='#second'>Second Title</a></h2>
<h3>Third Title</h3>

I want to find the <h3> that is immediately preceded by </a></p>. In this case, the output should be:

<h3>First Title</h3>

So I wrote the following regular expression:

preg_match_all('/(?<=<\/a><\/p>)<h3>(.+?)<\/h3>/s',$html,$data);

This regular expression matches nothing in the HTML above. But if I remove the newlines from the HTML, it correctly outputs the desired result.

I would prefer not to remove the newlines from the HTML. How should I write the regular expression so that it ignores newlines in the source string?

Please help me.

Upvotes: 0

Views: 2676

Answers (1)

Avinash Raj

Reputation: 174696

This is a good use case for \K, since a lookbehind assertion in PCRE can't contain an unbounded quantifier such as \s*. \K discards everything matched before it from the reported match.

preg_match_all('/<\/a><\/p>\s*\K<h3>(.+?)<\/h3>/s',$html,$data);

Or, if the tags are always separated by exactly one newline, just put a \n character inside the lookbehind:

preg_match_all('/(?<=<\/a><\/p>\n)<h3>(.+?)<\/h3>/s',$html,$data);
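
To see the difference, here is a minimal sketch that runs the original pattern and both variants against the HTML from the question. The variable names ($original, $withK, $withNewline and the $count…/$data… results) are only illustrative, and the sample is assumed to use plain \n line endings.

```php
<?php
// Sample HTML from the question, joined with "\n" so the line endings are explicit.
$html = implode("\n", [
    '<p><a name="first-title"></a></p>',
    '<h3>First Title</h3>',
    "<h2><a href='#second'>Second Title</a></h2>",
    '<h3>Third Title</h3>',
]);

// Original pattern: the lookbehind needs </a></p> directly before <h3>,
// so the newline between the two tags prevents any match.
$original = '/(?<=<\/a><\/p>)<h3>(.+?)<\/h3>/s';

// \K variant: \s* consumes the whitespace, then \K drops everything
// matched so far from the reported match.
$withK = '/<\/a><\/p>\s*\K<h3>(.+?)<\/h3>/s';

// Fixed-length lookbehind variant: assumes exactly one "\n" between the tags.
$withNewline = '/(?<=<\/a><\/p>\n)<h3>(.+?)<\/h3>/s';

$countOriginal = preg_match_all($original, $html, $dataOriginal);
$countK        = preg_match_all($withK, $html, $dataK);
$countNewline  = preg_match_all($withNewline, $html, $dataNewline);

echo "original: $countOriginal match(es)\n";
echo "with \\K: ", implode(', ', $dataK[0]), "\n";
echo "with \\n lookbehind: ", implode(', ', $dataNewline[0]), "\n";
```

In both variants $data[0] holds the full <h3>…</h3> element and $data[1] holds only the title text. The \K version is the more tolerant of the two, because \s* also absorbs \r\n line endings and indentation, while the \n lookbehind only matches when the separator is exactly one newline.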

Upvotes: 4
