libertaire

Reputation: 855

PHP regex: remove everything between the last occurrence of <br> and a string

In a text, I want to remove everything between the last occurrence of <br> and a given string.

Let's say I have this text:

Lorem ipsum dolor sit amet, <br> 
consectetur adipisicing elit, <br> 
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br> 
Some text I want to remove because it is useless.

I want to remove everything between the last <br> and "useless." (including the delimiters).

The expected result would be:

Lorem ipsum dolor sit amet, <br> 
consectetur adipisicing elit, <br> 
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Upvotes: 0

Views: 551

Answers (3)

Casimir et Hippolyte

Reputation: 89557

You can do it like this:

$txt = preg_replace('~<br>(?>[^<u]++|<(?!br>)|u(?!seless))*(?>useless\.?|$)~', '', $txt);

Advantages of this pattern:

  • little backtracking
  • few lookahead tests (only when a < or a u is found)
  • dotall mode is not needed, since the pattern contains no dot
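
As a quick check, the pattern can be run against the sample text from the question. This is only a minimal sketch: the sample string is rebuilt here with explicit newlines, and the final rtrim() call is an addition (not part of the pattern above) that drops the space left in front of the removed <br>.

```php
<?php
// Sample text from the question, rebuilt with explicit newlines.
$txt = "Lorem ipsum dolor sit amet, <br>\n"
     . "consectetur adipisicing elit, <br>\n"
     . "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>\n"
     . "Some text I want to remove because it is useless.";

// Match a <br> that is followed by no other <br> before "useless." or the end.
$pattern = '~<br>(?>[^<u]++|<(?!br>)|u(?!seless))*(?>useless\.?|$)~';
$result = preg_replace($pattern, '', $txt);

// rtrim() is an extra step: it removes the space that preceded the last <br>.
echo rtrim($result), "\n";
```

The first two <br> tags are not removed because the loop stops at the next <br> without having reached "useless" or the end of the string, so the match attempt fails there and the engine moves on.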

Upvotes: 0

tchow002

Reputation: 1098

$modified_text = preg_replace('/^(.*)<br>(.*)$/s', '$1', $original_text);

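This sets $modified_text to everything before the last <br> in $original_text. The first .* is greedy, so it backtracks only as far as the last <br>. Note that everything after that <br> is removed, whether or not it ends with "useless.".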
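
To see the greedy match at work, here is a small sketch using a made-up three-line string (the input is an assumption for illustration, not the question's text). The first .* initially consumes the whole input and then backtracks until a <br> can match, which is necessarily the last one.

```php
<?php
// Hypothetical input: two <br> tags followed by trailing text.
$original_text = "first line <br>\nsecond line <br>\ntrailing text to drop";

// The s modifier lets . match newlines; the greedy (.*) stops at the last <br>.
$modified_text = preg_replace('/^(.*)<br>(.*)$/s', '$1', $original_text);

// Shows the kept part: everything up to, but not including, the last <br>.
var_dump($modified_text);
```

The first <br> survives because it is captured inside $1; only the last <br> and the text after it are dropped.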

Upvotes: 1

falsetru

Reputation: 369034

$text = <<< EOD
Lorem ipsum dolor sit amet, <br>
consectetur adipisicing elit, <br>
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>
Some text I want to remove because it is useless.
EOD;

echo(preg_replace('/(?s)<br>(?!.*<br>).*useless/', '', $text));

The above code prints:

Lorem ipsum dolor sit amet, <br>
consectetur adipisicing elit, <br>
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. .

Use a negative lookahead, (?!.*<br>), to match only the last <br>. Note that the pattern stops at "useless", so the final period stays in the output.
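
If the trailing period should go as well, as the question asks, one possible variant is to append an optional \.? to the pattern. The following is a sketch under that assumption. The $cleaned variable and the rtrim() call are additions for illustration, and the sample string is rebuilt with explicit newlines instead of a heredoc.

```php
<?php
$text = "Lorem ipsum dolor sit amet, <br>\n"
      . "consectetur adipisicing elit, <br>\n"
      . "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br>\n"
      . "Some text I want to remove because it is useless.";

// Same pattern as above, plus \.? to consume the period after "useless".
$cleaned = preg_replace('/(?s)<br>(?!.*<br>).*useless\.?/', '', $text);

// rtrim() is an extra step that drops the space left before the removed <br>.
echo rtrim($cleaned), "\n";
```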

Upvotes: 1
