Reputation: 15166
I still consider myself a newbie to regex and have the following challenge:
My users post content that contains one or more "line breaks" at the end. These "line breaks" are <p><br></p> with varying amounts of whitespace between the tags. Sometimes there is more than one <br> in each paragraph. Some examples:
<p>
<br>
</p>
<p>
<br>
</p>
<p><br> <br>
</p>
<p>
<br>
</p>
How can I remove these paragraphs from the end of each piece of content, along with the <br>s, spaces, line breaks, and tabs they contain?
Upvotes: 1
Views: 157
Reputation: 18785
<?php
$strings[] = 'foo<p>
<br>
</p>';
$strings[] = 'foo<p>
<br>
</p>';
$strings[] = 'foo<p><br><br>
</p>';
$strings[] = 'foo<p>
<br>
</p>';
foreach ($strings as $string) {
    // \s* matches any number of whitespace characters (" ", \t, \n, etc)
    // (?:...)+ matches one or more (without capturing the group)
    // $ forces match to only be made at the end of the string
    $string = preg_replace("/(?:<p>\s*(?:<br>\s*)+<\/p>\s*)+$/", "", $string);
    echo $string."\n---\n";
}
Output is:
foo
---
foo
---
foo
---
foo
---
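If the posted content might also contain self-closing or uppercase tags (<br/>, <br />, <BR>), the same pattern can be loosened a little. This is only a sketch under that assumption, which the question does not mention; the function name stripTrailingBreakParagraphs is made up for illustration.

```php
<?php
// Hypothetical helper, not part of the original answer: removes trailing
// paragraphs that contain nothing but <br> tags and whitespace.
function stripTrailingBreakParagraphs(string $html): string
{
    // <br\s*\/?> also accepts <br/> and <br /> (an assumption about the input);
    // the i modifier makes the tag names case-insensitive;
    // $ keeps the match anchored to the end of the string.
    $pattern = '/(?:<p>\s*(?:<br\s*\/?>\s*)+<\/p>\s*)+$/i';
    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error, so fall back to the input.
    return $result ?? $html;
}

echo stripTrailingBreakParagraphs("foo<P>\n<br />\n</P>\n<p><BR></p>\n"), "\n";
```

Paragraphs that contain any real text, and empty paragraphs that are not at the very end, are left alone, because every character between <p> and </p> has to be a <br> tag or whitespace for the match to succeed.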
Upvotes: 1