hakki

Reputation: 6519

Remove <p> tags which have less than two characters with preg_replace

I want to remove <p> </p> tags that contain fewer than 2 characters. For example:

$myText = "<p>hello world</p> <p>-</p> <p> </p>";

<p>-</p> and <p> </p> should be removed from $myText because they contain fewer than 2 characters.

My pattern is:

$output = preg_replace("/\b<p>[a-z]{1,2}</p>\b/","", $myText);

But when I echo $output, I see nothing. What is the problem here?

Upvotes: 1

Views: 185

Answers (5)

Christian

Reputation: 1577

Try /<p>.{0,2}<\/p>/ as the regex pattern. You need to escape the / in the closing </p>, because / is also the pattern delimiter.

This pattern also matches any character (except a newline), not just lowercase letters.

It also matches two or fewer characters (including zero). I added that assuming you wanted it, then re-read the question and realized you may not. It is easy to change: use {0,1} to match fewer than two characters.
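
As a rough illustration of the quantifier choice, the sketch below runs both bounds against a slightly extended sample. The extra <p>ab</p> and <p></p> elements and the variable names $fewerThanTwo and $twoOrFewer are assumptions added for the demonstration; they are not part of the question.

```php
<?php
// Sample from the question plus two hypothetical edge cases:
// a two-character paragraph and an empty one.
$myText = "<p>hello world</p> <p>-</p> <p> </p> <p>ab</p> <p></p>";

// {0,1}: fewer than two characters (zero or one).
$fewerThanTwo = preg_replace('/<p>.{0,1}<\/p>/', '', $myText);

// {0,2}: two or fewer characters, so <p>ab</p> is removed as well.
$twoOrFewer = preg_replace('/<p>.{0,2}<\/p>/', '', $myText);

var_dump($fewerThanTwo); // still contains <p>ab</p>
var_dump($twoOrFewer);   // only <p>hello world</p> and spaces remain
```

The spaces that separated the removed paragraphs are left behind in both results, so a trim() or a whitespace cleanup may still be wanted afterwards.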

Upvotes: 1

HenryTK

Reputation: 1287

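The other answers here are close, but you should also allow for attributes on the opening tag (with a lazy quantifier) and add the s modifier so that . matches newlines, in case the content of your <p> tags spans more than one line. Note that m only changes where ^ and $ match, so it does not help here:

$html = "<p>hello world</p> <p>-</p> <p> </p>";
$replaced = preg_replace("/<p([^>]*?)>.{0,1}<\/p>/s","", $html);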
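
Regarding the modifiers, a small sketch suggests that s, rather than m, is what lets . cross a line break. The input with a newline inside the paragraph and the names $withM and $withS are made up for this comparison.

```php
<?php
// Hypothetical input where the short content is a single newline.
$html = "<p>hello world</p> <p>\n</p>";

// With m, only ^ and $ change meaning; the dot still does not match "\n".
$withM = preg_replace('/<p>.{0,1}<\/p>/m', '', $html);

// With s (dotall), the dot matches "\n", so the paragraph is removed.
$withS = preg_replace('/<p>.{0,1}<\/p>/s', '', $html);

var_dump($withM === $html); // true: nothing was removed
var_dump($withS);           // the second paragraph is gone
```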

Upvotes: 0

CrayonViolent

Reputation: 32517

$output = preg_replace("/\b<p>[a-z]{1,2}</p>\b/","", $myText);

You have several errors here:

  • You use / as the delimiter but do not escape it in the pattern. You should have gotten an error from this; if you see nothing, you must have error reporting turned off. It's generally a good idea to turn error reporting on when developing.
  • [a-z] only matches lowercase letters, not "anything".
  • You said fewer than 2 characters, so it should match 0 or 1 characters, but your range matches 1 or 2 characters.
  • \b is unnecessary and may actually cause some things not to match, depending on the context.
  • You do not account for the content spanning multiple lines (which is likely to be the case in the real context).

Try this:

$output = preg_replace("~<p>.?</p>~s","", $myText);
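
The delimiter problem can be checked directly. The sketch below is one way to do it, not the only one: it captures warnings with a custom handler (the $warnings array and the closure are illustrative additions) and shows that preg_replace returns null for the original pattern, which is why echo prints nothing.

```php
<?php
// Collect warnings instead of printing them, so the message can be inspected.
$warnings = [];
set_error_handler(function (int $severity, string $message) use (&$warnings): bool {
    $warnings[] = $message;
    return true; // mark the warning as handled
});

$myText = "<p>hello world</p> <p>-</p> <p> </p>";

// Pattern from the question: the "/" in "</p>" ends the pattern early,
// so the characters after it are read as modifiers.
$broken = preg_replace("/\b<p>[a-z]{1,2}</p>\b/", "", $myText);

restore_error_handler();

var_dump($broken);  // NULL, which echoes as an empty string
print_r($warnings); // contains the "Unknown modifier" warning
```

Comparing the result against null before using it is a cheap guard even when error reporting is enabled.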

Upvotes: 4

Andy Lester

Reputation: 93725

Your sample text shows strings you want to remove "because of less than 2 chars":

$myText = "<p>hello world</p> <p>-</p> <p> </p>";

But your pattern is checking for one or two lowercase letters:

$output = preg_replace("/\b<p>[a-z]{1,2}</p>\b/","", $myText);

Your examples do not have lowercase letters in them.
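
A quick check of the character class against the actual contents makes the mismatch visible. This is only a sketch: the "ab" value is a hypothetical addition for contrast, and $results is an illustrative name.

```php
<?php
// Contents of the two short paragraphs from the question, plus a
// hypothetical lowercase value for comparison.
$contents = ["-", " ", "ab"];

$results = [];
foreach ($contents as $content) {
    // Anchored so the whole string must be one or two lowercase letters.
    $results[$content] = preg_match('/^[a-z]{1,2}$/', $content) === 1;
}

var_dump($results); // "-" and " " do not match; "ab" does
```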

Upvotes: 0

Rizier123

Reputation: 59701

Your first problem is that you have to escape the slash in your regex; otherwise PHP treats it as the closing delimiter, and you should get a warning:

Warning: preg_replace(): Unknown modifier 'p'

So you probably don't have error reporting turned on.

Second, you want to remove tags like <p>-</p> and <p> </p>, but you only allow a-z between the two tags.

So change your code to something like this:

<?php

    $myText = "<p>hello world</p> <p>-</p> <p> </p>";
    $output = preg_replace("/<[^>]+>.{0,1}<\/[^>]+>/","", $myText);

    highlight_string($output);

?>

output:

<p>hello world</p>  

Upvotes: 1
