Reputation: 1323

How to remove redundant <br /> tags from HTML code using PHP?

I'm parsing some messy HTML code with PHP in which there are some redundant
<br /> tags, and I would like to clean them up a bit. For instance:

<br>

<br /><br /> 


<br>

How would I replace something like that with this using preg_replace()?:

<br /><br />

Newlines, spaces, and the differences between <br>, <br/>, and <br /> would all have to be accounted for.

Edit: Basically I'd like to replace every instance of three or more successive breaks with just two.

Upvotes: 4

Answers (5)

H9kDroid

Reputation: 1824

Here is something you can use. The first line finds any run of two or more <br> tags (of any variant, with whitespace in between) and replaces it with a well-formatted <br /><br />.

I also included the second line to clean up the rest of the <br> tags, if you want that too.

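function clean($txt)
{
    // Collapse any run of two or more <br>, <br/> or <br /> tags into exactly two.
    $txt = preg_replace('~(?:<br\s*/?>\s*){2,}~i', '<br /><br />', $txt);
    // Normalize every remaining tag (and its trailing whitespace) to <br />.
    $txt = preg_replace('~<br\s*/?>\s*~i', '<br />', $txt);
    return $txt;
}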

Upvotes: 6

Karl Andrew

Reputation: 1555

This should work, using a minimum quantifier ({3,}):

preg_replace('/(<br[\s]?[\/]?>[\s]*){3,}/', '<br /><br />', $multibreaks);

Should match appalling mixed <br> constructions too.
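
For illustration, here is a minimal sketch of the "three or more becomes two" rule wrapped in a function. The name collapse_breaks, the case-insensitive flag, and the handling of &nbsp; between tags are assumptions added for this example, not part of the answer above.

```php
<?php
// Hypothetical helper: collapse runs of three or more <br> variants
// into exactly two, leaving shorter runs untouched.
function collapse_breaks(string $html): string
{
    // <br\s*/?>       matches <br>, <br/> and <br /> (any case, via the i flag)
    // (?:\s|&nbsp;)*  skips spaces, newlines and &nbsp; entities after each tag
    $pattern = '~(?:<br\s*/?>(?:\s|&nbsp;)*){3,}~i';
    return preg_replace($pattern, '<br /><br />', $html);
}

// Sample input modelled on the question (the &nbsp; is an assumption).
$messy = "one<br>\n\n<br /><br />&nbsp;\n\n\n<br>two";
echo collapse_breaks($messy), "\n";
```

A run of exactly two breaks falls below the {3,} minimum, so it is left as it is. Whitespace that follows the last tag in a matched run is consumed along with it.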

Upvotes: 5

Matthew Riches

Reputation: 2286

Use str_replace; it's much better for simple replacements, and you can also pass an array instead of a single search value.

$newcode = str_replace("<br>", "", $messycode);
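
As a rough sketch of the array form mentioned above, the snippet below normalizes three common spellings of the tag to <br />. The function name normalize_breaks and the sample input are made up for this example. Note that str_replace works on fixed strings only, so it cannot count how many breaks occur in a row.

```php
<?php
// Hypothetical helper: rewrite the listed <br> spellings as <br />.
function normalize_breaks(string $html): string
{
    // Each search string is replaced, in order, by the single replacement.
    $variants = ['<br>', '<br/>', '<br />'];
    return str_replace($variants, '<br />', $html);
}

// Illustrative input, not taken from the question.
$messycode = "a<br>b<br/>c<br />d";
$newcode = normalize_breaks($messycode);
echo $newcode, "\n";
```

This covers only the exact spellings listed. Uppercase tags or extra spaces inside the tag would need str_ireplace or a regular expression.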

Upvotes: 0

Teneff

Reputation: 32158

This will remove all breaks, even if they're in uppercase:

preg_replace('/<br[^>]*>/i', '', $string);

Upvotes: 3

hsz

Reputation: 152216

Try with:

preg_replace('/<br\s*\/?>/', '', $inputString);

Upvotes: 0
