Reputation: 1323
I'm parsing some messy HTML code with PHP in which there are some redundant
tags and I would like to clean them up a bit. For instance:
<br>
<br /><br />
<br>
How would I replace something like that with the following, using preg_replace()?
<br /><br />
Newlines, spaces, and the differences between <br>, <br/>, and <br /> would all have to be accounted for.
Edit: Basically I'd like to replace every instance of three or more successive breaks with just two.
Upvotes: 4
Views: 8098
Reputation: 1824
Here is something you can use. The first line finds every run of two or more <br> tags (in any of their forms, with optional whitespace between them) and replaces it with a well-formed <br /><br />.
I also included the second line, which normalizes the remaining <br> tags to <br />, if you want that too.
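function clean($txt)
{
    // Collapse any run of two or more <br> variants (with optional whitespace) into exactly two.
    $txt = preg_replace("{(<br[\\s]*(>|\/>)\s*){2,}}i", "<br /><br />", $txt);
    // Normalize every remaining <br> variant to <br />.
    $txt = preg_replace("{(<br[\\s]*(>|\/>)\s*)}i", "<br />", $txt);
    return $txt;
}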
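As a quick check of the two-step idea, here is a minimal, self-contained sketch. The function name collapse_breaks, the `~` delimiters, and the sample string are assumptions made for illustration; the patterns are a compact equivalent of the ones above, except that the second step leaves trailing whitespace in place.

```php
<?php
// Hypothetical helper; not the answer's original function.
function collapse_breaks(string $html): string
{
    // Step 1: a run of two or more <br> variants, separated by optional
    // whitespace, becomes exactly two well-formed breaks.
    $html = preg_replace('~(?:<br\s*/?>\s*){2,}~i', '<br /><br />', $html);
    // Step 2: every remaining <br> variant is normalized to <br />.
    return preg_replace('~<br\s*/?>~i', '<br />', $html);
}

$messy = "one<br>\n<br /><br />\n<BR/>two<br>three";
echo collapse_breaks($messy), "\n"; // one<br /><br />two<br />three
```

Because step 1 already emits well-formed tags, step 2 only changes the lone breaks that step 1 skipped.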
Upvotes: 6
Reputation: 1555
This should work, using a minimum quantifier ({3,} means "three or more"):
preg_replace('/(<br\s*\/?>\s*){3,}/i', '<br /><br />', $multibreaks);
It should match appalling <br><br /><br/><br> constructions too.
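To see what the {3,} quantifier does in practice, the sketch below runs a pattern in the same spirit against a run of three breaks and a run of two. The `~` delimiters, the `\s*` whitespace handling, the `i` flag, and the sample strings are choices made for this sketch.

```php
<?php
// Assumed pattern for illustration: three or more <br> variants in a row.
$pattern = '~(?:<br\s*/?>\s*){3,}~i';

$three = "a<br><br /><br/>b";
$two   = "a<br><br/>b";

echo preg_replace($pattern, '<br /><br />', $three), "\n"; // a<br /><br />b
echo preg_replace($pattern, '<br /><br />', $two), "\n";   // a<br><br/>b (unchanged)
```

Note the trade-off: with {3,}, a run of exactly two breaks is left as written, so mixed forms such as <br><br/> are not normalized.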
Upvotes: 5
Reputation: 2286
Use str_replace; it's much better for simple replacements, and you can also pass an array of search values instead of a single one.
$newcode = str_replace("<br>", "", $messycode);
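For the array form mentioned above, a minimal sketch might look like this. The sample string is an assumption; note that this removes the listed tags rather than collapsing runs of them, and that str_replace is case-sensitive and only matches the exact spellings given.

```php
<?php
$messycode = "a<br>b<br/>c<br />d";

// Each search string in the array is replaced with the same replacement.
$newcode = str_replace(["<br>", "<br/>", "<br />"], "", $messycode);

echo $newcode, "\n"; // abcd
```

Variants that are not listed, such as <BR> or a tag followed by a newline, would pass through untouched, which is why the regex-based answers are a closer fit for the question.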
Upvotes: 0
Reputation: 32158
This will remove all breaks, even if they are in uppercase:
preg_replace('/<br[^>]*>/i', '', $string);
Upvotes: 3