Reputation: 46050
Is there any way to make HTML Purifier preserve the implicit spaces that would typically be seen in rendered HTML?
For example, you would typically expect a space between Foo and Bar in the following cases:
Foo<br/>Bar
<div>Foo</div><div>Bar</div>
Upvotes: 4
Views: 1380
Reputation: 8621
It looks like HTML Purifier is not removing whitespace; it is removing the tags altogether because it doesn't recognize them (which is weird).
For Foo<br/>Bar:
Error Line 1, Column 3: Unrecognized <br /> tag removed
For <div>Foo</div><div>Bar</div>:
Error Line 1, Column 0: Unrecognized <div> tag removed
Error Line 1, Column 8: Unrecognized </div> tag removed
Error Line 1, Column 14: Unrecognized <div> tag removed
Error Line 1, Column 22: Unrecognized </div> tag removed
You can see this by enabling CollectErrors on the Live Demo.
Maybe try allowing div and br: http://htmlpurifier.org/live/configdoc/plain.html#HTML.AllowedElements
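The allow-list suggestion could be sketched roughly like this. It is only a minimal example: the `require_once` path is an assumption about how HTML Purifier is installed, and the allowed element list is just the two tags from the question. Note that this keeps the tags in the output rather than turning them into spaces, so a text conversion step would still be needed afterwards.

```php
<?php
// Assumption: HTML Purifier is installed and this autoloader path fits your setup.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// Allow only the two elements from the question; other tags are still removed.
$config->set('HTML.AllowedElements', 'div,br');

$purifier = new HTMLPurifier($config);

// With div allowed, the markup should survive purification instead of being stripped.
echo $purifier->purify('<div>Foo</div><div>Bar</div>');
```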
Upvotes: 1
Reputation: 76
I have a cruel plan: add a space after every closing tag, strip the tags, and then collapse repeated whitespace.
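<?php
$text = '<div>test</div><div>me</div>';
// Append a space after every closing tag (case-insensitive, digits allowed, e.g. </h1>).
$text = preg_replace('/(<\/[a-z][a-z0-9]*>)/i', '$1 ', $text);
// Drop all tags, collapse whitespace runs into one space, trim the ends.
$text = trim(preg_replace('/\s+/', ' ', strip_tags($text)));
var_dump($text);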
Returns
string(7) "test me"
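The snippet above does not cover `<br/>`, which has no closing tag. A possible variation, sketched here under assumptions, puts a space in front of `<br>` and a handful of block-level tags only, so inline tags such as `<b>` do not split words. The function name `html_to_spaced_text` and the tag list are hypothetical choices for illustration, not part of HTML Purifier.

```php
<?php
// Hypothetical helper: flatten HTML to text, keeping a space where a
// <br> or block-level tag would visually separate words.
function html_to_spaced_text(string $html): string
{
    // Assumed tag list; extend it to match the markup you expect.
    $pattern = '#</?(?:br|div|p|li|tr|h[1-6])\b[^>]*>#i';
    // Put a space in front of each matching opening or closing tag.
    $spaced = preg_replace($pattern, ' $0', $html);
    // Drop all tags, collapse whitespace runs into one space, trim the ends.
    return trim(preg_replace('/\s+/', ' ', strip_tags($spaced)));
}

echo html_to_spaced_text('Foo<br/>Bar'), "\n";                  // Foo Bar
echo html_to_spaced_text('<div>Foo</div><div>Bar</div>'), "\n"; // Foo Bar
```

Like the original, this relies on regular expressions rather than a real HTML parser, so it is best suited to markup that has already been cleaned.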
Upvotes: 1