Reputation: 3736
I'm having some problems using PHP's strip_tags function when the string contains 'less than' and 'greater than' signs. For example:
If I do:
strip_tags("<span>some text <5ml and then >10ml some text </span>");
I'll get:
some text 10ml some text
But, obviously I want to get:
some text <5ml and then >10ml some text
Yes, I know that I could use &lt; and &gt;, but I don't have the chance to convert those characters into HTML entities, since the data is already stored as you can see in my example.
What I'm looking for is a clever way to parse HTML in order to get rid of only the actual HTML tags.
Since TinyMCE was used to generate that data, I know which actual HTML tags could be used in any case, so a strip_tags($string, $black_list) implementation would be more useful than strip_tags($string, $allowable_tags).
Any thoughts?
Upvotes: 8
Views: 7034
Reputation: 101
Following up on the accepted answer, which uses a heuristic regular expression to remove tags while sparing < and > signs, here is a version that uses preg_replace_callback, since the /e modifier in preg_replace was deprecated in PHP 5.5 and removed in PHP 7.0:
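function HTMLToString($string) {
    // Escape angle brackets that do not look like part of a tag,
    // strip the real tags, then turn the escaped brackets back into < and >.
    $escaped = preg_replace_callback(
        "# <(?![/a-z]) | (?<=\s)>(?![a-z]) #xi",
        function ($matches) {
            return htmlentities($matches[0]);
        },
        $string
    );
    return htmlspecialchars_decode(strip_tags($escaped));
}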
Upvotes: 0
Reputation: 19251
Instead of strip_tags(), just use htmlspecialchars().
http://php.net/manual/en/function.htmlspecialchars.php
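To make the trade-off concrete, here is a minimal sketch of what htmlspecialchars() does to the string from the question. Note that it escapes every angle bracket rather than removing anything, so the span tags end up as visible text when the result is rendered as HTML; whether that is acceptable depends on where the output is used.

```php
<?php
// The example string from the question.
$input = "<span>some text <5ml and then >10ml some text </span>";

// htmlspecialchars() escapes all angle brackets, including those of real tags.
$escaped = htmlspecialchars($input);

echo $escaped, PHP_EOL;
// &lt;span&gt;some text &lt;5ml and then &gt;10ml some text &lt;/span&gt;
```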
Upvotes: 2
Reputation: 145482
As a wacky workaround you could filter non-html brackets with:
$html = preg_replace("# <(?![/a-z]) | (?<=\s)>(?![a-z]) #exi", "htmlentities('$0')", $html);
Apply strip_tags() afterwards. Note that this only works for your specific example and similar cases. It's a regular expression with some heuristics, not artificial intelligence that can discern HTML tags from unescaped angle brackets with other meanings.
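Since the question says the set of tags TinyMCE can produce is known, another heuristic along the same lines is to remove only those tag names and leave every other bracket alone. The sketch below assumes a hypothetical helper named strip_listed_tags (not a PHP built-in) and an illustrative tag list; it is equally regex-based, so it will misfire on attribute values that contain ">" or on plain text that happens to look like a listed tag.

```php
<?php
/**
 * Remove only the listed tags (opening, closing and self-closing forms),
 * leaving any other "<" or ">" untouched.
 * Hypothetical helper for illustration, not part of PHP.
 */
function strip_listed_tags(string $html, array $tags): string
{
    if (count($tags) === 0) {
        return $html; // nothing to remove
    }

    // Quote each tag name so it is safe inside the pattern.
    $names = implode('|', array_map(function ($tag) {
        return preg_quote($tag, '#');
    }, $tags));

    // Match <tag ...>, </tag> and <tag/> for the listed names only.
    // \b keeps "p" from matching the start of "pre".
    $pattern = '#</?(?:' . $names . ')\b[^>]*>#i';

    return preg_replace($pattern, '', $html);
}

// The tag list here is an assumption; use whatever your editor config allows.
$input  = "<span>some text <5ml and then >10ml some text </span>";
$result = strip_listed_tags($input, ['span', 'p', 'strong', 'em', 'br']);

echo $result, PHP_EOL;
// some text <5ml and then >10ml some text
```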
Upvotes: 6
Reputation: 92762
If you want to have "greater than" and "less than" signs, you need to escape them:
&gt; is >
&lt; is <
See e.g. this: http://www.w3schools.com/html/html_entities.asp
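As a small sketch of why escaping helps, assuming the data had been stored with entities in the first place: strip_tags() leaves &lt; and &gt; alone, so only the real tags are removed, and html_entity_decode() can turn the entities back into plain characters afterwards if plain text is what you need.

```php
<?php
// Same text as in the question, but with the bare signs stored as entities.
$stored = "<span>some text &lt;5ml and then &gt;10ml some text </span>";

// Only the span tags are removed; the entities survive untouched.
$text = strip_tags($stored);

// Optional: decode the entities if the result is used as plain text.
$plain = html_entity_decode($text);

echo $text, PHP_EOL;   // some text &lt;5ml and then &gt;10ml some text
echo $plain, PHP_EOL;  // some text <5ml and then >10ml some text
```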
Upvotes: 4