Reputation:
I'm sure someone has covered this before, but I didn't find it in a quick search of the site. Right now I'm trying to filter some input from a WYSIWYG editor so that it removes characters like ¢©÷µ·¶±€£®§™¥ but keeps HTML characters. I've tried htmlentities and htmlspecialchars, but both still seem to leave those characters intact. Are there any built-in methods for this, or does anybody have a good regex that would handle it? Thanks!
Upvotes: 1
Views: 703
Reputation: 5014
htmlentities() and htmlspecialchars() aren't going to work for you if you want to remove those characters completely, rather than just converting them to HTML entities.
EDIT
I just noticed that at one point you said you want to preserve HTML entities. If that's the case, use htmlentities()! It will convert all those symbols into their HTML entity equivalents. If you echo the result in a browser, you'll still see the characters you tried to remove, but if you view the source, you'll see the &name; formatted entity instead.
You may need to use a regex for this, as sad as that is. Most PHP functions try to preserve those characters for you in one format or another. It's surprising that there isn't a function to remove them, at least not one that I know of!
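If removing really is the goal, one blunt option is to strip every non-ASCII character with a Unicode-aware regex. This is only a minimal sketch, assuming the input is valid UTF-8; the function name strip_non_ascii is made up for illustration. Note that it also drops accented letters, so it only fits if plain ASCII output is acceptable.

```php
<?php
// Hypothetical helper (not a PHP built-in): drops every character outside
// the ASCII range while leaving tags such as <p> untouched.
// Assumes $text is valid UTF-8; with the /u modifier preg_replace()
// returns null on malformed input, so callers may want to check for that.
function strip_non_ascii($text)
{
    return preg_replace('/[^\x00-\x7F]+/u', '', $text);
}

echo strip_non_ascii('<p>Price: 5€ © ACME™</p>');
```

The negated class keeps the pattern short and avoids listing every unwanted symbol by hand, at the cost of being much less selective.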
Upvotes: 0
Reputation: 3378
Have you tried the htmlentities() function? Try it like this:
$text = htmlentities($text);
There are some other optional parameters, which you can check out at http://php.net/manual/en/function.htmlentities.php . You might have to set the quote_style and charset ones, at the very least.
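One catch with calling htmlentities() on WYSIWYG output is that it also escapes the markup itself (< becomes &lt;). A possible workaround, sketched here under the assumption that the input is UTF-8, is to encode everything and then decode only the special characters again so the tags survive. The function name encode_symbols_keep_tags is hypothetical.

```php
<?php
// Hypothetical helper: turns symbols like © or € into named entities
// but restores <, > and &, so existing tags still render.
// Assumes UTF-8 input; ENT_NOQUOTES leaves quotes alone in both steps.
function encode_symbols_keep_tags($html)
{
    // Step 1: encode every character that has a named entity.
    $encoded = htmlentities($html, ENT_NOQUOTES, 'UTF-8');
    // Step 2: undo the encoding for &lt;, &gt; and &amp; only.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

echo encode_symbols_keep_tags('<p>© 2010 ACME™, 5€</p>');
```

Because the second step only reverses the special-character escapes, any entities that were already in the input come back out unchanged.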
Upvotes: 0
Reputation: 125
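This regex should work. The u modifier is needed so the multi-byte characters are matched as whole UTF-8 characters rather than as single bytes:
$text = preg_replace('/[¢©÷µ·¶±€£®§™¥]+/u', '', $text);
You could also replace the items with their entities like this. These are plain strings rather than regex patterns, so use str_replace() instead of preg_replace():
$bad = array('©', '®');
$good = array('&copy;', '&reg;');
$text = str_replace($bad, $good, $text);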
Upvotes: 0