Reputation: 49817
Is there a function to remove all non-UTF-8 characters from a string?
Upvotes: 2
Views: 14840
Reputation: 449385
If you have a UTF-8 string that might contain invalid byte sequences, you can use iconv to remove them. This should work:
$text = iconv("utf-8", "utf-8//ignore", $text);
Making them visible with an arbitrary placeholder is a bit tougher. I can't think of any easy way to do that, short of walking through every byte and checking whether it's part of a valid character. The Wikipedia article on UTF-8 provides more info on how to do that.
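For what it's worth, the byte-walking approach could look roughly like the sketch below. It is not a tested library routine: `mark_invalid_utf8` is a made-up helper name, the `?` placeholder is arbitrary, and it assumes the mbstring extension is loaded so that `mb_check_encoding` is available. It guesses the sequence length from the lead byte, validates that many bytes, and swaps in the placeholder for any byte that doesn't start a valid sequence.

```php
<?php
// Hypothetical helper (name and placeholder are illustrative assumptions).
// Replaces every byte that is not part of a valid UTF-8 sequence with
// $placeholder. Assumes the mbstring extension is available.
function mark_invalid_utf8(string $text, string $placeholder = '?'): string
{
    $out = '';
    $len = strlen($text);
    $i = 0;

    while ($i < $len) {
        $byte = ord($text[$i]);

        // Work out the expected sequence length from the lead byte.
        if ($byte < 0x80) {
            $n = 1;                       // ASCII
        } elseif ($byte >= 0xC2 && $byte <= 0xDF) {
            $n = 2;
        } elseif ($byte >= 0xE0 && $byte <= 0xEF) {
            $n = 3;
        } elseif ($byte >= 0xF0 && $byte <= 0xF4) {
            $n = 4;
        } else {
            $n = 0;                       // stray continuation or invalid lead byte
        }

        $seq = $n > 0 ? substr($text, $i, $n) : '';

        // Accept the candidate only if it is complete and valid UTF-8
        // (this also rejects overlong forms and surrogates).
        if ($n > 0 && strlen($seq) === $n && mb_check_encoding($seq, 'UTF-8')) {
            $out .= $seq;
            $i += $n;
        } else {
            // Mark the offending byte and resynchronise at the next one.
            $out .= $placeholder;
            $i += 1;
        }
    }

    return $out;
}
```

With that helper, `mark_invalid_utf8("caf\xC3\xA9 \xFF ok")` returns `café ? ok`: the valid two-byte `é` is kept and the lone `0xFF` byte is replaced. Advancing one byte at a time after a failure means a truncated multi-byte sequence produces one placeholder per bad byte, which may or may not be what you want.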
Upvotes: 9