Reputation: 1012
I have an issue where my client pastes content from Word into the simple text editor in my CMS.
The double quotes come back out as garbage, which looks like some kind of UTF encoding problem.
Is there a way to strip or replace these characters with PHP when they are read from my MySQL table and displayed?
Here is the link to the page that spits out the dodgy characters; you can see the 'black diamonds of doom' that are causing the headaches.
http://linq.milkbarstudios.com/news_detail.php?id=3
Any suggestions would be greatly appreciated!
Upvotes: 2
Views: 756
Reputation: 1012
I was actually looking for PHP to replace the dodgy characters.
In the end I found this, which fixes it perfectly:
$output = preg_replace('/[^\x20-\x7F]+/', '', $output);
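Note that the pattern above removes every byte outside the \x20-\x7F range, so the quotes vanish completely, along with accented letters, newlines and tabs. If the goal is to keep the quotes, a gentler option is to map Word's "smart" punctuation to plain ASCII. The following is only a sketch: the function name normalize_word_punctuation is made up for illustration, the list of characters is a guess at what Word typically inserts, and it assumes the text is already valid UTF-8 and that PHP 7 or later is available (for the "\u{...}" escapes).

```php
<?php
// Hypothetical helper (not from the original post): replace common
// Word "smart" punctuation with plain ASCII equivalents.
// Assumes $text is valid UTF-8.
function normalize_word_punctuation(string $text): string
{
    $map = [
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark / apostrophe
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
        "\u{2026}" => '...', // horizontal ellipsis
    ];

    // strtr() with an array replaces each key with its value in one pass.
    return strtr($text, $map);
}

$output = normalize_word_punctuation("\u{201C}Hello\u{201D}, it\u{2019}s me");
```

Unlike the regular expression, this leaves any other non-ASCII characters (for example accented letters) untouched.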
Upvotes: 0
Reputation: 328556
This sounds like a bug in your code. When handling text data, you must always consider the encoding and convert back and forth as necessary. So when the browser sends you UTF-8, you must decode the string before you send it to the database (MySQL does support UTF-8 in text columns). That way, the original text will be preserved. Of course, you must do the same when you render the page for the browser: set the charset to UTF-8, make sure you actually send UTF-8, and so on.
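As a rough illustration of "convert as necessary", here is a minimal sketch that normalises incoming text to UTF-8 before it is stored or displayed. Everything specific in it is an assumption: the function name to_utf8 is invented, the guess that non-UTF-8 input is Windows-1252 (the usual encoding of text pasted from Word on Windows) may not hold for your data, and it relies on the mbstring extension being enabled.

```php
<?php
// Hypothetical helper (not from the original post): make sure a string is
// UTF-8. If it is not valid UTF-8, assume it is Windows-1252 and convert.
// The fallback encoding is a guess; adjust it to match your real input.
function to_utf8(string $text, string $fallback = 'Windows-1252'): string
{
    // Already valid UTF-8: return unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Otherwise reinterpret the bytes as the fallback encoding.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// "\x93" and "\x94" are the Windows-1252 bytes for curly double quotes.
$clean = to_utf8("\x93Hello\x94");
```

For the rest of the chain, the database connection and the page both need to agree on UTF-8 as well, for example by calling set_charset('utf8mb4') on a mysqli connection and sending header('Content-Type: text/html; charset=utf-8') before any output. Whether those exact calls fit depends on how the CMS connects to MySQL, which the question does not show.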
Upvotes: 2