Simon Hume

Reputation: 1012

Cleaning up nasty characters in PHP

I have an issue where my client pastes content from Word into the small text editor in my CMS.

The double quotes come back encoded in what looks like some form of UTF.

Any ideas how I can strip or replace these using PHP when they are read from my MySQL table and displayed?

Here is the link to the page that spits out the dodgy characters, you can see the 'black diamonds of doom' which are causing the headaches.

http://linq.milkbarstudios.com/news_detail.php?id=3

Any suggestions would be greatly appreciated!

Upvotes: 2

Views: 756

Answers (2)

Simon Hume

Reputation: 1012

I was actually looking for PHP to replace the dodgy characters.

In the end I found this, which fixes it for me:

$output = preg_replace('/[^\x20-\x7F]+/', '', $output);
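To make the side effects of that replacement visible, here is a minimal sketch. The function names and the sample string are hypothetical, not part of the original answer. It uses the same byte range as the line above, and adds a variant (my assumption about what is usually wanted) that keeps tabs and line breaks.

```php
<?php
// Hypothetical helper wrapping the replacement from this answer.
// No /u flag is used, so the pattern works byte by byte: every byte
// outside 0x20-0x7F is removed, including all bytes of multi-byte
// UTF-8 characters, and also tabs and newlines.
function strip_non_printable_ascii(string $text): string
{
    return preg_replace('/[^\x20-\x7F]+/', '', $text);
}

// Variant (an assumption, not from the answer) that also keeps
// tab (0x09), line feed (0x0A) and carriage return (0x0D).
function strip_non_ascii_keep_whitespace(string $text): string
{
    return preg_replace('/[^\x09\x0A\x0D\x20-\x7F]+/', '', $text);
}
```

Note that both versions delete the characters instead of converting them: curly quotes and accented letters such as é disappear from the output entirely, so this only suits content that is meant to be plain ASCII.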

Upvotes: 0

Aaron Digulla

Reputation: 328556

This sounds like a bug in your code. When handling text data, you must always know which encoding is in use at each step and convert as necessary. So when the browser sends you UTF-8, you must make sure the string is stored as UTF-8: the column and the database connection both need to use a UTF-8 character set (MySQL does support UTF-8 in text columns), and anything that arrives in a different encoding must be converted first. That way, the original text will be preserved. Of course, you must do the same when you render the page for the browser: declare the charset as UTF-8 and make sure the bytes you actually send are UTF-8.
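As a rough illustration of the "convert as necessary" step, the sketch below normalises incoming text to UTF-8 and escapes it for output. The helper names are hypothetical. It assumes the mbstring extension is available and that any input which is not valid UTF-8 is Windows-1252, which is a guess about Word-pasted content, not something the question confirms.

```php
<?php
// Hypothetical helper: return $text as UTF-8.
// Assumption: input that is not already valid UTF-8 is Windows-1252.
function to_utf8(string $text): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Hypothetical helper: escape text for HTML output, stating the
// encoding explicitly so htmlspecialchars() treats it as UTF-8.
function render_text(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
```

In a setup like this you would call to_utf8() before storing the text and render_text() when printing it. You would also send header('Content-Type: text/html; charset=utf-8') before any output and, if you use mysqli, set the connection charset with set_charset('utf8mb4'). Those last two steps depend on your server and MySQL version, so treat them as suggestions to check, not as a confirmed fix for this page.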

Upvotes: 2
