Carson

Reputation: 4651

MySQL is turning ™ into ª?

I'm exporting an Excel file to CSV and then uploading it to a MySQL database, but every entry that contains a ™ (trademark sign) ends up with a small superscript a (ª) instead.

The database is set to utf8_unicode_ci, as is each column. Any ideas why this is still happening?

Upvotes: 0

Views: 256

Answers (3)

dan04

Reputation: 91227

If there's a 1-1 replacement in the mojibake, it's unlikely that UTF-8 is involved.

It appears that the original data was in one of the Macintosh encodings where '™' encodes to 0xAA, and got misinterpreted as windows-1252 (windows-1254 and -1258 and ISO-8859-1, -9, and -15 are also possible) where 0xAA decodes to 'ª'.
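The byte-level mix-up described above can be reproduced in a few lines. This is only a sketch: it assumes the source really was Mac OS Roman (Python's `mac_roman` codec) and was misread as windows-1252 (`cp1252`). The helper name `repair_mac_roman` and the sample string are made up for illustration.

```python
def repair_mac_roman(text: str) -> str:
    """Undo Mac OS Roman bytes that were wrongly decoded as windows-1252.

    Assumption: every character in `text` came from that one wrong decode.
    Characters that cp1252 cannot encode will raise UnicodeEncodeError.
    """
    return text.encode("cp1252").decode("mac_roman")


original = "Acme\u2122"                # "Acme™"
raw = original.encode("mac_roman")     # the trademark sign becomes byte 0xAA
garbled = raw.decode("cp1252")         # 0xAA read as windows-1252 is "ª"
repaired = repair_mac_roman(garbled)   # reverse the wrong decode

print(raw, garbled, repaired)
```

Because the damage is a one-byte-for-one-byte swap, it is reversible as long as nothing else has re-encoded the text in between. Decoding the file with the right encoding in the first place is still the cleaner fix.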

Upvotes: 1

Zkoh

Reputation: 2942

There's an easier way to do the conversion. If you use Windows, you can download a text editor like Notepad++ for free, and Notepad++ can encode or convert a text file into UTF-8 (on the menu bar, go to Encoding, switch to whichever one you want).

The same is also possible with a Mac editor like TextMate. File > Re-Open With Encoding.

Excel does indeed automatically encode a file generated from Excel in the Windows format. However, if it is used to open a file that uses a different encoding, it should preserve that encoding; it won't convert UTF-8 encoded files into Windows-1252.
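If you would rather script the conversion than use an editor, the same re-encoding step can be sketched in Python. The function name `reencode_to_utf8` and the file names in the usage note are hypothetical, and you have to supply the source encoding yourself, since it cannot be reliably detected from the bytes.

```python
from pathlib import Path


def reencode_to_utf8(src: str, dst: str, source_encoding: str) -> None:
    """Read `src` as `source_encoding` and write the same text to `dst` as UTF-8.

    Works on raw bytes so that line endings are left exactly as they were.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `reencode_to_utf8("export.csv", "export-utf8.csv", "mac_roman")` for a CSV saved by Excel on a Mac, or with `"cp1252"` for one saved on Windows. Upload the UTF-8 copy rather than the original.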

Upvotes: 1

John Parker

Reputation: 54445

The problem is that the Excel CSV file is (most likely) encoded as Windows-1252.

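As such, you'll most likely need to use PHP to convert each item to UTF-8 with a function such as mb_convert_encoding. (mb_convert_variables can also do this, but it converts its arguments in place and returns the source encoding rather than the converted string.)

For example:

$utfVersion = mb_convert_encoding($windowsVersion, 'UTF-8', 'Windows-1252');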

Incidentally, it might still show up incorrectly if you view the MySQL table via the command line tools, etc., but it'll be fine once you retrieve it back into PHP.

Upvotes: 0
