Reputation: 79329
Over the years the databases I use for my research have been migrated from Sybase to MySQL to PostgreSQL and back to MySQL.
The migrations were done carefully to avoid breaking data through encoding issues, but unfortunately a number of records still got damaged.
For example, one of the records says Jòzefina, but it should be Józefina.
Does anyone know if I could fix this particular encoding problem programmatically?
I'm not that strong in encodings, but it looks like I could somehow map the byte sequence ò to ó, and so on.
I wonder if anyone knows in which encoding ò corresponds to ó, so that I don't have to create the mapping table from broken text to correct text by hand and can generate it automatically instead.
Upvotes: 2
Views: 262
Reputation: 5843
Convert the table's character set and collation to Unicode (UTF-8). Do this:
ALTER TABLE `t1` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Your table is using Windows-1252. As you can see, ò is byte F2 there, which is also its Unicode code point:
Dec: 242
Hex: F2
Unicode: ò (U+00F2)
Windows-1252: ò
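To check the byte values listed above and to test whether any single re-encoding actually turns ò into ó, here is a minimal Python sketch. The helper names (`find_codec_pairs`, `repair`) and the list of candidate codecs are my own assumptions, not something taken from your database; extend the list with whatever encodings your Sybase, MySQL and PostgreSQL installations may have used. Note that in UTF-8 itself ò is stored as the two bytes C3 B2, while F2 is its single-byte value in Windows-1252.

```python
# Hypothetical list of encodings to try; adjust to what your servers used.
CANDIDATE_CODECS = [
    "utf-8", "cp1250", "cp1252", "latin-1",
    "iso8859_2", "cp852", "mac_latin2", "mac_roman",
]


def find_codec_pairs(broken, expected, codecs=CANDIDATE_CODECS):
    """Return (wrong, right) codec pairs where bytes written as `right`
    but read as `wrong` would display `broken` instead of `expected`."""
    pairs = []
    for wrong in codecs:
        try:
            raw = broken.encode(wrong)  # recover the bytes behind the broken text
        except UnicodeEncodeError:
            continue  # this codec cannot represent the broken text at all
        for right in codecs:
            if right == wrong:
                continue
            try:
                if raw.decode(right) == expected:
                    pairs.append((wrong, right))
            except UnicodeDecodeError:
                continue  # the bytes are not valid in this codec
    return pairs


def repair(text, wrong, right):
    """Undo one wrong decode step: re-encode with `wrong`, decode with `right`."""
    return text.encode(wrong).decode(right)


if __name__ == "__main__":
    # Code point, Windows-1252 byte, and UTF-8 bytes of the broken character.
    print(ord("ò"), "ò".encode("cp1252"), "ò".encode("utf-8"))
    # prints: 242 b'\xf2' b'\xc3\xb2'

    # Any pair printed here could be applied to whole records with repair().
    print(find_codec_pairs("ò", "ó"))
```

If the search returns an empty list for your broken/correct examples, the damage is probably not a single mis-decoding step, and a hand-made mapping table (or a longer chain of conversions) may be needed after all. Try it on a copy of the data first.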
Upvotes: 1