bodacydo

Reputation: 79329

How to fix accidental encoding problems for some records in a MySQL database?

Over the years, the databases I use for my research have been migrated from Sybase to MySQL to PostgreSQL and back to MySQL.

The migrations were done carefully to avoid encoding damage, but unfortunately a number of records were corrupted anyway.

For example, one of the records says Jòzefina, but it should be Józefina.

Does anyone know if I could fix this particular encoding problem programmatically?

I'm not that strong in encodings, but it looks like I could somehow map the byte sequence ò to ó, and so on.

I wonder if anyone knows in which encoding ò corresponds to ó, so that I don't have to build the mapping table from broken text to correct text by hand and can do it automatically instead.
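
A minimal sketch of how such a search could be automated in Python, assuming the damage came from a single step where text written in one encoding was read back in another. The function name find_codec_pairs and the candidate list are illustrative choices, not something taken from the migrations themselves; extend the list with whatever encodings the old systems might have used.

```python
from itertools import permutations

# Hypothetical shortlist of encodings the old systems might have used.
CANDIDATE_CODECS = [
    "utf-8", "latin-1", "cp1252", "cp1250", "iso8859-2",
    "cp852", "cp437", "cp850", "mac_roman", "mac_latin2",
]

def find_codec_pairs(correct, broken, codecs=CANDIDATE_CODECS):
    """Return (written_as, read_as) pairs for which encoding `correct`
    with written_as and decoding with read_as yields `broken`."""
    pairs = []
    for written_as, read_as in permutations(codecs, 2):
        try:
            if correct.encode(written_as).decode(read_as) == broken:
                pairs.append((written_as, read_as))
        except UnicodeError:
            # The text cannot be represented or read in this codec.
            continue
    return pairs

# Try the known-good / known-bad pair from the damaged record.
print(find_codec_pairs("ó", "ò"))
```

If the search returns nothing for a known pair such as ó / ò, the damage is probably not a plain single-step misdecoding among the listed codecs, and a hand-made mapping table may still be needed.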

Upvotes: 2

Views: 262

Answers (1)

Ruslan Osipov

Reputation: 5843

Change the table's character set and collation to Unicode. Do this:

ALTER TABLE `t1` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

Your table appears to be using Windows-1252, where byte 0xF2 (decimal 242) is ò. As you can see:

Dec: 242
Hex: F2
Unicode: U+00F2 (ò)
Windows-1252: ò
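
The byte values above can be checked with a short Python sketch that decodes a single byte under several code pages. The helper name decode_byte and the list of code pages are assumptions chosen for illustration.

```python
# Hypothetical shortlist of single-byte code pages to compare.
CODE_PAGES = ["cp1252", "latin-1", "iso8859-2", "cp1250"]

def decode_byte(value, codecs=CODE_PAGES):
    """Decode one byte value (0-255) under each codec.
    A codec that leaves the byte undefined maps to None."""
    raw = bytes([value])
    result = {}
    for codec in codecs:
        try:
            result[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            result[codec] = None
    return result

# Compare the byte for ò (0xF2) with its neighbour 0xF3.
for value in (0xF2, 0xF3):
    print(hex(value), decode_byte(value))
```

Comparing 0xF2 with 0xF3 across code pages shows whether the two characters sit on adjacent byte values in the encoding you suspect.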

Upvotes: 1
