Reputation: 1
I have this old database from postgresql version 8.4 that my boss told me to try to convert from SQL_ASCII encoding to UTF8 encoding.
I tried to pg_dump --encoding=UTF8
and I got the message "invalid byte sequence for encoding “UTF8” : 0xe76122"
And searching more i tried pg_dump --encoding=ISO88591
, and it worked, and I could also import it with no problems to the new empty UTF8 database that i created, but from time to time, i'm getting this message: "ERROR: character 0xc296 with encoding "UTF8" has no equivalent in "WIN1252";". Any solutions?
Upvotes: 0
Views: 766
Reputation: 246798
You made a mistake in converting the database.
It must have been encoded in WINDOWS-1252, not ISO 8859-1, and there must have been an “em dash” (Unicode U+2013, code point 96 in WINDOWS-1252).
When you dumped the database with encoding LATIN1
= ISO88591
and loaded it, the byte 0x96 was interpreted as Unicode U+0096, which is 0xC296 in UTF-8. This character does not exist in WINDOWS-1252, so the conversion fails.
You have to dump and restore the database again, but this time use
pg_dump --encoding=WIN1252
Since you also have code point 0x81 in your database, it could aso be WIN1251
(Cyrillic) or WIN1256
(Arabic). Or you have some wild mix – then you must fix the data before migration.
Upvotes: 3