Reputation: 223
My database is UTF-8 (PostgreSQL). I saved 'TESTµTEST' into the database and it's OK. But when I selected this value from the database I saw 'TESTµTEST'.
Moreover, when I made a request select * from tbl where f='TESTµTEST'
, I got this error:
ERROR: invalid byte sequence for encoding "UTF8": 0xb5.
Would you please give me any solutions?
Upvotes: 0
Views: 6966
Reputation: 7274
I ran into the same error when copying tables into PostgreSQL 9.1 which contained the same symbol (tables from the standard nutrient database v26). I recreated the database with the new encoding, but I also had to specify the appropriate locale and template.
CREATE DATABASE testdb
WITH OWNER = postgres
ENCODING = 'LATIN1'
LC_COLLATE = 'eng_canada.28591'
LC_CTYPE = 'eng_canada.28591'
TEMPLATE = template0;
Upvotes: 0
Reputation: 324375
That error shows that you are trying to decode latin-1 text as if it were utf-8. Most likely your client_encoding
setting in PHP doesn't match the encoding of the data you're actually sending.
The string "TESTµTEST" is produced by encoding data from Unicode to a utf-8 byte sequence, then decoding it as latin-1. You can see this in psql:
regress=# select convert_from(convert_to('TESTµTEST','utf-8'),'latin-1');
convert_from
--------------
TESTµTEST
If the PostgreSQL database were utf-8
it would convert latin-1
input to utf-8
if client_encoding
was correctly set to latin-1
. If client_encoding
is incorrectly set to utf-8
and you send latin-1
encoded data, PostgreSQL will refuse to accept it with the message:
invalid byte sequence for encoding "UTF8": 0xb5
... which is what happens when you run that SELECT
you showed. So - I'd say your client is set to client_encoding = 'utf-8'
but your PHP scripts are actually sending latin-1
data. I expect that's because, as @dezso says, you're editing your PHP scripts with a text editor that's using the latin-1
encoding.
To find out which encoding PHP is using, use a PHP database connection to run SHOW client_encoding;
.
To show the database encoding, run:
SELECT d.datname, pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding"
FROM pg_database d WHERE datname = 'my_db_name_here';
Oh, another possibility is that Apache (or whatever) expects your PHP scripts to be utf-8
encoded, but they're actually latin-1
encoded files.
Upvotes: 3