Reputation: 1646
I need to create and connect to a PostgreSQL 9.2 database using SQLAlchemy. So far I can create the full database in UTF-8, but I have trouble inserting non-ASCII characters into it. This is how I connect to the database:
url = URL(drivername='postgresql', username='uname', password='pwd', host='localhost', port='5432', database='postgres')
self.engine = create_engine(url)
Then I create the new database, switch to it, and start to populate it. Everything is fine until a row contains a non-ASCII character; then I get this:
cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (DataError) invalid byte sequence for encoding "UTF8": 0xec2d43
'INSERT INTO province (codice_regione, codice, tc_provincia_id, nome, sigla) VALUES (%(codice_regione)s, %(codice)s, %(tc_provincia_id)s, %(nome)s, %(sigla)s) RETURNING province.id' {'nome': 'Forl\xec-Cesena', 'codice': 40, 'codice_regione': 8, 'tc_provincia_id': 34, 'sigla': 'FC'}
I have the same code for the same database on MySQL 5, where it works perfectly. I don't know what is wrong. I registered the Unicode extension for PostgreSQL, but that did not help. I am puzzled and would appreciate help from somebody more experienced.
Upvotes: 1
Views: 2063
Reputation: 23890
Make sure that any data that can contain international characters is passed as Unicode strings. The string 'Forl\xec-Cesena' that you are trying to insert is encoded in Latin1 (ISO-8859-1). Use unicode('Forl\xec-Cesena', 'Latin1') to convert it to a Unicode string.
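The unicode() call above is Python 2. As a rough sketch of the same conversion in Python 3, assuming the name arrives as raw Latin1 bytes (the variable names here are only illustrative):

```python
# Python 3 sketch: the province name as raw Latin1 (ISO-8859-1) bytes,
# matching the byte string shown in the failing INSERT.
raw_name = b'Forl\xec-Cesena'

# Decode using the encoding the bytes were actually written in.
name = raw_name.decode('latin-1')

# The decoded text can now be encoded as valid UTF-8, which is what
# a UTF-8 database expects to receive.
utf8_bytes = name.encode('utf-8')
```

Passing the decoded text (rather than the raw bytes) as the query parameter lets the driver handle the encoding for the connection.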
Upvotes: 1
Reputation: 61506
The 0xec2d43 sequence corresponds in ISO-8859-1 to the 3 characters ì-C, which are part of the name 'Forlì-Cesena' shown in the error log. So the program is sending valid ISO-8859-1, not UTF-8, while the server expects UTF-8.
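This diagnosis can be checked outside the database. A small Python 3 sketch (the variable names are illustrative) that decodes the three offending bytes both ways:

```python
# The three bytes reported in the PostgreSQL error message.
offending = bytes.fromhex('ec2d43')

# Interpreted as ISO-8859-1, every byte maps to one character.
as_latin1 = offending.decode('iso-8859-1')

# Interpreted as UTF-8, 0xec announces a three-byte sequence, but the
# following 0x2d is not a continuation byte, so decoding fails.
try:
    offending.decode('utf-8')
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```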
The simplest way to fix the problem is to inform the server about the actual encoding, by issuing at the client side this SQL statement:
SET client_encoding=latin1;
Alternatively, convert the data to UTF-8 before passing it to the database, as in @Tometzky's answer.
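As a minimal sketch of issuing that statement from Python, assuming a DBAPI-style connection object with cursor() and execute(): the helper name set_client_encoding is hypothetical and not part of SQLAlchemy or the driver.

```python
def set_client_encoding(dbapi_connection, encoding="latin1"):
    """Hypothetical helper: tell PostgreSQL which encoding the client sends."""
    # Accept only simple encoding names, since the value is inlined
    # into the SQL text rather than passed as a bound parameter.
    if not encoding.replace("_", "").replace("-", "").isalnum():
        raise ValueError("unexpected encoding name: %r" % encoding)
    cursor = dbapi_connection.cursor()
    try:
        # Session-level setting: it affects only this connection.
        cursor.execute("SET client_encoding TO '%s'" % encoding)
    finally:
        cursor.close()
```

With SQLAlchemy, one way to obtain such a connection is engine.raw_connection(). Because the setting is per connection, it would need to be applied to every connection the pool hands out, not just the first one.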
Upvotes: 3