shahzad

Reputation: 21

Python encoding issue

I'm working on a Python script and have a raw string in UTF-8 encoding. First I decode it from UTF-8, then do some processing, and at the end I encode it back to UTF-8 and insert it into the DB (MySQL), but the characters in the DB are not shown in their real form.

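text = '<term>Beiträge</term>'   # raw UTF-8 byte string (Python 2); renamed from str to avoid shadowing the builtin
text = text.decode('utf8')       # bytes -> unicode
...
...
...
text = text.encode('utf8')       # unicode -> UTF-8 bytes, just before the insert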

After that, the string appears in its real form in a text file, but in the MySQL DB I found it like this:

 <term>"Beiträge</term>

Any idea why this happened? :-(

Upvotes: 2

Views: 409

Answers (2)

Halberdier

Reputation: 1204

To make a string a Unicode string, you should use the string prefix 'u'. See also http://docs.python.org/reference/lexical_analysis.html#literals

Maybe your example works by just adding the prefix in the initial assignment.
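A minimal sketch of what the prefix changes, assuming the literal from the question. The variable names are made up for illustration, and the ä is written as the escape \u00e4 so the snippet does not depend on the source file's encoding:

```python
# Unicode literal: the u prefix makes this a text (unicode) object, not raw bytes.
text = u"<term>Beitr\u00e4ge</term>"

# Encode only at the boundary, e.g. just before writing to a file or a database.
encoded = text.encode("utf8")

# The text has 21 characters; the UTF-8 form is one byte longer because
# the single character \u00e4 takes two bytes in UTF-8.
print(len(text), len(encoded))

# Decoding the bytes with the same codec gives back the original text.
assert encoded.decode("utf8") == text
```

With a Unicode literal the initial decode('utf8') step from the question is no longer needed, since the value is already text.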

Upvotes: 0

marr75

Reputation: 5715

Assuming you are using the MySQLdb library, you need to create connections using the keyword arguments:

use_unicode: If True, text-like columns are returned as unicode objects using the connection's character set. Otherwise, text-like columns are returned as normal strings. Unicode objects will always be encoded to the connection's character set regardless of this setting.

and

charset: If supplied, the connection character set will be changed to this character set (MySQL-4.1 and newer). This implies use_unicode=True.
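As a rough sketch, the two keyword arguments could be passed like this. The helper names and the credentials in the usage note are hypothetical; only MySQLdb.connect and its host, user, passwd, db, charset and use_unicode arguments come from the library:

```python
def utf8_connect_kwargs(host, user, passwd, db):
    """Build keyword arguments for MySQLdb.connect() that request UTF-8.

    Hypothetical helper, used here only to keep the settings in one place.
    """
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8",     # set the connection character set
        "use_unicode": True,   # implied by charset, spelled out for clarity
    }


def open_utf8_connection(host, user, passwd, db):
    """Open a MySQLdb connection with the settings above."""
    # Imported here so the helper above can be used without the driver installed.
    import MySQLdb
    return MySQLdb.connect(**utf8_connect_kwargs(host, user, passwd, db))
```

For example, open_utf8_connection("localhost", "someuser", "secret", "mydb") would return a connection that encodes unicode objects as UTF-8 when sending them to the server, assuming those placeholder credentials are replaced with real ones.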

You should also check the encoding of your db tables.
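To illustrate why a mismatch matters, here is a small sketch with no database involved. It assumes the connection or table treats the bytes as Latin-1 and uses Python's latin-1 codec as a stand-in for that; whether this is what happened in the question is a guess:

```python
# The text from the question, with the umlaut written as an escape.
original = u"<term>Beitr\u00e4ge</term>"

# What the script sends: UTF-8 bytes (the umlaut becomes the two bytes C3 A4).
utf8_bytes = original.encode("utf-8")

# What a Latin-1 reader makes of those bytes: each byte becomes one character,
# so the umlaut turns into two unrelated characters.
misread = utf8_bytes.decode("latin-1")
print(misread)

# Reading the same bytes as UTF-8 recovers the original text.
roundtrip = utf8_bytes.decode("utf-8")
print(roundtrip == original)
```

If the stored value looks like the misread form, the data itself is usually intact and only the declared character set of the connection or table is wrong.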

Upvotes: 1
