Andrew G

Reputation: 827

pyodbc returns Unicode characters replaced by plain Latin letters

When connecting to a Transbase source using Python and pyodbc (the connection string is correct and works in other applications), I see that characters like ó, ű, é, á are converted to o, u, e, a.

But if I connect to the same source from MS Access via ODBC, these characters are shown correctly. And if I connect from pyodbc to MS Access (an mdb file with a linked Transbase source), the characters are also shown correctly.

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\1.mdb;Persist Security Info=False

CHARSET=uft8 does not help

How can I change the connection string or other parameters to get these characters displayed correctly?

Upvotes: 3

Views: 844

Answers (1)

slurry

Reputation: 719

There are a few unknowns in your question that make a good answer difficult. It is important to know which character encoding the database you're querying uses. It appears that you're replacing characters that can't be found with equivalent Latin characters.

Encoding the value returned by your query in the same encoding as the database, or in an encoding that has the corresponding characters for ó, ű, é, á, would be a start. So:

# fetch the first row, then encode its first column
query_result = cursor.execute(sql).fetchone()
data = query_result[0].encode('utf8', errors='strict')

errors='strict' will raise an error if a character is not found in the encoding you've chosen (utf8 in the above example). That may help you find the correct encoding of your database.

From the results you're getting, it looks like the code you have in place now is the equivalent of:
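To make the strict check a bit more systematic, here is a minimal sketch that tries several candidate encodings and reports which ones can represent the text. It assumes Python 3 and that the fetched value is already a str; the helper name encodings_that_fit and the list of candidates are made up for illustration and are not part of pyodbc.

```python
# Hypothetical helper: list the candidate encodings that can represent `text`
# without loss, by attempting a strict encode with each one.
def encodings_that_fit(text, candidates=("ascii", "latin-1", "cp1250", "iso8859_2", "utf-8")):
    fits = []
    for name in candidates:
        try:
            text.encode(name, errors="strict")
        except UnicodeEncodeError:
            # At least one character has no representation in this encoding.
            continue
        fits.append(name)
    return fits


# Stand-in for a value fetched from the database (assumption for the demo).
sample = "óűéá"
print(encodings_that_fit(sample))
```

Here latin-1 drops out because it has no ű, which hints at a Central European code page rather than a Western one. This only narrows the candidates; it does not prove which encoding the database really uses.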

data = query_result[0].encode('utf8',errors='replace')

That would replace characters it can't find with "suitable" substitutes, so ó is getting replaced with just "o".
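One caveat worth checking on your side: in Python itself, the built-in 'replace' handler writes "?" for anything it cannot encode, and 'ignore' drops the character. Getting exactly o, u, e, a looks more like accent stripping, which may be happening in the ODBC driver or its configuration rather than in your Python code (that is a guess, not something I can confirm for Transbase). A small sketch, assuming Python 3; strip_accents is a hypothetical helper that only mimics the symptom:

```python
import unicodedata

text = "óűéá"

# Built-in error handlers never produce the base letter.
replaced = text.encode("ascii", errors="replace")  # each character becomes b"?"
ignored = text.encode("ascii", errors="ignore")    # characters are dropped


def strip_accents(value):
    # Decompose each letter into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


stripped = strip_accents(text)

print(replaced, ignored, stripped)
```

If your values come back looking like the stripped result, the conversion is probably done before the data ever reaches .encode().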

Upvotes: 1
