Reputation: 51
I am using Python to extract data from an MSSQL database, using an ODBC connection. I am then trying to put the extracted data into an Excel file, using xlwt.
However this generates the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 20: ordinal not in range(128)
I have run the script to just print the data and established that the offending character in the database is an O with a slash through it. On the python print it shows as "\xd8".
The worksheet encoding for xlwt is set as UTF-8.
Is there any way to have this come straight through into Excel?
Edit
Full error message below:
C:\>python dbtest1.py
Traceback (most recent call last):
File "dbtest1.py", line 24, in <module>
ws.write(i,j,item)
File "build\bdist.win32\egg\xlwt\Worksheet.py", line 1032, in write
File "build\bdist.win32\egg\xlwt\Row.py", line 240, in write
File "build\bdist.win32\egg\xlwt\Workbook.py", line 309, in add_str
File "build\bdist.win32\egg\xlwt\BIFFRecords.py", line 25, in add_str
File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd8 in position 20: invalid
continuation byte
Upvotes: 1
Views: 2534
Reputation: 51
Setting the workbook encoding to 'latin-1' seems to have achieved the same:
wb = xlwt.Workbook(encoding='latin-1')
(It was set at 'UTF-8' before)
The other answer didn't work in my case as there were other fields that were not strings.
Upvotes: 4
Reputation: 8247
The SQL extraction seems to be returning strings encoded using ascii. You can convert them to unicode with:
data = unicode(input_string, 'latin-1')
You can then put them into a spreadsheet with xlwt.
Upvotes: -1