user12401846

Unicode text from Excel in Python 2.7

I am parsing Russian text from an Excel file with xlrd:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
from __future__ import print_function
import cx_Oracle
import csv
import xlrd
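loc = "parse.xls"
# encoding_override is only needed when the file has missing or wrong codepage information
wb = xlrd.open_workbook(loc, encoding_override="cp1251")
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)  # value of the first cell (result is not used here)
print(sheet.row_values(4))  # prints the whole row as a list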

I have a problem with the Russian text. My result:

[u'\u041a\u0430\u0440-\u0422\u0435\u043b ', u'\u0421\u0443\u0449', u'44061AKBAKAY', 1.0, u'']

Upvotes: 0

Views: 42

Answers (1)

Mark Tolonen

Reputation: 177675

Python 2 displays the contents of a list using the debug representation, repr(), which escapes non-ASCII characters. Print the items individually to see them in their printing representation, str(). This can still fail if the terminal's code page or font does not support Cyrillic:

>>> items = [u'\u041a\u0430\u0440-\u0422\u0435\u043b ', u'\u0421\u0443\u0449', u'44061AKBAKAY', 1.0, u'']
>>> items
[u'\u041a\u0430\u0440-\u0422\u0435\u043b ', u'\u0421\u0443\u0449', u'44061AKBAKAY', 1.0, u'']
>>> for item in items:
...  print(item)
...
Кар-Тел
Сущ
44061AKBAKAY
1.0
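
If you want the whole row on one line instead of looping, a minimal sketch along these lines may help. It runs on both Python 2 and Python 3. The helper names format_row and printable are made up for this illustration and are not part of xlrd; the fallback to "ascii" when the terminal encoding is unknown is also an assumption.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def format_row(row, sep=u" | "):
    """Convert every cell to text and join them, so no repr() escapes appear."""
    return sep.join(text_type(cell) for cell in row)


def printable(text, encoding):
    """Replace characters that the given encoding cannot represent with '?'."""
    return text.encode(encoding, "replace").decode(encoding)


row = [u'\u041a\u0430\u0440-\u0422\u0435\u043b ', u'\u0421\u0443\u0449', u'44061AKBAKAY', 1.0, u'']
line = format_row(row)

# Assumption: fall back to ASCII when the output encoding cannot be determined.
encoding = getattr(sys.stdout, "encoding", None) or "ascii"
print(printable(line, encoding))
```

On a terminal whose code page supports Cyrillic the text prints as is. On one that does not, the unsupported characters come out as question marks instead of raising UnicodeEncodeError.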

Switch to Python 3, which is better at displaying Unicode strings and (at least on Windows) doesn't care about the terminal code page. The font still has to support Cyrillic:

>>> items = [u'\u041a\u0430\u0440-\u0422\u0435\u043b ', u'\u0421\u0443\u0449', u'44061AKBAKAY', 1.0, u'']
>>> items
['Кар-Тел ', 'Сущ', '44061AKBAKAY', 1.0, '']
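
To see that the data itself is the same and only the representation differs, here is a small Python 3 sketch. It compares repr(), which keeps printable non-ASCII characters, with the built-in ascii(), which escapes them the way Python 2's repr() does. The variable names are only for illustration.

```python
# Python 3 only: ascii() is a Python 3 built-in.
items = ['\u041a\u0430\u0440-\u0422\u0435\u043b ', '\u0421\u0443\u0449', '44061AKBAKAY', 1.0, '']

readable = repr(items)   # what Python 3 shows for the list
escaped = ascii(items)   # same list, non-ASCII characters escaped

print(readable)
print(escaped)
```

The second line of output has the same \u escapes as the Python 2 result in the question, minus the u prefixes.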

Upvotes: 1
