XinYi Chua

Reputation: 281

Python - store Chinese characters read from Excel

I am trying to read an Excel sheet using xlrd, but I'm having problems storing Chinese characters.

I am not sure why the values appear to be translated when I store them in a list:

Code:

for rownum in range(sh.nrows):
    Temp.append(sh.row_values(rownum))  

    print Temp

Output:

u'\u8bbe\u5168\u96c6\u662f\u5b9e\u6570\u96c6R\uff0cM= {x|-2<=x<=2}\uff0cN{x|x<1}\uff0c\u5219bar(M) nn N\u7b49\u4e8e

\n[A]\uff1a{x|x<-2}

[B]\uff1a {x|-2<1}

[C]\uff1a{x|x<1}

[D]\uff1a{x|-2<=x<1}'

However, when I print a single cell value, it is printed correctly, exactly as it appears in the Excel sheet:

Code:

 cell_test = sh.cell(1,3).value
 print cell_test

Output:

设全集是实数集R,M={x|-2<=x<=2}N={x|x<1},则bar(M) nn N等于

[A]:{x|x<-2}

[B]:{x|-2<1}

[C]:{x|x<1}

[D]:{x|-2<=x<1}

What should I do to get Python to store the above data at its original value?

Thanks!

Upvotes: 0

Views: 877

Answers (2)

Roman Bodnarchuk

Reputation: 29727

First, your XLS parser seems to return unicode values.

Second, when you print a container such as a list (as you do with print Temp), Python usually outputs the result of the repr function for each element of that object. And the repr of a unicode string looks something like u'\u8bbe\u5168\u96c6\u662f'.

Third, there is nothing wrong with how the values are stored. They are stored correctly; the only problem is how they are printed. Try something like:

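for row in Temp:       # each row is itself a list of cell values
    for cell in row:
        print cell     # a single unicode value is printed as readable text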
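
If you want the whole sheet as readable text rather than printing cell by cell, here is a minimal sketch. It assumes Temp is a list of rows, each row being a list as returned by sh.row_values. The names rows_to_text and sample are hypothetical, and sample only stands in for real sheet data. The code avoids print-statement syntax so it should run on Python 2.7 and Python 3 alike.

```python
def rows_to_text(rows):
    """Join every cell of every row into one readable unicode string."""
    lines = []
    for row in rows:
        # Format each cell individually, so the str-style form is used
        # instead of the repr-style form a list would produce.
        lines.append(u"\t".join(u"{}".format(cell) for cell in row))
    return u"\n".join(lines)


# Hypothetical stand-in for Temp (real data would come from sh.row_values).
sample = [[u"\u8bbe\u5168\u96c6", 1.0], [u"[A]\uff1a{x|x<-2}", 2.0]]

text = rows_to_text(sample)
print(text)  # shows the Chinese characters, given a UTF-8 capable terminal
```

The stored values are never modified here; only the way they are turned into text for display changes.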

Upvotes: 2

jfs

Reputation: 414395

The values should be the same. They are just displayed differently.

>>> s = u'o\ufb03ce'
>>> print s
oﬃce
>>> print [s]
[u'o\ufb03ce']
>>> print repr(s)
u'o\ufb03ce'
>>> print '\n'.join([s])
oﬃce

This example shows that when you print a list, its individual items are displayed using the repr() function, whereas a string printed directly is displayed as is (unicode strings are encoded to bytes first).
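
For readers on Python 3, the same distinction can be sketched as follows. This is an illustration under the assumption of Python 3, not part of the original session: there, repr() no longer escapes printable non-ASCII characters, so the built-in ascii() is used to obtain the escaped form. The variable names are only for illustration.

```python
s = "o\ufb03ce"        # same string as in the session above (4 code points)

as_text = str(s)        # the form print(s) would write: the characters themselves
as_escaped = ascii(s)   # escaped form, comparable to Python 2's repr() without the u prefix
as_list = ascii([s])    # a list formats its items repr-style, so the escape appears again

print(as_escaped)
print(as_list)
```

In every case the string object is the same; only its textual representation differs.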

Upvotes: 1
