Reputation: 281
I am trying to read an Excel sheet using xlrd, but I'm having some problems storing Chinese characters.
I am not sure why the values appear changed when I store them in a list:
Code:
for rownum in range(sh.nrows):
    Temp.append(sh.row_values(rownum))
print Temp
Output:
u'\u8bbe\u5168\u96c6\u662f\u5b9e\u6570\u96c6R\uff0c
M= {x|-2<=x<=2}
\uff0cN{x|x<1}
\uff0c\u5219bar(M) nn N
\u7b49\u4e8e\n[A]\uff1a
{x|x<-2}
[B]\uff1a
{x|-2<1}
[C]\uff1a
{x|x<1}
[D]\uff1a
{x|-2<=x<1}
'
However, when I print a single cell value, it is printed correctly, exactly as in the Excel sheet:
Code:
cell_test = sh.cell(1,3).value
print cell_test
Output:
设全集是实数集R,
M={x|-2<=x<=2}
,N={x|x<1}
,则bar(M) nn N
等于[A]:
{x|x<-2}
[B]:
{x|-2<1}
[C]:
{x|x<1}
[D]:
{x|-2<=x<1}
What should I do to get Python to store the above data at its original value?
Thanks!
Upvotes: 0
Views: 877
Reputation: 29727
First, your XLS parser seems to return unicode values.
Second, when you do print some_complex_object (as you do with print Temp), Python usually outputs the result of the repr function on the elements of that object. And when you do print repr(some_unicode_string), the usual output is something like u'\u8bbe\u5168\u96c6\u662f'.
Third, there is nothing wrong with how the values are stored: they are stored correctly, you just have a problem with printing. Try something like:
for row in Temp:
    for cell in row:
        print cell
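A minimal, self-contained sketch of the same idea, written for Python 3 rather than the Python 2 used in the question. The list rows is a hard-coded stand-in for what sh.row_values() would return, and format_row is a hypothetical helper name, not part of xlrd:

```python
# Hypothetical stand-in for the rows collected with sh.row_values(rownum).
rows = [
    ["\u8bbe\u5168\u96c6", 1.0],
    ["\u5b9e\u6570\u96c6R", 2.0],
]


def format_row(row, sep="\t"):
    """Join the cells of one row into readable text instead of relying on repr()."""
    return sep.join(str(cell) for cell in row)


# Printing the joined text shows the characters themselves, not escape sequences.
for row in rows:
    print(format_row(row))
```

Joining the cells yourself means each value goes through str() rather than repr(), which is why the characters show up readably.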
Upvotes: 2
Reputation: 414395
The values should be the same. They are just displayed differently.
>>> s = u'o\ufb03ce'
>>> print s
office
>>> print [s]
[u'o\ufb03ce']
>>> print repr(s)
u'o\ufb03ce'
>>> print '\n'.join([s])
office
This example shows that when you print a list, the individual items are displayed using the repr() function, but a string printed on its own is displayed as is (unicode strings are encoded to bytes first).
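For readers on Python 3, here is a rough equivalent of the session above. It assumes Python 3, where repr() no longer escapes printable non-ASCII characters, so the built-in ascii() is used to reproduce the escaped form that Python 2 displayed:

```python
s = "o\ufb03ce"  # same string as in the session above
items = [s]

# The element stored in the list is exactly the value that was put in.
assert items[0] == s
assert len(items[0]) == 4  # 'o', the 'ffi' ligature, 'c', 'e'

# ascii() escapes non-ASCII characters, much like Python 2's repr() did.
escaped = ascii(items)
print(escaped)
```

The escaped text is only a display form; the stored string is unchanged.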
Upvotes: 1