Reputation: 671
--- update ---
I think this console log pins down the issue, but it's still not clear how to fix it:
>>> workbook = openpyxl.load_workbook('data.xlsx')
>>> worksheet = workbook.active
>>> worksheet['A2'].value
u'\u041c\u0435\u0448\u043e\u043a \u0434\u0435\u043d\u0435\u0433'
>>> print worksheet['A2'].value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4: character maps to <undefined>
--- end update ---
I'm trying to print the values of some .xlsx cells using openpyxl:
import openpyxl
workbook = openpyxl.load_workbook(filename='puzzles.xlsx')
worksheet = workbook.active
for row in worksheet.iter_rows('A2:K5'):
    print row[0].value
Which results in the following error:
Traceback (most recent call last):
  File "xls_import.py", line 8, in <module>
    print row[0].value
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4: character maps to <undefined>
As far as I know, XLSX is encoded as UTF-8. However, decoding the value explicitly:
print row[0].value.decode('utf-8')
does not help either:
Traceback (most recent call last):
  File "xls_import.py", line 8, in <module>
    print row[0].value.decode('utf-8')
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)
Any suggestions?
I'm running Python 2.7 and openpyxl 2.2.5.
Upvotes: 1
Views: 2658
Reputation: 19557
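openpyxl returns unicode strings (the XML inside the workbook is encoded in UTF-8, but it is already decoded by the time you get the cell value), so you don't need to decode them. Decoding goes from an encoding to unicode; what you need is the opposite direction: encode the strings in an encoding of your choice before printing. Calling .decode() on a unicode string in Python 2 first tries to encode it implicitly with the ascii codec, which is why your second attempt fails with an 'ascii' UnicodeEncodeError.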
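
A minimal sketch of the encoding step, using the same text as the A2 cell in the question. The helper name encode_for_console, the 'replace' error handler, and the fallback to UTF-8 when the console encoding is unknown are my own assumptions for illustration, not part of openpyxl:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import sys


def encode_for_console(text, encoding=None):
    """Encode a unicode string to bytes, replacing unmappable characters.

    Hypothetical helper: 'encoding' defaults to the console's encoding,
    falling back to UTF-8 when that is not available (e.g. piped output).
    """
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    # 'replace' substitutes '?' for characters the target code page lacks,
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, "replace")


# Same unicode value as worksheet['A2'].value in the question.
value = "\u041c\u0435\u0448\u043e\u043a \u0434\u0435\u043d\u0435\u0433"

# cp437 has no Cyrillic letters, so they are replaced but nothing is raised.
as_cp437 = encode_for_console(value, "cp437")

# UTF-8 can represent the text without loss.
as_utf8 = encode_for_console(value, "utf-8")
```

In the Python 2.7 session from the question, print encode_for_console(worksheet['A2'].value) should then run without raising. With cp437 the Cyrillic letters come out as question marks, so to see the real text you would need a console code page that actually contains those characters.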
Upvotes: 1