Reputation: 29993
I have a word in Polish as a string variable which I need to print to a file:
# coding: utf-8
a = 'ilośc'
with open('test.txt', 'w') as f:
    print(a, file=f)
This throws
Traceback (most recent call last):
  File "C:/scratches/scratch_3.py", line 5, in <module>
    print(a, file=f)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 3: character maps to <undefined>
After looking through existing answers (some using .decode("utf-8"), some using .encode("utf-8")) and trying various incantations, I finally managed to get the file created.
Unfortunately, what was written was b'ilośc' and not ilośc. When I tried to decode that before printing to the file, I got back to the initial error and the same traceback.
How do I write a str containing diacritics to a file so that it is stored as text and not as a bytes representation?
Reputation: 414405
The traceback says that you are trying to save the 'ś' ('\u015b') character using the cp1252 encoding. When open() is called without an encoding argument, the default is locale.getpreferredencoding(False), which here is your Windows ANSI code page. cp1252 can't represent this character: Unicode has more than a million code points, while cp1252 is a single-byte encoding that can represent at most 256 characters.
Use a character encoding that can represent the desired characters:
with open(filename, 'w', encoding='utf-16') as file:
    print('ilośc', file=file)
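As a rough illustration of the point above, the sketch below first shows that cp1252 rejects the word, then writes it with an explicit encoding and reads it back. It uses UTF-8 rather than the UTF-16 shown in the answer; that choice, the temporary directory, and the variable names are assumptions made for the example only.

```python
import os
import tempfile

word = 'ilośc'

# cp1252 has no byte for 'ś' (U+015B), so encoding fails as in the traceback.
try:
    word.encode('cp1252')
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can encode every Unicode code point; 'ś' becomes two bytes.
utf8_bytes = word.encode('utf-8')

# Write with an explicit encoding, then read back with the same encoding.
# The temporary directory is a hypothetical scratch location for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w', encoding='utf-8') as f:
    print(word, file=f)

with open(path, encoding='utf-8') as f:
    round_trip = f.read()
```

Whatever encoding is used for writing has to be passed again when reading the file; otherwise the locale default is applied and the text may come back garbled or raise a decoding error.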
Reputation: 2174
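Pass an explicit encoding when opening the file; without it, Windows falls back to cp1252 and the write fails in the same way:
a = 'ilośc'
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write(a)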
You can also write to the file in binary mode and encode the string yourself:
a = 'ilośc'
with open('test.txt', 'wb') as f:
    f.write(a.encode('utf-8'))
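The b'...' output described in the question most likely came from printing a bytes object to a text-mode file, which writes the object's repr rather than its contents. The following is a minimal sketch of that difference, using in-memory buffers (an assumption made here to avoid touching the disk) in place of real files.

```python
import io

word = 'ilośc'
encoded = word.encode('utf-8')

# Printing a bytes object to a text stream writes its repr,
# including the b'' wrapper, not the characters themselves.
text_buffer = io.StringIO()
print(encoded, file=text_buffer)
printed = text_buffer.getvalue()

# Writing the same bytes to a binary stream stores the raw UTF-8 bytes,
# which decode back to the original string.
binary_buffer = io.BytesIO()
binary_buffer.write(encoded)
decoded = binary_buffer.getvalue().decode('utf-8')
```

So bytes belong in a file opened with 'wb', and str belongs in a file opened with 'w' plus an explicit encoding; mixing the two is what produces the unwanted representation.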