Reputation: 325
I've created a program that prints out some HTML content. My source file is in UTF-8, the server's terminal is in UTF-8, and I also use:
out = out.encode('utf8')
to make sure the string is in UTF-8. Despite all that, when the string out contains characters like "ã" or "é", I get:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe3' in position 84: ordinal not in range(128)
It seems to me that the print after:
print("Content-Type: text/html; charset=utf-8 \n\n")
is being forced to use ASCII encoding, but I don't know why this would be the case.
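The error can be reproduced without the server. This is only a minimal sketch, assuming the process's standard output behaves like a text stream whose encoding is ASCII; the name `fake_stdout` and the sample string are made up for illustration:

```python
import io

# Stand-in for a stdout whose encoding is ASCII (an assumption about the server environment).
fake_stdout = io.TextIOWrapper(io.BytesIO(), encoding="ascii")

out = "Olá, café"  # sample text, not the original page

try:
    print(out, file=fake_stdout)
    fake_stdout.flush()
    error = None
except UnicodeEncodeError as exc:
    # Same exception type as in the traceback above.
    error = exc

print(type(error).__name__)
```

If the exception shows up here, the problem lies in the encoding of the output stream rather than in the source file or the string itself.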
Upvotes: 0
Views: 2849
Reputation: 325
Thanks a lot.
Here is how I solved the encoding problem with Python 3.4.1. First, I inserted this line in the code to check the output encoding:
print(sys.stdout.encoding)
And I saw that the output encoding was:
ANSI_X3.4-1968
which stands for ASCII and doesn't support characters like 'ã' or 'é'.
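A quick way to confirm that this name really maps to ASCII is to ask the codec registry. A small sketch; the variable names are only illustrative:

```python
import codecs

# Ask the codec registry which codec this name resolves to.
info = codecs.lookup("ANSI_X3.4-1968")
canonical_name = info.name
print(canonical_name)

# Characters outside the 7-bit range cannot be encoded with this codec.
try:
    "ã".encode("ANSI_X3.4-1968")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```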
So I deleted that diagnostic print and inserted these lines to change the standard output encoding:
import codecs
import sys
# Re-wrap the underlying byte streams only when they are not already UTF-8.
if sys.stdout.encoding != 'UTF-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'UTF-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')
Here is where I found the information:
http://www.macfreek.nl/memory/Encoding_of_Python_stdout
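For illustration, the same wrapping can be tried against an in-memory byte buffer instead of the real `sys.stdout.buffer`. This is only a sketch: the helper name `utf8_writer` and the sample HTML are assumptions, not part of my actual script.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that written str objects are encoded as UTF-8."""
    return codecs.getwriter("utf-8")(binary_stream, "strict")


# io.BytesIO stands in for sys.stdout.buffer here (an assumption for the demo).
buffer = io.BytesIO()
writer = utf8_writer(buffer)

# CGI-style header followed by a blank line, then the body.
writer.write("Content-Type: text/html; charset=utf-8\n\n")
writer.write("<p>ã é</p>\n")

raw = buffer.getvalue()
print(raw)
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` should be another way to get a similar effect; it was not available in 3.4.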
P.S.: Everybody says it's not good practice to change the default encoding. I really don't know about that. It has worked fine for me, but I'm building a very small and simple web app.
Upvotes: 4
Reputation: 16753
I guess you should open the file with an explicit encoding and work with unicode objects; that way you might not need to encode it yourself.
import codecs
file = codecs.open('file.html', 'w', 'utf-8')
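In Python 3 the built-in `open` also accepts an `encoding` argument, so `codecs.open` is not strictly needed. A small sketch, assuming a temporary file stands in for `file.html` and the content is made up:

```python
import os
import tempfile

html = "<p>ã é</p>"  # sample content, not the original page

# A temporary directory keeps the demo from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.html")

    # Text goes in as str; the file object encodes it to UTF-8 on write.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(html)

    # Reading the raw bytes back shows what was actually stored on disk.
    with open(path, "rb") as handle:
        raw = handle.read()

print(raw)
```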
Upvotes: 3