Alexandre Cavalcante

Reputation: 325

Encoding problems in Python - 'ascii' codec can't encode character '\xe3' when using UTF-8

I've created a program to print out some HTML content. My source file is in UTF-8, the server's terminal is in UTF-8, and I also use:

out = out.encode('utf8')

to make sure the string is in UTF-8. Despite all that, when I use characters like "ã" or "é" in the string out, I get:

UnicodeEncodeError: 'ascii' codec can't encode character '\xe3' in position 84: ordinal not in range(128)

It seems to me that any print after:

print("Content-Type: text/html; charset=utf-8 \n\n")

is being forced to use ASCII encoding, but I don't know why that would be the case.
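For reference, the error can be reproduced without a terminal. This is a minimal sketch, assuming the cause is an output stream that encodes text as ASCII; the in-memory `io.BytesIO` buffer and the helper name `write_to_ascii_stream` are stand-ins introduced for illustration and are not part of the original program.

```python
import io

def write_to_ascii_stream(text):
    """Write text to an in-memory text stream that encodes as ASCII.

    Hypothetical stand-in for a stdout whose encoding is ASCII.
    Returns (encoding, reason) if encoding fails, otherwise None.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding='ascii')
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc.encoding, exc.reason
    return None

result = write_to_ascii_stream('\xe3')  # '\xe3' is the character 'ã'
print(result)
```

Plain ASCII text passes through the same stream without an error, which is consistent with the failure showing up only for characters such as "ã" and "é".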

Upvotes: 0

Views: 2849

Answers (2)

Alexandre Cavalcante

Reputation: 325

Thanks a lot.

Here is how I solved the encoding problem with Python 3.4.1. First, I inserted this line in the code to check the output encoding:

import sys
print(sys.stdout.encoding)

And I saw that the output encoding was:

ANSI_X3.4-1968

which is another name for ASCII and doesn't support characters like 'ã', 'é', etc.

So I deleted the previous line and inserted the following lines to change the standard output encoding:

import codecs
import sys

# Rewrap stdout and stderr so that text is encoded as UTF-8 instead of ASCII.
if sys.stdout.encoding != 'UTF-8':
    sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'UTF-8':
    sys.stderr = codecs.getwriter('utf-8')(sys.stderr.buffer, 'strict')
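The effect of this wrapper can be checked against an in-memory buffer. This is only a sketch: `io.BytesIO` stands in for `sys.stdout.buffer`, and `encode_via_writer` is a hypothetical helper name used for illustration.

```python
import codecs
import io

def encode_via_writer(text):
    """Write text through a UTF-8 StreamWriter and return the raw bytes produced."""
    buffer = io.BytesIO()  # stand-in for sys.stdout.buffer
    writer = codecs.getwriter('utf-8')(buffer, 'strict')
    writer.write(text)
    return buffer.getvalue()

# Print the bytes representation, which is safe on any terminal encoding.
print(encode_via_writer('\xe3'))  # '\xe3' is the character 'ã'
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` is another way to get a similar result; it was not available in 3.4.1.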

Here is where I found the information:

http://www.macfreek.nl/memory/Encoding_of_Python_stdout

P.S.: Everybody says it's not good practice to change the default encoding. I really don't know about that. It has worked fine in my case, but I'm building a very small and simple web app.

Upvotes: 4

hspandher

Reputation: 16753

I guess you should read the file as a Unicode object; that way you might not need to encode it.

import codecs
# Open for reading ('r'); opening with 'w' would truncate the file instead.
file = codecs.open('file.html', 'r', 'utf-8')
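In Python 3 the built-in `open` accepts an `encoding` argument, so the same idea can be sketched without `codecs`. The temporary file and the helper name `read_utf8` below are assumptions made for the demonstration, not part of the question's program.

```python
import os
import tempfile

def read_utf8(path):
    """Read a file as text, decoding its bytes as UTF-8."""
    with open(path, 'r', encoding='utf-8') as handle:
        return handle.read()

# Demonstration with a throwaway file, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.html')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('<p>\xe3 \xe9</p>')  # '<p>ã é</p>'
    content = read_utf8(path)

print(type(content).__name__)
```

The result is a `str`, so it does not need a further `.encode()` call before printing. Printing it still depends on the output stream's encoding.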

Upvotes: 3
