Mirac7

Reputation: 1646

UnicodeEncodeError in python3

Some of my application's libraries depend on being able to print UTF-8 characters to stdout and stderr, so this must not fail:

print('\u2122')

On my local machine it works, but on my remote server it raises UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in position 0: ordinal not in range(128)

I tried running $ PYTHONIOENCODING=utf8 in the shell, with no apparent effect.

sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())

works for a while, then stalls and finally fails with ValueError: underlying buffer has been detached

sys.getdefaultencoding() returns 'utf-8', but sys.stdout.encoding returns 'ANSI_X3.4-1968' (the formal name for ASCII).

What can I do? I don't want to edit third-party libraries.

Upvotes: 6

Views: 5378

Answers (2)

Mirac7

Reputation: 1646

From @ShadowRanger's comment on my question:

PYTHONIOENCODING=utf8 won't work unless you export it (or prefix the Python launch with it). Otherwise, it's a local variable in bash that isn't inherited in the environment of child processes. export PYTHONIOENCODING=utf-8 would both set and export it in bash.
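To see the inheritance point for yourself, here is a minimal sketch (not from the original comment) that launches a child interpreter twice, once without PYTHONIOENCODING in its environment and once with it, and reports what the child sees as sys.stdout.encoding. The helper name child_stdout_encoding is hypothetical, made up for this illustration.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env=None):
    """Launch a child interpreter and return its sys.stdout.encoding.

    Hypothetical helper for illustration only.
    """
    # Start from the current environment, minus PYTHONIOENCODING, so the
    # child only sees the variable when we pass it explicitly.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONIOENCODING"}
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Without the variable the child falls back to whatever the locale implies.
without_var = child_stdout_encoding()
# With the variable present in the child's environment, it takes effect.
with_var = child_stdout_encoding({"PYTHONIOENCODING": "utf-8"})
print("without:", without_var)
print("with:   ", with_var)
```

A plain `PYTHONIOENCODING=utf8` line in bash corresponds to the first call: the variable exists in the shell but never reaches the child's environment. `export` (or prefixing the launch command) corresponds to the second.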

export PYTHONIOENCODING=utf-8 did the trick: UTF-8 characters no longer raise UnicodeEncodeError.
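If changing the environment isn't an option, one possible in-process alternative (my assumption, not something suggested in the comment) is TextIOWrapper.reconfigure(), available from Python 3.7. Unlike the detach() approach from the question, it changes the encoding of the existing stream object in place, so code that already holds a reference to it keeps working. The sketch below demonstrates it on an in-memory ASCII stream; force_utf8 is a hypothetical helper name.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place if it supports reconfigure().

    Hypothetical helper; requires Python 3.7+ for reconfigure().
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8")
    return stream


# Demonstrate on an in-memory stream that starts out ASCII-only,
# mimicking a stdout whose encoding was derived from LANG=C.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")

force_utf8(ascii_stream)
ascii_stream.write("\u2122")  # would raise UnicodeEncodeError under ASCII
ascii_stream.flush()

print(ascii_stream.encoding, raw.getvalue())
```

In a real application you would call force_utf8(sys.stdout) and force_utf8(sys.stderr) once at startup, before any third-party library prints anything.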

Upvotes: 6

ShadowRanger

Reputation: 155313

I'm guessing you're on a UNIX-like system, and your environment sets LANG (or LC_ALL, or another locale variable) to C.
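To check which variable is actually deciding the character encoding, here is a rough sketch that applies the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG, otherwise the C locale). The function name effective_ctype_locale is hypothetical, and this only inspects environment variables; it does not ask the C library what it really resolved.

```python
import os


def effective_ctype_locale(env=None):
    """Return (variable_name, value) for the setting that decides LC_CTYPE.

    Hypothetical helper: follows the POSIX precedence LC_ALL > LC_CTYPE > LANG
    and treats unset or empty variables as absent.
    """
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    # Nothing set: assume the default "C" locale, which implies ASCII.
    return None, "C"


name, value = effective_ctype_locale()
print("character encoding locale comes from", name, "=", value)
```

Note that if LC_ALL is set to C, it overrides LANG, so exporting LANG alone would not change anything in that case.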

Try editing your default shell's startup file to set LANG to something like en_US.utf-8 (or whatever locale makes sense for you). For example, in bash, edit ~/.bash_profile (or ~/.profile if you're using that instead for sh compatibility) and add:

export LANG="en_US.utf-8"

For (t)csh, edit ~/.cshrc (or ~/.tcshrc if that's what you're using) to add:

setenv LANG "en_US.utf-8"

Making the change "live" in an already-running session doesn't work, because your shell is likely hosted in a terminal that configured itself for ASCII-only display based on the LANG=C in effect when it was launched. Many terminals also coalesce sessions, so even if you changed LANG and then launched a new terminal, it would join the shared terminal process that still has the out-of-date LANG. So after you change ~/.bash_profile, log out and log back in, so that your root shell sets LANG correctly for every other process (they all ultimately fork from the root shell).

Upvotes: 0
