Reputation: 12521
On an Ubuntu 12.04 VM (set up using Vagrant and the hashicorp/precise64 box), my locale says that the language is UTF-8 (LANG=en_US.UTF-8), but Python is getting a latin-1 environment.
Here's what I'm seeing:
vagrant@vagrant:~$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=en_US
vagrant@vagrant:~$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'\u1f41'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u1f41' in position 0: ordinal not in range(256)
How can I get a true UTF-8 system environment for Python?
Upvotes: 2
Views: 4006
Reputation: 133879
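The locale for LC_CTYPE ought to be en_US.UTF-8 in the locale output. Yours shows LC_ALL=en_US, which overrides LANG, and en_US without a codeset suffix normally resolves to ISO-8859-1 (latin-1), which is where Python takes its terminal encoding from. Try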
export LC_ALL="en_US.UTF-8"
and if that does not work (for instance, if LC_CTYPE is set explicitly), also:
export LC_CTYPE="en_US.UTF-8"
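To check whether the exports took effect, it may help to ask Python which encodings it derived from the environment. The following is only a minimal sketch: the helper names can_encode and report are made up for illustration, and it assumes the script is run from the same shell where the variables were exported.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import locale
import sys


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def report():
    """Print the encodings Python derived from the environment (hypothetical helper)."""
    # Derived from the locale settings (LC_ALL, then LC_CTYPE, then LANG).
    preferred = locale.getpreferredencoding()
    # May be None when stdout is not a terminal (for example, when piped).
    stdout_encoding = getattr(sys.stdout, "encoding", None)
    print("preferred encoding:", preferred)
    print("stdout encoding:", stdout_encoding)
    if stdout_encoding:
        # The character from the question; latin-1 cannot represent it.
        print("can print u'\\u1f41':", can_encode(u"\u1f41", stdout_encoding))


if __name__ == "__main__":
    report()
```

If both encodings come back as UTF-8 and the last line reports True, the print from the question should no longer raise UnicodeEncodeError.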
Upvotes: 4