Unicode issue with makemessages --all Django 1.6.2 Python 3.3

Question

Upgrading a project from Python 2.7 -> 3.3.1 and Django 1.4 -> 1.6.2.

After updating code our application is running again (in py3).
Translations are being pulled from .mo files.

The only issue is that our old .po files cannot be used with

django-admin.py makemessages -a

It displays a lovely

UnicodeDecodeError: 'ascii' codec can't decode byte...

We can run makemessages for the first time and get the skeleton files. Once we add any translations that are non-ASCII (ǹ, è, etc.) to the msgstr values, makemessages fails to complete.
(If we run makemessages with higher verbosity, templates with any non-ASCII characters are skipped.)

I've found bug reports for a similar issue, but they date back to the 1.3.x versions; I found nothing for the versions above.

Update, more info:

Here's where the exception is happening:
../python3.3/subprocess.py line 847

def _translate_newlines(self, data, encoding):
    data = data.decode(encoding)
    return data.replace("\r\n", "\n").replace("\r", "\n")

The encoding value is ANSI_X3.4-1968. I've saved the template files as UTF-8, along with the .po file(s).

Here is the .po header (just the skeleton created automatically by makemessages):

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2014-02-28 22:42+0000\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"

This worked before (same files) under Python 2.7 and Django 1.5

Update #2

Created a new naked project (django-admin.py startproject blah)
Enabled i18n, etc.
Created a single translation (just in settings.py)
Ran `makemessages -l de`
Py2.7 (#python manage.py makemessages -a) works as expected
Py3.3 (#python3 manage.py makemessages -a) fails

Possibly filing a bug, will update.

Justin · Accepted Answer

This threw me off as everything was fine with Py2, but not Py3, so I assumed that's where the problem would be.

The issue was in part because I'm using Docker and running makemessages from within the container, where the bash session didn't have a locale set to anything in particular.

Things I tried:

Saving files with UTF-8 (with and without BOM)
Ensuring I had UTF-8 in the header of the .po files
Creating a new blank slate project
Re-creating all .po files using Py3 (as they were originally created with Py2)

The top exception was being thrown in subprocess.py on line 847:

def _translate_newlines(self, data, encoding):
    data = data.decode(encoding)
    return data.replace("\r\n", "\n").replace("\r", "\n")

The encoding being passed in was ANSI_X3.4-1968, which was weird as I had saved the files as UTF-8, etc. (It was being set to ANSI_X3.4-1968 because my bash session did not have a locale set.)
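
For anyone who wants to see the failure in isolation, here is a minimal sketch. It assumes the behaviour described above: the bytes coming back from the gettext tools are UTF-8, while the encoding handed to the decode step comes from the environment locale. `translate_newlines` is a standalone copy of the method quoted above and `sample` is a made-up line, neither is Django code.

```python
import locale


def translate_newlines(data, encoding):
    """Standalone copy of the decode-then-normalize step quoted above."""
    data = data.decode(encoding)
    return data.replace("\r\n", "\n").replace("\r", "\n")


# A made-up .po-style line with a non-ASCII msgstr, encoded as UTF-8 bytes.
sample = 'msgstr "è"\r\n'.encode("utf-8")

# What Python derives from the environment. In my container (no locale set,
# Python 3.3) this was ANSI_X3.4-1968, which is just another name for ASCII.
print("preferred encoding:", locale.getpreferredencoding(False))

# Decoding UTF-8 bytes with the ASCII alias reproduces the error.
try:
    translate_newlines(sample, "ANSI_X3.4-1968")
except UnicodeDecodeError as exc:
    print("ASCII decode fails:", exc)

# The same bytes decode fine once the encoding is UTF-8.
print(repr(translate_newlines(sample, "utf-8")))
```

The file's own encoding and the .po Content-Type header never enter into this step; only the `encoding` argument matters.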

Answer
In my Docker container I had no locale settings set in the terminal, so they were:

# locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
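
In that output the quoted values such as LC_CTYPE="POSIX" are implied defaults, not variables that were actually set. A rough sketch of the usual POSIX precedence for the character encoding category (LC_ALL, then LC_CTYPE, then LANG) is below. `effective_ctype` is a hypothetical helper for illustration only; it does not ask the C library, it just looks at the variables.

```python
import os


def effective_ctype(env):
    """Return the first non-empty of LC_ALL, LC_CTYPE, LANG, else "POSIX"."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty both count as "not set"
            return value
    return "POSIX"


# The current environment, whatever it happens to be.
print(effective_ctype(os.environ))

# An environment like the container above: variables present but empty.
print(effective_ctype({"LANG": "", "LC_ALL": ""}))

# With LANG set, that value is used when LC_ALL and LC_CTYPE are not set.
print(effective_ctype({"LANG": "C.UTF-8"}))
```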

These were my available locales (missing my specific locale, en_US.UTF-8, but as long as it's UTF-8 I'm okay):

# locale -a
C
C.UTF-8
POSIX

Placed this in ~/.bashrc:

export LC_ALL=C.UTF-8
export LANG=C.UTF-8
export LANGUAGE=C.UTF-8

And now I get UTF-8 as the encoding within subprocess.py, and everything works with Py3/Django 1.6 =)

Long story short, I was caught off guard that Django/subprocess.py uses the environment locale, and not the encoding of the file or the Content-Type in the .po header.
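
If editing ~/.bashrc is not an option (for example a one-off command in a container), the same variables can probably be set for a single invocation instead. This is only a sketch under that assumption: `utf8_env` and `run_makemessages` are hypothetical helper names, `manage.py` is assumed to be in the current directory, and C.UTF-8 must be in the `locale -a` list.

```python
import os
import subprocess
import sys


def utf8_env(base=None):
    """Copy an environment and set the same three variables as in ~/.bashrc."""
    env = dict(os.environ if base is None else base)
    env.update({
        "LC_ALL": "C.UTF-8",
        "LANG": "C.UTF-8",
        "LANGUAGE": "C.UTF-8",
    })
    return env


def run_makemessages(manage_py="manage.py"):
    """Run `manage.py makemessages -a` in a child process with a UTF-8 locale.

    The locale has to be set for the Python process that runs makemessages,
    so the override is applied to the child's environment here.
    """
    cmd = [sys.executable, manage_py, "makemessages", "-a"]
    return subprocess.call(cmd, env=utf8_env())


# Show the override without running anything.
print(utf8_env({"LANG": ""})["LC_ALL"])
```

Calling `run_makemessages()` from the project directory returns the exit code of the command. The copy in `utf8_env` leaves the parent environment untouched.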
