Chuck Aguilar

Reputation: 2048

UTF-8 decoding error only in Docker container

This error is driving me crazy.

I have this byte-string:

b'AXX 4 U I T T u N G AXK\nGlobus Fachm\xc3\xa4rkte GmbH & Co.KG\n\nBaumarkt 65719 Hofheim\n\nNordring 5\xe2\x80\x949\nTel :06192\xe2\x80\x94959680 Fax:95968444\n94\n\n#373\n2 SCHNITZEL 6.30 1\nSumme 6.30\nBAR 7.00\nR\xc3\xbcckgeld EUR 0.70\nMwSt Brutto 6.30\n19.00% MwSt Il 1.01\n\nNetto 5 29\n\n12000094 MW,\n\nVielen Dank f\xc3\xbcr Ihren Einkauf\nG\xc3\xbcltig zur Vorlage beim Finanzamt\nUST . I0\xe2\x80\x94Nummer : DE163646243\nAchtung: Bon vor W\xc3\xa4rme, N\xc3\xa4sse und\nSonneneinstrahlung sch\xc3\xbctzen.\n\nKasse/Bon Datum=lieferdatum Kassierer\n1619 12.01.17 12:45 56\n\x0c'

If I decode and print it locally:

print(g.decode('utf-8').strip())

I get this:

AXX 4 U I T T u N G AXK
Globus Fachmärkte GmbH & Co.KG
Baumarkt 65719 Hofheim
Nordring 5—9
Tel :06192—959680 Fax:95968444
94
#373
2 SCHNITZEL 6.30 1
Summe 6.30
BAR 7.00
Rückgeld EUR 0.70
MwSt Brutto 6.30
19.00% MwSt Il 1.01
Netto 5 29
12000094 MW,
Vielen Dank für Ihren Einkauf
Gültig zur Vorlage beim Finanzamt
UST . I0—Nummer : DE163646243
Achtung: Bon vor Wärme, Nässe und
Sonneneinstrahlung schützen.
Kasse/Bon Datum=lieferdatum Kassierer
1619 12.01.17 12:45 56

But if I run exactly the same code, print(g.decode('utf-8').strip()), on the server inside a Docker container, I get this error:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "./main.py", line 140, in get_text_aux2
    text_blank = chuck_tesseract.image_to_string(blank_image, lang='deu', mode='1')
  File "./chuck_tesseract.py", line 137, in image_to_string
    print(g.decode('utf-8').strip())
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 36: ordinal not in range(128)

This doesn't make any sense to me.

Maybe the problem is related to the thread, or maybe I have to configure something in the Docker container.
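
One way to narrow this down is to compare what Python reports as its output encoding in the two environments. The following is a minimal sketch using only the standard library; the helper name `describe_io_encoding` is made up for illustration and is not part of the original code.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the settings that influence how print() encodes text."""
    return {
        # Encoding print() uses when writing str objects to stdout.
        "stdout_encoding": sys.stdout.encoding,
        # Encoding Python derives from the current locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Environment variables that commonly differ between host and container.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


if __name__ == "__main__":
    for key, value in describe_io_encoding().items():
        print("{}: {}".format(key, value))
```

If the container reports an ASCII encoding for stdout where the local machine reports UTF-8, the difference most likely lies in the environment rather than in the thread.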

I read this issue, but it didn't help.

Maybe someone has had the same problem and can help me.

Upvotes: 0

Views: 2166

Answers (1)

Chuck Aguilar

Reputation: 2048

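Yilun Zhang told me the answer. The decode itself succeeds; the traceback is a UnicodeEncodeError raised by print(), because the container had no UTF-8 locale configured and Python fell back to ASCII for stdout. Adding this to the Dockerfile fixed it: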

RUN apt-get install locales...

RUN locale-gen de_DE.UTF-8

COPY ./default_locale /etc/default/locale
RUN chmod 0755 /etc/default/locale

ENV PYTHONIOENCODING=utf-8
ENV LC_ALL=de_DE.UTF-8
ENV LANG=de_DE.UTF-8
ENV LANGUAGE=de_DE.UTF-8

and default_locale:

LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL=de_DE.UTF-8

and it's working perfectly.
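
To illustrate why the locale matters, here is a small sketch that reproduces the failure without Docker. It assumes that print() on an ASCII-only stdout behaves like an explicit `encode('ascii')`; the names `raw`, `sink` and `out` are illustrative, and `io.BytesIO` merely stands in for `sys.stdout.buffer`.

```python
import io

# First two lines of the byte string from the question.
raw = b"AXX 4 U I T T u N G AXK\nGlobus Fachm\xc3\xa4rkte GmbH & Co.KG\n"

# Decoding the UTF-8 bytes works in every environment.
text = raw.decode("utf-8")

# Encoding to ASCII is the step that fails when stdout is ASCII-only.
error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = str(exc)

# A text stream with an explicit encoding does not depend on the locale.
sink = io.BytesIO()
out = io.TextIOWrapper(sink, encoding="utf-8")
print(text.strip(), file=out)
out.flush()
```

Here `error` holds the same "'ascii' codec can't encode character" message as the traceback, while the explicitly UTF-8 stream accepts the text. Fixing the locale in the image, as above, is probably the cleaner option, because it also covers output from code that cannot be changed.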

Upvotes: 1
