Shahriar

Reputation: 13804

Printing Unicode for Bengali

I'm using goslate for the Google Translate API.

I can translate Bengali to English:

>>> import goslate

>>> gs = goslate.Goslate()
>>> S = gs.translate("ভাল", 'en')
>>> S

good

But a problem arises when I want to translate English to Bengali.

>>> import goslate

>>> gs = goslate.Goslate()
>>> S = gs.translate("good", 'bn')
>>> S

Error:

return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>

What should I do?

print repr(S)
output: u'\u09ad\u09be\u09b2'

print("ভাল")
output: ভাল

print(u"ভাল") # this gives UnicodeEncodeError

Upvotes: 2

Views: 4042

Answers (2)

ForceBru

Reputation: 44878

This works for me

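# coding: utf-8
# Python 2 only. site.py removes sys.setdefaultencoding at startup,
# so reload(sys) is needed to get it back.
import sys
reload(sys)

d = sys.getdefaultencoding()
if d != "utf-8":
    sys.setdefaultencoding('utf-8')
st = "ভাল"  # a UTF-8 byte string in Python 2
f = open('test.txt', 'w')
# encode() on a byte string first decodes it with the default encoding set above
f.write(st.encode('utf-8'))
f.close()
# restore the original default encoding
if d != "utf-8":
    sys.setdefaultencoding(d)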

This writes "ভাল" to test.txt as expected. print st.encode('utf-8') works too.

Upvotes: 1

jfs

Reputation: 414605

It is definitely unrelated to goslate. Your issue is getting print u'\u09ad\u09be\u09b2' to work when the console's character encoding can't represent the Unicode characters.
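To check this on a given machine, something like the sketch below may help. It only tests whether the string from the question can be encoded with a particular codec. The helper name encodable is made up for illustration, and cp1252 is just an assumed example of a legacy Windows code page; the actual console encoding may differ.

```python
import sys

text = u'\u09ad\u09be\u09b2'  # the Bengali string from the question


def encodable(s, encoding):
    """Return True if every character of s can be encoded with `encoding`."""
    try:
        s.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# The encoding print uses for the console. It can be None when output is
# redirected (Python 2), so fall back to 'ascii' in that case.
console_encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
print('stdout encoding: %s' % console_encoding)
print('text is printable here: %s' % encodable(text, console_encoding))

# cp1252 is an assumed stand-in for a legacy Windows code page.
cp1252_ok = encodable(text, 'cp1252')
utf8_ok = encodable(text, 'utf-8')
```

If the second line of output says False, print will raise UnicodeEncodeError for this text regardless of where the string came from.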

You either need to change the encoding to one that can represent the Unicode characters, such as utf-8, or use a Unicode API such as WriteConsoleW, assuming you are on Windows. If you are not on Windows, just configure your environment to use utf-8.
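As a rough illustration of the first option, the sketch below writes the text through a stream whose encoding is set to utf-8. The in-memory BytesIO is only a stand-in for the console's byte stream (an assumption made so the example runs anywhere); on Python 3.7+ a real console stream can often be switched with sys.stdout.reconfigure(encoding='utf-8') instead.

```python
import io

text = u'\u09ad\u09be\u09b2'  # the Bengali string from the question

# BytesIO stands in for the console's underlying byte stream (demo assumption).
raw = io.BytesIO()

# A text layer that encodes everything written to it as UTF-8.
out = io.TextIOWrapper(raw, encoding='utf-8')
out.write(text)
out.flush()  # push the encoded bytes down to the underlying stream

written = raw.getvalue()
```

Each of the three characters becomes three bytes in UTF-8, so written holds nine bytes and no UnicodeEncodeError is raised.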

WriteConsoleW usage is complicated, though there is a simple-to-use win_unicode_console package on Python 3. The latter link also shows how to save the printed Unicode text to a file (print Unicode, set PYTHONIOENCODING).
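If the goal is only to get the text into a file, a minimal sketch that avoids the console encoding entirely is to open the file with an explicit encoding. The file name and temporary directory here are hypothetical; io.open works the same way on Python 2 and Python 3.

```python
import io
import os
import tempfile

text = u'\u09ad\u09be\u09b2'  # the Bengali string from the question

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'bengali.txt')

# The explicit encoding means the console code page is never involved.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Read the file back with the same encoding to confirm the round trip.
with io.open(path, 'r', encoding='utf-8') as f:
    round_tripped = f.read()
```

Setting PYTHONIOENCODING=utf-8 in the environment before running a script is the other route mentioned above; it changes the encoding used for stdout, which matters when print output is redirected to a file.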

Upvotes: 0
