Reputation: 357
I'm using Sublime Text 3 for Python coding, and I have some problems with Cyrillic encoding.
At first I couldn't even build (run) any file with Cyrillic text in it. But I found a solution, which was to set up the build config as follows:
[cmd: ['python3', '-u', '-c', "import sys; import codecs; sys.stdout = codecs.getwriter( 'utf-8' )( sys.stdout.detach() ); exec( compile( open( r'/.../ducksearch.py', 'rb' ).read(), r'/.../ducksearch.py', 'exec'), globals(), locals() )"]]
[dir: /.../crowler]
[path: /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin]
So now it works for me: it runs .py files containing Cyrillic strings fine. But when I try to write Cyrillic text to a file, it fails again with this message:
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 197: ordinal not in range(128)
Meanwhile, the same script runs fine both from the python3 command line and in the IPython environment. So it seems that the problem is in the Sublime build system or in my config for it. Could you please tell me what I should do to make it work?
Here's my code:
utf_8_text = html.unescape(html_entities)
print(utf_8_text)
fi = open('./tmp/tmp.html', 'w')
try:
    fi.write(utf_8_text)
except Exception as e:
    raise e
finally:
    fi.close()
Here's an example of the input text:
Культура, по мнению Ерофеева, есть дистанция между человеком, таким как он есть, и тем образом, в котором он себя видит. Неадекватность - причина смеха и слез, иронии итрагедии, определяющая ход истории, человеческое существование. В новой книге Ерофеева мир человека, культура, литература
The original input is actually in HTML entities, not Cyrillic; Stack Overflow converts it when rendering, like so:
Культура, по мнению Ерофеева, есть дистанция между человеком, таким как он есть, и тем образом, в котором он себя видит. Неадекватность - причина смеха и слез, иронии итрагедии, определяющая ход истории, человеческое существование. В новой книге Ерофеева мир человека, культура, литература
Upvotes: 0
Views: 311
Reputation: 357
I've found the solution. The problem was that no Russian locale was enabled in Sublime Text's Python environment. My build config for Python now looks as follows (note that the awkward arguments to the interpreter call are gone). Everything works now, both console output and writing to a file.
{
    "cmd": ["python3", "-u", "$file"],
    "env": {"LANG": "ru_RU.UTF-8"},
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python"
}
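As a side note, here is a minimal sketch of why `LANG` matters and how the script itself could avoid depending on it. When `open()` is called in text mode without an `encoding` argument, Python uses the locale's preferred encoding, which can end up being ASCII when the build environment sets no locale. Passing `encoding="utf-8"` explicitly sidesteps that. The helper names `describe_encodings` and `write_utf8` and the demo file name are made up for illustration and are not part of the original script.

```python
import locale
import os
import sys
import tempfile


def describe_encodings():
    """Return the encodings Python picked up from the environment."""
    return {
        # Encoding used by print() for console output.
        "stdout": sys.stdout.encoding,
        # Default used by open() in text mode when no encoding is given.
        "preferred": locale.getpreferredencoding(False),
    }


def write_utf8(path, text):
    """Write text as UTF-8 regardless of the LANG / LC_* settings."""
    # The explicit encoding means the locale default is never consulted.
    with open(path, "w", encoding="utf-8") as fi:
        fi.write(text)


if __name__ == "__main__":
    # Under a build system with no locale set, these may report an
    # ASCII-based encoding rather than UTF-8.
    print(describe_encodings())

    # Hypothetical demo file in the system temp directory.
    target = os.path.join(tempfile.gettempdir(), "tmp_utf8_demo.html")
    sample = "Культура \u2019"
    write_utf8(target, sample)

    # Read it back with the same explicit encoding to confirm the round trip.
    with open(target, encoding="utf-8") as fi:
        assert fi.read() == sample
```

Running `describe_encodings()` from the Sublime build and from a regular terminal should show whether the two environments really differ. The explicit `encoding` only covers the file write; console output from `print()` still depends on the environment, so the `env` setting in the build config remains useful.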
Upvotes: 1