fancyzero

Reputation: 21

python, and unicode stderr

I use an anonymous pipe to capture all stdout and stderr output and then print it into a rich edit control. This works fine when I use wsprintf, but Python emits multibyte strings, which is really annoying. How can I convert all of this output to Unicode?

UPDATE 2010-01-03:

Thank you for the reply, but it seems str.encode() only works for print xxx statements. If an error occurs during py_runxxx(), my redirected stderr captures the error message as a multibyte string. Is there a way to make Python output its messages as Unicode? There seems to be a workable solution in this post.

I'll try it later.

Upvotes: 2

Views: 3164

Answers (3)

Antoine P.

Reputation: 4315

wsprintf?

This seems to be a "C/C++" question rather than a Python question.

The Python interpreter always writes bytestrings to stdout/stderr, never unicode (or "wide") strings. This means Python first encodes all unicode data using the current encoding (likely sys.getdefaultencoding() when the output is redirected to a pipe).

If you want to get at stdout/stderr as unicode data, you must decode the captured bytes yourself using the right encoding.

Your favourite C/C++ library certainly has what it takes to do that.
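As a rough illustration of "decode it yourself", here is a minimal sketch in Python 3 (an assumption; the thread itself targets Python 2 and a C/C++ host). The helper name run_and_decode is hypothetical. It runs a child interpreter through pipes, pins the child's stream encoding with the PYTHONIOENCODING environment variable, and decodes the captured bytes with that same encoding. A C/C++ host would do the equivalent decoding step with its own library.

```python
import os
import subprocess
import sys


def run_and_decode(code, encoding="utf-8"):
    """Run `code` in a child interpreter; return (stdout, stderr) as text."""
    # PYTHONIOENCODING tells the child which encoding to use for its
    # standard streams, so the reader knows how to decode the bytes.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
    )
    raw_out, raw_err = proc.communicate()  # both are byte strings
    # Decode with the same encoding the child used; undecodable bytes
    # are replaced rather than raising.
    return raw_out.decode(encoding, "replace"), raw_err.decode(encoding, "replace")


# The child prints a non-ASCII string, then fails with a non-ASCII message.
out, err = run_and_decode("print('caf\\u00e9'); raise ValueError('\\u00e9chec')")
```

Both the normal output and the traceback arrive as bytes and become text only after the explicit decode, which is the step the host program has to perform before handing the text to a wide-character API.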

Upvotes: -1

Adam Luchjenbroers

Reputation: 5029

You can work with Unicode in Python either by marking string literals as Unicode (e.g. u'Hello World') or by using the encode() method that all strings have.

For example, assuming you have a Unicode string, aStringVariable:

aStringVariable.encode('utf-8')

will convert it to UTF-8. 'utf-16' will give you UTF-16, and 'ascii' will convert it to a plain ASCII string (this raises UnicodeEncodeError if the string contains non-ASCII characters).
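A small sketch of what those three encodings produce for one sample string. The variable names and the sample text are illustrative only, not taken from the answer.

```python
# Sample Unicode string containing one non-ASCII character (e with acute).
text = u"caf\u00e9"

utf8_bytes = text.encode("utf-8")    # the accented letter takes two bytes
utf16_bytes = text.encode("utf-16")  # 2-byte BOM, then 2 bytes per character

# Plain ASCII cannot represent the accented letter, so a strict encode fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# With the "replace" error handler the offending character becomes "?".
ascii_replaced = text.encode("ascii", "replace")
```

Note that encode() goes from Unicode text to bytes. Captured pipe output is already bytes, so the reading side needs the opposite call, decode().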


Upvotes: 0

sorin

Reputation: 170758

First, please remember that the Windows console may not fully support Unicode.

The example below makes Python write to stdout and stderr using UTF-8. You can change it to another encoding if you prefer.

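#!/usr/bin/python
# -*- coding: UTF-8 -*-
# Python 2 only: sys.setdefaultencoding() does not exist in Python 3.

import codecs, sys

# site.py deletes sys.setdefaultencoding at startup; reload(sys) restores it.
reload(sys)
sys.setdefaultencoding('utf-8')

print sys.getdefaultencoding()

# Wrap both streams so that text written to them is encoded as UTF-8.
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

print "This is an Е乂αmp١ȅ testing Unicode support using Arabic, Latin, Cyrillic, Greek, Hebrew and CJK code points."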
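To see what the codecs.getwriter wrapper does to a stream, here is a minimal sketch written for Python 3 (an assumption; the script above is Python 2). An in-memory io.BytesIO object stands in for the redirected pipe, and the names pipe, writer, captured and decoded are illustrative.

```python
import codecs
import io

# Stand-in for the pipe that the host application reads from.
pipe = io.BytesIO()

# Same wrapper as in the answer: text written to it is encoded to UTF-8
# and the resulting bytes are passed to the underlying byte stream.
writer = codecs.getwriter("utf-8")(pipe)
sample = "Е乂αmp١ȅ"
writer.write(sample)

# What the reading end of the pipe would receive: raw UTF-8 bytes.
captured = pipe.getvalue()

# The reader recovers the original text by decoding with the same encoding.
decoded = captured.decode("utf-8")
```

Because the writer side is fixed to UTF-8, the reading side no longer has to guess the encoding; it can decode as UTF-8 and then convert to whatever wide-character form the GUI control expects.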

Upvotes: 9
