Reputation: 21
I use an anonymous pipe to capture all of stdout and stderr, then print the result into a rich edit control. It works fine when I use wsprintf, but Python outputs multibyte characters, which is really annoying. How can I convert all of this output to Unicode?
UPDATE 2010-01-03:
Thank you for the reply, but it seems str.encode() only works with print xxx statements. If there is an error during py_runxxx(), my redirected stderr captures the error message as a multibyte string. Is there a way to make Python output its messages as Unicode? There seems to be an available solution in this post.
I'll try it later.
Upvotes: 2
Views: 3164
Reputation: 4315
wsprintf?
This seems to be a "C/C++" question rather than a Python question.
The Python interpreter always writes bytestrings to stdout/stderr, rather than unicode (or "wide") strings. This means Python first encodes all unicode data using the current encoding (likely sys.getdefaultencoding()).
If you want to get at stdout/stderr as unicode data, you must decode it yourself using the right encoding.
Your favourite C/C++ library certainly has what it takes to do that.
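To illustrate the decoding step, here is a minimal sketch in Python 3 syntax rather than C/C++. The helper name decode_captured and the cp1251 sample are assumptions made up for the example. In a real program the bytes would come from your pipe, and the encoding would be whatever the child process actually used.

```python
def decode_captured(raw, encoding="utf-8"):
    """Decode bytes read from a redirected stdout/stderr pipe into text.

    Hypothetical helper: 'encoding' must match what the writer used.
    """
    # errors="replace" substitutes U+FFFD for bytes that are not valid in
    # the chosen encoding instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")


# Stand-in for bytes captured from the pipe (assumed to be cp1251 here).
raw = "Ошибка: файл не найден\n".encode("cp1251")

text = decode_captured(raw, "cp1251")   # right encoding: text is recovered
wrong = decode_captured(raw, "utf-8")   # wrong encoding: replacement chars
```

Decoding with the wrong encoding does not fail here, it only produces replacement characters, so garbled output in the rich edit control usually points to an encoding mismatch rather than a broken pipe.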
Upvotes: -1
Reputation: 5029
You can work with Unicode in Python either by marking strings as Unicode (i.e. u'Hello World') or by using the encode() method that all strings have.
E.g., assuming you have a Unicode string, aStringVariable, then aStringVariable.encode('utf-8') will convert it to UTF-8. 'utf-16' will give you UTF-16, and 'ascii' will convert it to a plain old ASCII string (raising UnicodeEncodeError if the string contains non-ASCII characters).
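A small sketch of those encode() calls, written so that it also runs under Python 3. The sample string and variable names are assumptions for illustration only.

```python
# A Unicode string with one non-ASCII character.
text = u"Hello W\u00f6rld"

utf8_bytes = text.encode("utf-8")     # the o-umlaut becomes two bytes
utf16_bytes = text.encode("utf-16")   # starts with a byte order mark

# Strict ASCII encoding fails on the non-ASCII character.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Passing "replace" as the error handler substitutes "?" instead of raising.
ascii_lossy = text.encode("ascii", "replace")
```

Note that encode() goes from Unicode to bytes. Turning captured bytes back into Unicode is the opposite direction, decode().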
For more information, see:
Upvotes: 0
Reputation: 170758
First, please remember that the Windows console may not fully support Unicode.
The example below makes Python write to stderr and stdout using UTF-8. If you want, you can change it to another encoding.
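#!/usr/bin/python
# -*- coding: UTF-8 -*-
# Python 2 only: the reload() builtin and sys.setdefaultencoding() are not available in Python 3.
import codecs, sys
# site.py removes setdefaultencoding at startup; reload(sys) brings it back.
reload(sys)
sys.setdefaultencoding('utf-8')
print sys.getdefaultencoding()
# Wrap both streams so that text written to them is encoded as UTF-8.
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)
print "This is an Е乂αmp١ȅ testing Unicode support using Arabic, Latin, Cyrillic, Greek, Hebrew and CJK code points."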
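The codecs.getwriter wrapping can also be tried without touching the real streams. This is only a sketch: an in-memory io.BytesIO stands in for the pipe, and make_utf8_writer is a hypothetical helper name, not something from the standard library.

```python
import codecs
import io


def make_utf8_writer(byte_stream):
    """Wrap a byte stream so that text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)


# In-memory stand-in for the pipe that the host application reads from.
buf = io.BytesIO()
writer = make_utf8_writer(buf)
writer.write(u"Е乂αmp١ȅ\n")

captured = buf.getvalue()           # raw UTF-8 bytes, as the pipe reader sees them
decoded = captured.decode("utf-8")  # what the reader does to get Unicode back
```

Once the encoding on the writing side is pinned to UTF-8 like this, the program reading the pipe knows exactly which encoding to decode with.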
Upvotes: 9