Boyang

Reputation: 2566

Basic Unicode encoding/decoding

Python 2.7.9 / Windows environment

When I run

print myString

I'm seeing:

u'\u5df1\u6b66\u8d2a\u5929\u66f2'

I know the console I'm using (git-bash) is capable of displaying Unicode. How can I encode (or decode, whichever is the right process) myString so that it displays:

己武贪天曲

I understand that the question is very basic. If anyone has good introductory material or a reference, links would be most welcome.

Upvotes: 1

Views: 118

Answers (3)

jfs

Reputation: 414079

What you see is the result of print repr(u'\u5df1\u6b66\u8d2a\u5929\u66f2'). If isinstance(myString, (str, unicode)) is true, then find the place where the string is defined and fix it there. If myString is some other type, look at how its __str__, __repr__, and __unicode__ methods are defined. To fix it, remove the code that calls repr() unnecessarily (it can hide in a formatting operation, e.g., "%r" % o).
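A minimal sketch of the difference, not the asker's actual code (the variable names are made up for illustration). "%r" goes through repr() and yields a quoted debugging form, while "%s" keeps the characters themselves:

```python
# Hypothetical stand-in for the asker's string.
text = u'\u5df1\u6b66\u8d2a\u5929\u66f2'

# "%r" calls repr(): the result is a quoted, debugging-oriented form.
# On Python 2.7 it is the escaped u'\u5df1...' text shown in the question.
as_repr = u"%r" % text

# "%s" inserts the characters unchanged.
as_text = u"%s" % text
```

Printing as_text is what should display 己武贪天曲 in a console that supports it; printing as_repr reproduces the symptom.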

To check whether your environment supports Unicode, run: print u'\u5929'. It should produce 天.

If your input is a Python literal and you can't change it (you should at the very least try to switch it to JSON format), then you could use ast.literal_eval(r"u'\u5929'") to get a unicode string object:

import ast

print ast.literal_eval(myString)
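A slightly fuller sketch of the same idea, assuming the input really is the text of a Python literal (the name raw and its contents are hypothetical, chosen to match the question):

```python
import ast

# Hypothetical input: the *text* of a Python literal, e.g. read from a file.
# The raw string keeps each backslash as a literal character.
raw = r"u'\u5df1\u6b66\u8d2a\u5929\u66f2'"

# literal_eval evaluates only literals (no arbitrary code) and returns
# the string object that the literal describes.
parsed = ast.literal_eval(raw)

# parsed now holds the five characters themselves; len(parsed) is 5.
```

ast.literal_eval is preferable to eval here because it refuses anything that is not a plain literal.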

Upvotes: 3

dgsleeps

Reputation: 700

You should try this:

message = u'\\u5df1\\u6b66\\u8d2a\\u5929\\u66f2'
print message.decode('unicode-escape')

I guess you are missing a "\" on every desired character.
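A hedged variant of the same approach, assuming the string really contains literal backslash-u sequences (the names escaped and decoded are illustrative). Encoding to ASCII explicitly avoids relying on the implicit ASCII encoding that Python 2 performs when .decode() is called on a unicode object, and the same lines also run on Python 3:

```python
# Hypothetical input: backslash, 'u', and four hex digits as literal characters.
escaped = u'\\u5df1\\u6b66\\u8d2a\\u5929\\u66f2'

# Encode to ASCII bytes first, then let the unicode_escape codec
# interpret each \uXXXX sequence as the character it names.
decoded = escaped.encode('ascii').decode('unicode_escape')

# decoded now holds the five characters; len(decoded) is 5.
```

This only applies if the escapes are literally part of the data; if myString is already a proper unicode object, no decoding is needed.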

Upvotes: 0

Aakash

Reputation: 83

You should use the encode method. Consider this example:

text = 'hello'
print(text.encode(encoding='base64'))

For the list of available encodings, see:

https://docs.python.org/2/library/codecs.html#standard-encodings

Upvotes: -2
