Reputation: 121992
I have a Unicode string, κανω, but due to preprocessing by other software that I can't change, it becomes the literal string '\u03ba\u03b1\u03bd\u03c9' instead of u'\u03ba\u03b1\u03bd\u03c9'.
How can I change '\u03ba\u03b1\u03bd\u03c9' back to u'\u03ba\u03b1\u03bd\u03c9'?
I've tried:
>>> x = '\u03ba\u03b1\u03bd\u03c9'
>>> print x
\u03ba\u03b1\u03bd\u03c9
>>> print x.decode('utf8')
\u03ba\u03b1\u03bd\u03c9
>>> print x.encode('utf8')
\u03ba\u03b1\u03bd\u03c9
>>> print unicode(x)
\u03ba\u03b1\u03bd\u03c9
I can't go through every string output and add the u'...' prefix by hand, i.e. I need to avoid doing this:
>>> x = u'\u03ba\u03b1\u03bd\u03c9'
>>> print x
κανω
Upvotes: 1
Views: 109
Reputation: 107287
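You need to decode the string with the 'unicode_escape' codec. Encoding with it produces "a string that is suitable as Unicode literal in Python source code"; decoding does the reverse and turns the literal \uXXXX sequences back into the characters they name: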
>>> s='\u03ba\u03b1\u03bd\u03c9'
>>> print unicode(s,'unicode_escape')
κανω
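If you need the same conversion in many places, it may help to wrap it in a small helper. The following is only a sketch: `unescape_literal` is a hypothetical name, and it uses `codecs.decode` rather than `unicode(...)` so that it should also run on Python 3. The backslashes are doubled so that the sample contains a literal backslash followed by `u` on either version, which is what the preprocessed data looks like. It assumes the input holds only ASCII characters plus escape sequences, as in the question.

```python
import codecs


def unescape_literal(raw):
    """Decode literal backslash-u escape sequences into a Unicode string.

    Hypothetical helper; assumes `raw` is ASCII text containing sequences
    such as a backslash followed by u03ba.
    """
    return codecs.decode(raw, 'unicode_escape')


# Doubled backslashes give a literal backslash + 'u' on Python 2 and 3.
raw = '\\u03ba\\u03b1\\u03bd\\u03c9'
decoded = unescape_literal(raw)

# The result should be equal to the Unicode literal from the question.
assert decoded == u'\u03ba\u03b1\u03bd\u03c9'
```

Applying the helper to each string as it comes out of the preprocessing step avoids editing the literals by hand.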
Upvotes: 4