Reputation: 92099
I get the following string from a database:
'23:45 \xe2\x80\x93 23:59'
and the output should look like
'23:45 - 23:59'
How can I decode this? I tried UTF-8 decoding, but had no luck:
>>> x.decode("utf-8")
u'23:45 \u2013 23:59'
Thank you
Upvotes: 1
Views: 2812
Reputation: 73
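from unidecode import unidecode  # third-party package: pip install unidecode
a = u"NOV–DEC 2011"  # contains an en dash (U+2013)
b = unidecode(a)
# b is now "NOV-DEC 2011" (with an ASCII hyphen)
You need to install unidecode first and then import it, as shown in the first line. I've tried it and it works well. Hope it helps!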
Upvotes: 1
Reputation: 3956
The UTF-8 encoding of an en dash (U+2013, http://www.fileformat.info/info/unicode/char/2013/index.htm) is the three bytes 0xE2 0x80 0x93 (e28093); once decoded, it is the single Unicode character u"\u2013". It sounds like you want to replace the en dash with an ASCII hyphen-minus (0x2d) before storing it in the variable. That's OK, but the variable will then no longer contain the same character that is stored in the database, any more than if you replaced a Ü (http://www.fileformat.info/info/unicode/char/dc/index.htm) with an ASCII U, or a zero (0x30) with a capital O (0x4f).
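A minimal sketch of that replacement, assuming the value arrives from the database as UTF-8 encoded bytes. The variable names are only illustrative, and the snippet is written so that it should run the same way on Python 2.6+ and Python 3:

```python
# Bytes as they come from the database; the middle three bytes are a
# UTF-8 encoded en dash.
raw = b'23:45 \xe2\x80\x93 23:59'

# Decode the bytes to text: the three bytes become the single character U+2013.
text = raw.decode('utf-8')

# Replace the en dash with an ASCII hyphen-minus (0x2d).
ascii_text = text.replace(u'\u2013', u'-')

print(ascii_text)  # 23:45 - 23:59
```

Note that ascii_text no longer matches what is stored in the database, so keep the decoded original around if you ever need to write the value back unchanged.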
Upvotes: 1
Reputation: 318568
This is completely correct. The interactive Python interpreter displays the repr() of the string, which shows non-ASCII characters as escape sequences. If you want to see it as a proper string, print it:
>>> print '23:45 \xe2\x80\x93 23:59'
23:45 – 23:59
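The same distinction can be sketched outside the interactive prompt. This version assumes Python 3, where the built-in ascii() produces the kind of escaped representation that the Python 2 session above shows for repr(), and it assumes a terminal that can display UTF-8:

```python
# Decode the database bytes into text (assumed to be UTF-8 encoded).
raw = b'23:45 \xe2\x80\x93 23:59'
text = raw.decode('utf-8')

# The escaped representation: non-ASCII characters appear as \u escapes.
escaped = ascii(text)
print(escaped)  # '23:45 \u2013 23:59'

# Printing the text itself writes the real en dash character.
print(text)

# Both forms describe the same 13-character string; only the display differs.
print(len(text))  # 13
```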
Upvotes: 7