daydreamer

Reputation: 92099

python decoding string issues

I get the following string from the database:

'23:45 \xe2\x80\x93 23:59'  

and the output should look like

'23:45 - 23:59'  

How can I decode this? I tried UTF-8 decoding, but the result still is not what I want:

>>> x.decode("utf-8")
u'23:45 \u2013 23:59'

Thank you

Upvotes: 1

Views: 2812

Answers (3)

rassel pratomo

Reputation: 73

from unidecode import unidecode

a = u"NOV\u2013DEC 2011"  # \u2013 is the en dash
b = unidecode(a)

# output --> NOV-DEC 2011 (with hyphen)

You need to install unidecode first (it is a third-party package) and import it. I've tried it and it works well. Hope it helps!

Upvotes: 1

Dave

Reputation: 3956

The UTF-8 encoding of an "en dash" ( http://www.fileformat.info/info/unicode/char/2013/index.htm ) is the three bytes 0xE2 0x80 0x93 (e28093); the character itself is u"\u2013". It sounds like you want to replace the en dash with an ASCII hyphen-minus (0x2d) before storing it in the variable. That's OK, but the variable will then no longer contain the same character that is stored in the database, any more than if you replaced a Ü ( http://www.fileformat.info/info/unicode/char/dc/index.htm ) with an ASCII U, or replaced a zero (0x30) with a capital O (0x4f).
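
If that replacement is really what you want, one way to do it might look like the sketch below. It assumes the database hands back UTF-8 encoded bytes; the variable names (raw, text, plain) are only illustrative and do not come from the question.

```python
# Sketch: decode the UTF-8 bytes, then swap the en dash for an ASCII hyphen.
# Assumption: the value from the database is a UTF-8 encoded byte string.
raw = b'23:45 \xe2\x80\x93 23:59'

# Decoding turns the three bytes e2 80 93 into the single character U+2013.
text = raw.decode('utf-8')

# Replace the en dash with a plain hyphen-minus (0x2d).
plain = text.replace(u'\u2013', u'-')

print(plain)  # 23:45 - 23:59
```

Decoding first and replacing afterwards keeps the substitution at the character level, so it does not depend on how the en dash happens to be encoded as bytes.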

Upvotes: 1

ThiefMaster

Reputation: 318568

That output is completely correct. The interactive Python interpreter displays the repr() of the string, which shows non-ASCII characters as escape sequences. If you want to see it as a proper string, print it:

>>> print '23:45 \xe2\x80\x93 23:59'
23:45 – 23:59

Upvotes: 7
