Reputation: 603
I have tried numerous ways to encode this string to get the end result "BACK RUSHIN'", with the most important character being the right apostrophe '.
I would like a way of getting to this end result using Python's built-in functions, one that works the same whether the input is a normal string or a unicode string.
This was the code I was using to retrieve the string: str(unicode(etree.tostring(root.xpath('path')[0],method='text', encoding='utf-8'),errors='ignore')).strip()
With the result being: 'BACK RUSHIN'
The problem is that the apostrophe ' is missing.
Another way was: root.xpath('path/text()')
And that result was: u'BACK RUSHIN\u2019' in Python.
Lastly if I try: u'BACK RUSHIN\u2019'.encode('ascii', 'replace')
The result is: 'BACK RUSHIN?'
Please no replace functions; I would like to make use of Python's codec libraries. Also, printing the string is not an option, because it is being held in a variable.
Thanks
Upvotes: 2
Views: 10590
Reputation: 799570
>>> import unidecode
>>> unidecode.unidecode(u'BACK RUSHIN\u2019')
"BACK RUSHIN'"
Upvotes: 18
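Note that unidecode is a third-party package rather than part of the standard library. If the goal is to stay within Python's own codec machinery, as the question asks, one possible approach is a custom error handler registered with codecs.register_error. The following is only a minimal sketch, written for Python 3 (the u'' prefixes are kept to match the literals in the question). The handler name ascii_fallback and the ASCII_FALLBACKS table are hypothetical names chosen for illustration, and the table covers just a few typographic quotes.

```python
import codecs

# Hypothetical lookup table (an assumption for this sketch): typographic
# quotes mapped to their closest ASCII equivalents. Extend as needed.
ASCII_FALLBACKS = {
    u"\u2018": u"'",   # left single quotation mark
    u"\u2019": u"'",   # right single quotation mark (the one in the question)
    u"\u201c": u'"',   # left double quotation mark
    u"\u201d": u'"',   # right double quotation mark
}


def ascii_fallback(error):
    """Codec error handler: substitute known characters, '?' for the rest."""
    if not isinstance(error, UnicodeEncodeError):
        # This sketch only handles encoding errors.
        raise error
    # The codec may report a run of several unencodable characters at once.
    bad = error.object[error.start:error.end]
    replacement = u"".join(ASCII_FALLBACKS.get(ch, u"?") for ch in bad)
    # Return the substitute text and the position where encoding resumes.
    return replacement, error.end


# Register the handler under a name usable as the `errors` argument.
codecs.register_error("ascii_fallback", ascii_fallback)

result = u"BACK RUSHIN\u2019".encode("ascii", "ascii_fallback")
```

Once registered, the handler name can be passed anywhere 'ignore' or 'replace' would go, so the value stays in a variable and nothing has to be printed. Characters missing from the table still come out as '?', which is where a broader transliteration table such as the one in unidecode has the advantage.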