Reputation: 2658
Environment: Python 2.6 ... Python 2.higher-than-6
I have valid u'' Unicode strings that I need to convert into ASCII-escaped form, as standard Python 2.6-style ASCII strings. Like so:
def conversionSolution(utf8StringInput):
    ...
    return asciiStringResult
utf8string = u'\u5f00\u80c3\u83dc'
asciistring = conversionSolution(utf8string)
print asciistring
With ... filled in, the above would print out the ASCII backslash-escaped text and not the UTF-8 characters themselves.
Let me emphasize that I do not want the UTF-8 here; I specifically require 0-127 encoded ASCII backslash data that I can subsequently manipulate strictly as 7-bit ASCII.
Upvotes: 0
Views: 123
Reputation: 168706
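def conversionSolution(utf8StringInput):
    # repr() of a Python 2 unicode string looks like u'\u5f00...';
    # drop the leading u' and the trailing '
    return repr(utf8StringInput)[2:-1]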
utf8string = u'\u5f00\u80c3\u83dc'
asciistring = conversionSolution(utf8string)
print asciistring
Upvotes: 1
Reputation: 5751
You could call .encode('unicode-escape') to do this.
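As a rough sketch of what that looks like with the example from the question (the function name conversion_solution is just illustrative, and 'unicode_escape' is the same codec spelled with an underscore):

```python
from __future__ import print_function


def conversion_solution(text):
    # The unicode_escape codec emits only 7-bit ASCII: non-ASCII
    # characters become \uXXXX backslash escapes.
    return text.encode('unicode_escape')


utf8string = u'\u5f00\u80c3\u83dc'
asciistring = conversion_solution(utf8string)
print(asciistring)

# Decoding with the same codec restores the original unicode string.
roundtrip = asciistring.decode('unicode_escape')
```

On Python 2 the result is a plain str; on Python 3 the same call returns bytes.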
That being said, you're talking about manipulating that string afterwards, and there is not much useful you can do with it in that form. E.g. if you slice it, you may cut through the middle of one of these escape sequences. Case folding of course doesn't work, etc. If you need to manipulate the string, you should keep it as a unicode string.
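To illustrate the slicing problem, here is a small sketch (the variable names are made up for the example). Each non-ASCII character turns into a six-byte escape, so a slice by position can land inside one:

```python
text = u'\u5f00\u80c3'
escaped = text.encode('unicode_escape')  # 12 ASCII bytes for 2 characters

# Taking the first 4 bytes cuts the first \uXXXX escape in half.
chunk = escaped[:4]
try:
    chunk.decode('unicode_escape')
    broken = False
except UnicodeDecodeError:
    # The truncated escape can no longer be decoded.
    broken = True

# Slicing the unicode string first, then escaping, keeps the escape whole.
safe = text[:1].encode('unicode_escape')
```

Doing the manipulation on the unicode string and escaping only at the very end avoids this.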
Upvotes: 1