Reputation: 1210
I received JSON data in which some Unicode characters are escaped and others are not.
>>> example = r'сло\u0301во'
What is the best way to unescape those characters? In the example below, what would the function unescape
look like? Is there a built-in function that does this?
>>> unescape(example)
сло́во
Upvotes: 1
Views: 1474
Reputation: 1210
This solution assumes that every instance of \u in the original string is a Unicode escape:
def unescape(in_str):
    """Unicode-unescape string with only some characters escaped."""
    in_str = in_str.encode('unicode-escape')   # bytes with all chars escaped (the original escapes have the backslash escaped)
    in_str = in_str.replace(b'\\\\u', b'\\u')  # unescape the \
    in_str = in_str.decode('unicode-escape')   # unescape unicode
    return in_str
...or in one line...
def unescape(in_str):
    """Unicode-unescape string with only some characters escaped."""
    return in_str.encode('unicode-escape').replace(b'\\\\u', b'\\u').decode('unicode-escape')
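If the assumption about \u does not hold (for example, the text might contain a Windows path such as C:\users), a regex-based variant may be safer. The sketch below is not part of the solution above; the name unescape_regex is made up for illustration. It only rewrites a backslash followed by u and exactly four hex digits, and leaves every other backslash alone. It does not combine surrogate pairs into a single character.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')

def unescape_regex(in_str):
    """Replace only well-formed four-digit Unicode escapes (hypothetical helper)."""
    # Convert the captured hex digits to the corresponding code point.
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), in_str)

example = r'сло\u0301во'
print(unescape_regex(example))  # сло́во
```

With this variant, a string like r'C:\users' comes back unchanged, whereas the encode/replace/decode approach would raise an error on it because \users is not a valid escape.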
Upvotes: 1