IronPillow2

Reputation: 53

Convert raw string (having escape characters) to unicode/utf8 string

In Python 3, how do I convert an ASCII raw string (one that contains escape sequences) into a proper Unicode string?

As an example:

a = "ä"                         # note the umlaut
b = bytearray( a, "utf8" )      # yields: bytearray(b'\xc3\xa4')
s = r'\xc3\xa4'                 # note it's a raw string

In the example you can see how my source string s derives from the unicode string a, informed by b. The goal is to find a function, F, such that a == F(s). Thanks for your help!

I tried every combination of encode and decode and codecs that I could think of. Note, in particular, that the following yields False:

a == s.encode('latin-1').decode('unicode-escape')

Upvotes: 1

Views: 155

Answers (1)

Mark Ransom

Reputation: 308530

You were so close!

s.encode('latin-1').decode('unicode-escape').encode('latin-1').decode('utf-8')
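To see why the extra two steps are needed, here is a rough step-by-step sketch of that one-liner. The intermediate variable names are only for illustration, and F is just the placeholder name from the question. It assumes s is pure ASCII and contains only \xNN escapes that together spell out valid UTF-8; anything else may raise an encoding or decoding error.

```python
a = "ä"
s = r'\xc3\xa4'   # 8 ASCII characters: backslash, x, c, 3, backslash, x, a, 4

# Step 1: turn the text into bytes so the unicode-escape codec can read it.
raw = s.encode('latin-1')                # b'\\xc3\\xa4'

# Step 2: interpret the \xNN escapes. Each escape becomes one *character*
# with that code point (U+00C3 and U+00A4), not one byte.
# This is where the attempt in the question stopped.
mojibake = raw.decode('unicode-escape')

# Step 3: latin-1 maps code points 0-255 one-to-one onto byte values,
# so this recovers the original UTF-8 bytes.
utf8_bytes = mojibake.encode('latin-1')  # b'\xc3\xa4'

# Step 4: decode those bytes as UTF-8.
result = utf8_bytes.decode('utf-8')


def F(text):
    """Illustrative wrapper around the same four steps."""
    return (text.encode('latin-1')
                .decode('unicode-escape')
                .encode('latin-1')
                .decode('utf-8'))


print(result == a, F(s) == a)
```

The unicode-escape step alone gives you a two-character string whose code points happen to equal the UTF-8 byte values, which is why the comparison in the question came out False.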

Upvotes: 3
