Python: raw_unicode_escape doesn't write raw value for "é"

Question

I have a script that writes a bunch of text as raw Unicode escapes in order to handle all the different languages.

The script works fine for ASCII characters and for non-Latin-based languages (Hindi, Chinese, etc.).

However, it fails to write the raw values for characters such as "é" and "è".

Instead of writing the raw Unicode value \u00E9, it writes "é" to the file, which in turn shows up as a diamond with a question mark on the webpage.

import codecs

f = codecs.open(newFilePathAndName(path, filename, language), encoding='raw_unicode_escape', mode='w')
...
f.write(outputString)

When I do a "print" in my script, it displays the character é as \xe9.

Any ideas?

The only thing that comes to mind is a regex that replaces \x with \u00.

Martijn Pieters · Accepted Answer

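The raw_unicode_escape encoding indeed does not provide escapes for code points up to and including 0xFF; these values are not normally escaped in a raw Python unicode literal. Instead, they are written out as single Latin-1 bytes, which is why "é" ends up in your file as the bare byte 0xE9.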
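As a rough illustration of that behaviour, here is a small sketch. The sample string is made up, and the last step assumes the web page is decoded as UTF-8, which the question does not actually state:

```python
# Sample text (hypothetical): "é" (U+00E9) and a CJK character (U+4E2D).
text = u"\u00e9 \u4e2d"

raw = text.encode("raw_unicode_escape")

# U+00E9 is in the Latin-1 range, so it comes out as the single byte 0xE9.
# U+4E2D is outside that range, so it comes out as a \uXXXX escape.
assert raw == b"\xe9 \\u4e2d"

# Assumption: the page is read as UTF-8. A lone 0xE9 byte is not valid
# UTF-8, so a lenient decoder substitutes U+FFFD, the replacement
# character that browsers draw as a diamond with a question mark.
shown = raw.decode("utf-8", "replace")
assert shown == u"\ufffd \\u4e2d"
```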

Use the unicode_escape encoding instead; it escapes every non-ASCII character, writing code points up to 0xFF in the \xhh form:

>>> print u'\u00e9'.encode('unicode_escape')
\xe9
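If the consumer of the file really needs the \u00e9 form rather than \xe9, one option is to build the escapes by hand instead of post-processing with a regex. The following is only a sketch: escape_non_ascii is a hypothetical helper name, not part of the codecs module, and unlike unicode_escape it leaves backslashes and control characters untouched.

```python
def escape_non_ascii(text):
    # Hypothetical helper: keep ASCII characters as they are and write
    # everything else as a backslash-u escape with four hex digits
    # (or backslash-U with eight hex digits beyond the BMP).
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append(u"\\u%04x" % code)
        else:
            out.append(u"\\U%08x" % code)
    return u"".join(out)


escaped = escape_non_ascii(u"caf\u00e9 \u4e2d")
assert escaped == u"caf\\u00e9 \\u4e2d"
```

The result contains only ASCII characters, so it can be written to a file opened with a plain 'ascii' or 'utf-8' encoding.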
