Reputation: 1
I have to parse some XML output (from a request to a web site) like the sample below. The text is partly in English, partly in French. I am not able to decode and print (on screen or to a file) the French accented characters like 'é' or 'à'.
When I use decode('utf-8'), I get a wrong result like 'è'. I am using Python 3.3.
b'Extr\xc3\x83\xc2\xaamement fort et incroyablement pr\xc3\x83\xc2\xa8s</title><originaltitle>Extremely Loud And Incredibly Close</originaltitle><year>2011</year><runtime>0</runtime><directors><director>Stephen Daldry</director></directors><plot>Oskar Schell, 11 ans, est un jeune New-Yorkais \xc3\x83\xc2\xa0 l\'imagination d\xc3\x83\xc2\xa9bordante. Un an apr\xc3\x83\xc2\xa8s la...</plot></movie></results>\n'
Upvotes: 0
Views: 163
Reputation: 140210
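The byte string you pasted is double-encoded: the original UTF-8 bytes were decoded as ISO-8859-1 and then encoded as UTF-8 again. Reversing that round trip,
byteStrInYourQuestion.decode('utf-8').encode("ISO-8859-1").decode("utf-8")
should work.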
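As a minimal sketch of that round trip, here it is applied to the first fragment of the bytes from the question. The helper name `fix_double_encoded` and the variable `sample` are made up for illustration, and the sketch assumes the whole payload was double-encoded the same way (UTF-8 read as ISO-8859-1, then encoded as UTF-8 again).

```python
def fix_double_encoded(raw: bytes) -> str:
    """Undo one extra UTF-8 encoding layer (hypothetical helper).

    Assumes the original UTF-8 bytes were misread as ISO-8859-1
    and then encoded as UTF-8 a second time.
    """
    # Step 1: decode the outer UTF-8 layer, giving mojibake text.
    mojibake = raw.decode("utf-8")
    # Step 2: map each character back to the single byte it came from.
    original_bytes = mojibake.encode("iso-8859-1")
    # Step 3: decode the original UTF-8 bytes into proper text.
    return original_bytes.decode("utf-8")


# First fragment of the byte string from the question.
sample = b'Extr\xc3\x83\xc2\xaamement fort et incroyablement pr\xc3\x83\xc2\xa8s'
fixed = fix_double_encoded(sample)
print(fixed)
```

If the bytes are not actually double-encoded, the second or third step will most likely raise a `UnicodeEncodeError` or `UnicodeDecodeError`, so it may be worth catching those and falling back to a plain `decode('utf-8')`. When writing the result to a file, passing `encoding="utf-8"` to `open()` avoids depending on the platform's default encoding.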
Upvotes: 5