Reputation: 4818
I know this question has been asked a thousand times, but I am close to a nervous breakdown, so I can't help asking for help.
I have an email containing French accented characters. The sentence is:
"Céline : Berlin Annette : 0633".
Python's email package changes
':' to '=3A'
"é" to "=E9".
How do I get the accent and the ':' back, instead of these "=" codes?
I tried several things I found on the net:
Getting the payload:
>>> z = msg.get_payload()
>>> z
'C=E9line =3A Berlin Annette =3A 0633'
>>> infos(z)
(<type 'str'>, 'C=E9line =3A Berlin Annette =3A 0633')
Decoding it with its charset:
>>> z = msg.get_payload().decode(msg.get_content_charset())
>>> z
u'C=E9line =3A Berlin Annette =3A 0633'
>>> infos(z)
(<type 'unicode'>, u'C=E9line =3A Berlin Annette =3A 0633')
Or encoding it as UTF-8 after decoding:
>>> z = msg.get_payload().decode(msg.get_content_charset()).encode('utf-8')
>>> z
'C=E9line =3A Berlin Annette =3A 0633'
>>> infos(z)
(<type 'str'>, 'C=E9line =3A Berlin Annette =3A 0633')
I also tried urllib:
>>> urllib.unquote(z)
'C=E9line =3A 00493039746784 Berlin Annette =3A 0633'
Nothing seems to work :(
Upvotes: 5
Views: 8552
Reputation: 368904
The payload is quoted-printable encoded. You can use quopri.decodestring to decode the string.
>>> import quopri
>>> quopri.decodestring('C=E9line =3A 00493039746784 Berlin Annette =3A 0633')
'C\xe9line : 00493039746784 Berlin Annette : 0633'
If you pass decode=True to Message.get_payload, it will do the above for you:
msg.get_payload(decode=True)
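To tie this back to the attempts in the question: .decode(charset) only converts bytes to text and does not undo the Content-Transfer-Encoding, which is why the "=E9" sequences survived. Here is a minimal sketch of the whole round trip. The raw message and its headers are made up for illustration; in particular the iso-8859-1 charset is an assumption (it is the encoding in which "é" is the byte E9), so use whatever msg.get_content_charset() reports for your real message.

```python
import email
import quopri

# Hypothetical raw message; the headers are assumed for illustration.
raw = (
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "C=E9line =3A Berlin Annette =3A 0633\n"
)

msg = email.message_from_string(raw)

# Step 1: undo the transfer encoding (quoted-printable here).
# The result is a byte string, still encoded in the message's charset.
payload_bytes = msg.get_payload(decode=True)

# Step 2: turn the bytes into text using the declared charset.
# Falling back to ASCII is an assumption for messages without a charset.
charset = msg.get_content_charset() or "ascii"
text = payload_bytes.decode(charset)

# The same two steps done by hand with quopri on the raw payload.
manual_text = quopri.decodestring(
    b"C=E9line =3A Berlin Annette =3A 0633"
).decode("iso-8859-1")
```

After this, text holds the readable sentence as a unicode string. If you need a UTF-8 byte string, call text.encode('utf-8') as the last step, not before the quoted-printable decoding.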
Upvotes: 9