Reputation: 285
I'm using Python 3 and want to validate emails sent to my inbox. I'm using imaplib and have managed to get the email content, but the mail is unreadable and looks corrupted (the variable html123 in the code below) after I fetch it and parse the content with: mail_body = email.message_from_string(str(data[1][0][1], 'utf-8'))
This is the original mail as I see it in the mailbox:
dear blabla, We’ve added new tasks to your account. Please log in to your account to review and.....
And this is what I get after fetching it:
dear blabla, We=E2=80=99ve added new tasks to your account. Plea= se log in to your account....
So there are 3 issues in this example (I have many more in the real mail):
1 - the ’ was replaced with =E2=80=99
2 - the word "please" is cut at the end of the line with =
3 - all the signs/chars || --- you see above
This is the relevant part of the code:
data = self.mail_conn.fetch(str(any_email_id), f'({fetch_protocol})')
mail_body = email.message_from_string(str(data[1][0][1], 'utf-8'))
html123 = mail_body.get_payload()
x1 = (html2text.html2text(html123))
Upvotes: 0
Views: 64
Reputation: 689
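The body you get back from imaplib is in "quoted-printable" encoding: https://en.wikipedia.org/wiki/Quoted-printable
That is why ’ shows up as =E2=80=99 (its three UTF-8 bytes, each escaped) and why long lines are broken with a trailing = (a soft line break).
To decode it you can use the built-in quopri module: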
import quopri
quopri.decodestring(b"we=E2=80=99ve").decode("utf-8")  # -> we’ve
quopri.decodestring(b"Plea=\nse").decode("utf-8")  # soft line break removed -> Please
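Alternatively, the email package can undo the transfer encoding for you, so you don't have to call quopri yourself. Below is a minimal sketch, assuming the raw message is available as bytes (as in data[1][0][1] from the question). The helper name extract_text and the sample message are made up for illustration; get_payload(decode=True) and get_content_charset() are standard email.message.Message methods.

```python
import email


def extract_text(raw_bytes):
    """Return the first text part of a raw message, decoded to str.

    Hypothetical helper: walks the message, lets the email package undo
    quoted-printable (or base64), then decodes with the declared charset.
    """
    # Parse from bytes instead of converting to str first.
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            # decode=True reverses the Content-Transfer-Encoding.
            payload = part.get_payload(decode=True)
            # Fall back to UTF-8 if no charset is declared (an assumption).
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


# Made-up sample message mirroring the text in the question.
sample = (
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"We=E2=80=99ve added new tasks to your account. Plea=\r\n"
    b"se log in to your account.\r\n"
)
print(extract_text(sample))
```

In the question's code this would be roughly html123 = extract_text(data[1][0][1]), and the result can then be passed to html2text as before. For multipart mails you may want to pick the text/html part specifically rather than the first text part.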
Upvotes: 1