xjq233p_1

Reputation: 8060

Email body text?

Hi everyone, I am using a script that includes the following:

import oauth2 as oauth
import oauth2.clients.imap as imaplib
import email
conn = imaplib.IMAP4_SSL('imap.googlemail.com')
conn.debug = 4 

# This is the only thing in the API for imaplib.IMAP4_SSL that has 
# changed. You now authenticate with the URL, consumer, and token.
conn.authenticate(url, consumer, token)

# Once authenticated, everything from the imaplib.IMAP4_SSL class will 
# work as usual without any modification to your code.
conn.select('[Gmail]/All Mail')

response, item_ids = conn.search(None, "SINCE", "01-Jan-2011")
item_ids = item_ids[0].split()

# Now iterate over the message IDs and retrieve each email, parsing
# it and storing it in your database.

for emailid in item_ids:
    resp, data = conn.fetch(emailid, "(RFC822)") 
    email_body = data[0][1] 
    mail = email.message_from_string(email_body) 

My current problem is that I can't seem to retrieve the body of the mail instance. I can see the content of the email by printing it or calling mail.as_string(), but even with mail.keys() and mail.values() I cannot see the mail's content (the main message).

What is wrong with this email lib API (or rather, what am I doing wrong)?

Upvotes: 2

Views: 735

Answers (1)

jfs

Reputation: 414149

From the email docs:

You can pass the parser a string or a file object, and the parser will return to you the root Message instance of the object structure.

For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME messages, the root object will return True from its is_multipart() method, and the subparts can be accessed via the get_payload() and walk() methods.

So use get_payload(), or, if the message is multipart, call the walk() method and then use get_payload() on the desired subpart.
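As a rough illustration of that advice, here is a minimal sketch. The helper name extract_text_body and the sample message are made up for this example, and it assumes the first text/plain part is the body you want. Note that mail.keys() and mail.values() only expose header fields, which is why the body never shows up there. On Python 3, the bytes returned by conn.fetch() would likely need email.message_from_bytes() rather than message_from_string().

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def extract_text_body(mail):
    """Return the first text/plain body of a parsed message, or None.

    Hypothetical helper for illustration; not part of the email package.
    """
    if not mail.is_multipart():
        # Simple message: the root object's payload is the body itself.
        payload = mail.get_payload(decode=True)
        charset = mail.get_content_charset() or "utf-8"
        return payload.decode(charset, errors="replace")

    # MIME message: walk() visits the root and every subpart in turn.
    for part in mail.walk():
        if part.is_multipart():
            continue  # container parts have no body of their own
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


# Build a sample multipart message so the sketch runs without an IMAP server.
sample = MIMEMultipart("alternative")
sample["Subject"] = "Hello"
sample.attach(MIMEText("plain body", "plain"))
sample.attach(MIMEText("<p>html body</p>", "html"))

mail = email.message_from_string(sample.as_string())
body = extract_text_body(mail)
print(body)
```

In the loop from the question, the same helper could be called on each mail object right after it is parsed. Passing decode=True asks get_payload() to undo any base64 or quoted-printable transfer encoding before the bytes are decoded to text.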

Upvotes: 4
