AlliDeacon

Reputation: 1495

Extracting pdf attachment from IMAP account -- python 3.5.2

Ok, so I'm trying to save PDF attachments sent to a specific account to a specific network folder, but I'm stuck at the attachment part. I've got the following code to pull in the unseen messages, but I'm not sure how to get the "parts" to stay intact. I think I can figure this out if I can work out how to keep the email message complete. I never make it past the "Made it to walk" output. All test emails in this account include PDF attachments. Thanks in advance.

import imaplib
import email
import regex
import re

user = 'some_user'
password = 'gimmeAllyerMoney'

server = imaplib.IMAP4_SSL('mail.itsstillmonday.com', '993')
server.login(user, password)
server.select('inbox')

msg_ids=[]
resp, messages = server.search(None, 'UNSEEN')
for message in messages[0].split():
        typ, data = server.fetch(message, '(RFC822)')
        msg= email.message_from_string(str(data[0][1]))
        #looking for 'Content-Type: application/pdf
        for part in msg.walk():
                print("Made it to walk")
                if part.is_multipart():
                        print("made it to multipart")
                if part.get_content_maintype() ==  'application/pdf':
                        print("made it to content")

Upvotes: 1

Views: 3081

Answers (1)

jerry

Reputation: 539

You can use part.get_content_type() to get the full content type and part.get_payload() to get the payload as follows:

for part in msg.walk():
    if part.get_content_type() == 'application/pdf':
        # When decode=True, get_payload will return None if part.is_multipart()
        # and the decoded content otherwise.
        payload = part.get_payload(decode=True)

        # Default filename can be passed as an argument to get_filename()
        filename = part.get_filename()

        # Save the file.
        if payload and filename:
            with open(filename, 'wb') as f:
                f.write(payload)
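
Since the question asks for the PDFs to land in a specific folder, the loop above could be wrapped in a small helper. This is only a sketch under stated assumptions: the name save_pdf_attachments and its dest_dir parameter are hypothetical, the demo message is built locally instead of being fetched over IMAP, and os.path.basename is applied on the assumption that a sender-supplied filename should not be trusted to stay inside the target folder.

```python
import os
import tempfile
from email.message import EmailMessage


def save_pdf_attachments(msg, dest_dir):
    """Hypothetical helper: write each application/pdf part of msg
    into dest_dir and return the list of paths written."""
    saved = []
    for part in msg.walk():
        if part.get_content_type() != 'application/pdf':
            continue
        payload = part.get_payload(decode=True)
        # Drop any directory components from the sender-supplied name.
        filename = os.path.basename(part.get_filename() or '')
        if not payload or not filename:
            continue
        path = os.path.join(dest_dir, filename)
        with open(path, 'wb') as f:
            f.write(payload)
        saved.append(path)
    return saved


# Demo with a locally built message, so no mail server is needed.
demo = EmailMessage()
demo['Subject'] = 'report'
demo.set_content('See attached.')
demo.add_attachment(b'%PDF-1.4 demo', maintype='application',
                    subtype='pdf', filename='report.pdf')

out_dir = tempfile.mkdtemp()  # stand-in for the network folder
saved_paths = save_pdf_attachments(demo, out_dir)
print(saved_paths)
```

In the question's loop, msg would be the parsed message for each fetched ID and dest_dir the network folder path.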

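Note that, as tripleee pointed out, get_content_maintype() returns only the part before the slash, so comparing it against 'application/pdf' as the question does can never match. For a part with content type "application/pdf" you have:

>>> part.get_content_type()
'application/pdf'
>>> part.get_content_maintype()
'application'
>>> part.get_content_subtype()
'pdf'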
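
One more thing worth checking in the question's code: on Python 3, imaplib hands back the fetched message as bytes, and str() on a bytes object gives its repr (b'...') rather than decoded text. That would likely explain why walk() only ever yields a single part there. The following is a rough sketch of the difference, assuming a hand-built message in place of the server response; raw stands in for data[0][1].

```python
import email
from email.message import EmailMessage

# Build a message with a PDF attachment; `raw` stands in for data[0][1].
original = EmailMessage()
original['Subject'] = 'report'
original.set_content('See attached.')
original.add_attachment(b'%PDF-1.4 demo', maintype='application',
                        subtype='pdf', filename='report.pdf')
raw = original.as_bytes()

# str(raw) is the one-line text "b'Subject: ...'", so the Content-Type
# header is never parsed as a header and the MIME structure is lost.
broken = email.message_from_string(str(raw))

# message_from_bytes parses the bytes as they were delivered.
fixed = email.message_from_bytes(raw)

broken_types = [part.get_content_type() for part in broken.walk()]
fixed_types = [part.get_content_type() for part in fixed.walk()]
print(broken_types)
print(fixed_types)
```

If that is what is happening, replacing email.message_from_string(str(data[0][1])) with email.message_from_bytes(data[0][1]) should let the loop reach the PDF part.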

Upvotes: 1
