Reputation: 823
I need to iterate over all the mail in a Gmail inbox. I also need to download all the attachments for each mail (some mails have 4-5 attachments). I found some help here: https://stackoverflow.com/a/27556667/8996442
def save_attachments(self, msg, download_folder="/tmp"):
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()
        return att_path
But it downloads only one attachment per e-mail (although the author of that post says it normally downloads all of them, no?). The print(filename) call shows me only one attachment.
Any idea why?
Upvotes: 0
Views: 1432
Reputation: 6726
from imap_tools import MailBox
# get all attachments from INBOX and save them to files
with MailBox('imap.my.ru').login('acc', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch():
        for att in msg.attachments:
            print(att.filename, att.content_type)
            with open('/my/{}/{}'.format(msg.uid, att.filename), 'wb') as f:
                f.write(att.payload)
https://pypi.org/project/imap-tools/
*I am the library's author
Upvotes: 1
Reputation: 189337
As already pointed out in comments, the immediate problem is that return exits the for loop and leaves the function, and you do this immediately when you have saved the first attachment.
Depending on what exactly you want to accomplish, change your code so you only return when you have finished all iterations of msg.walk(). Here is one attempt which returns a list of attachment filenames:
def save_attachments(self, msg, download_folder="/tmp"):
    att_paths = []

    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()
        # Don't print
        # print(filename)
        att_path = os.path.join(download_folder, filename)

        if not os.path.isfile(att_path):
            # Use a context manager for robustness
            with open(att_path, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
            # Then you don't need to explicitly close
            # fp.close()
        # Append this one to the list we are collecting
        att_paths.append(att_path)

    # We are done looping and have processed all attachments now
    # Return the list of file names
    return att_paths
See the inline comments for explanations of what I changed and why.
In general, avoid print()ing stuff from inside a worker function; either use logging to print diagnostics in a way that the caller can control, or just return the information and let the caller decide whether or not to present it to the user.
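As a rough illustration of the logging suggestion, here is a minimal sketch of the same function, written as a standalone function (without self) that reports through a module-level logger instead of print(). The logger name, the choice of log levels, and the skip for parts that have no filename are assumptions added for this example; they are not part of the original code.

```python
import logging
import os

# Module-level logger; the caller decides whether and where messages appear.
logger = logging.getLogger(__name__)


def save_attachments(msg, download_folder="/tmp"):
    """Save every part that has a Content-Disposition and return the paths."""
    att_paths = []
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        if filename is None:
            # Assumption for this sketch: skip parts without a filename
            # instead of failing in os.path.join().
            logger.debug("Skipping a part without a filename")
            continue
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            with open(att_path, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
            logger.info("Saved attachment to %s", att_path)
        att_paths.append(att_path)
    return att_paths
```

A caller who wants to see the messages can enable them with, for example, logging.basicConfig(level=logging.INFO); without any logging configuration the function stays quiet and only returns the list.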
Not all MIME parts have a Content-Disposition:; in fact, I would expect this to miss the majority of attachments, and possibly extract some inline parts. A better approach is probably to look whether the part has Content-Disposition: attachment and otherwise proceed to extract if either there is no Content-Disposition: or the Content-Type: is not either text/plain or text/html. Perhaps see also What are the "parts" in a multipart email?
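The selection rule described above could be sketched roughly as follows. This is one possible reading of it, not a definitive implementation: a part counts as an attachment if it is explicitly marked Content-Disposition: attachment, and otherwise only if its Content-Type is neither text/plain nor text/html. The helper names looks_like_attachment and attachment_parts are hypothetical, and a more permissive reading (extract every part that has no Content-Disposition at all) would also pick up plain body text.

```python
from email.message import Message

# Content types that usually carry the message body rather than an attachment.
BODY_TYPES = {"text/plain", "text/html"}


def looks_like_attachment(part: Message) -> bool:
    """Heuristic (an assumption of this sketch) for 'is this part an attachment?'."""
    if part.get_content_maintype() == "multipart":
        # Containers only hold other parts; they have no payload of their own.
        return False
    # get_content_disposition() returns 'attachment', 'inline', or None.
    if part.get_content_disposition() == "attachment":
        return True
    # No explicit attachment marker: keep anything that is not a text body.
    return part.get_content_type() not in BODY_TYPES


def attachment_parts(msg: Message) -> list:
    """Return all parts of msg that the heuristic considers attachments."""
    return [part for part in msg.walk() if looks_like_attachment(part)]
```

With this in place, the loop in save_attachments could iterate over attachment_parts(msg) instead of testing part.get('Content-Disposition') directly. Parts selected this way may have no filename, so the caller would still need a fallback name for them.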
Upvotes: 0