Saadiq

Reputation: 101

Downloading Email Attachments from Shared Folder - Python

I have the below code to download email attachments based on date sent and email subject criteria:

from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

# Window around today: strictly after yesterday and strictly before tomorrow.
dateLow = date.today() - timedelta(days=1)
dateHigh = date.today() + timedelta(days=1)

max = 2500
for count, message in enumerate(messages):
    if count > max:
        break
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
        attachments = message.Attachments
        num_attach = len([x for x in attachments])
        for x in range(1, num_attach+1):
            attachment = attachments.Item(x)
            attachment.SaveASFile(path + '\\' + str(attachment))

Is there a way to restrict the download to certain attachment types, for example only .csv files?

Additionally, this code was previously run against a public folder; those folders have since been converted to shared folders. Since the change, I have had to increase "max" from 500 to 2500 in order to find the matching emails. Is there any way to speed this up?

Thanks

Upvotes: 4

Views: 1409

Answers (2)

Jin Thakur

Reputation: 2773

I take it that downloading only .csv files is part of your requirement. For the speed problem, the Outlook Items collection has methods you can use. Instead of iterating over messages = inbox.Items directly, keep one reference, items = inbox.Items, call items.GetFirst() to get the first message, and then call

items.GetNext() for each following message. This way only one message is held in memory at a time, and you can keep looping for longer.

Make sure you are referencing the Microsoft Outlook Object Library, version 10 or higher (16.0, for example), so that these methods exist. Here is C# code I used with GetFirst():

Outlook.MailItem oMsg = (Outlook.MailItem)oItems.GetFirst();

//Output some common properties.
Console.WriteLine(oMsg.Subject);
Console.WriteLine(oMsg.SenderName);
Console.WriteLine(oMsg.ReceivedTime);
Console.WriteLine(oMsg.Body);

//Check for attachments.
int AttachCnt = oMsg.Attachments.Count;
Console.WriteLine("Attachments: " + AttachCnt.ToString());

Outlook.MailItem oMsg1 = (Outlook.MailItem)oItems.GetNext();
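
For the Python code in the question, the same GetFirst()/GetNext() pattern might look roughly like the sketch below. The iter_items helper and the FakeItems stand-in are hypothetical names made up for illustration; FakeItems only exists so the snippet runs without Outlook. With a real folder you would pass one reference to the collection (items = inbox.Items), since GetNext() continues from the position held by that particular collection object.

```python
def iter_items(items, limit=None):
    """Yield items one at a time using GetFirst()/GetNext().

    `items` should be a single reference to an Outlook Items collection
    (items = inbox.Items), not a fresh inbox.Items lookup on each call.
    """
    item = items.GetFirst()
    count = 0
    # GetNext() returns None once the end of the collection is reached.
    while item is not None and (limit is None or count < limit):
        yield item
        count += 1
        item = items.GetNext()


class FakeItems:
    """Hypothetical stand-in for an Outlook Items collection (illustration only)."""

    def __init__(self, values):
        self._values = list(values)
        self._pos = 0

    def GetFirst(self):
        self._pos = 0
        return self._values[0] if self._values else None

    def GetNext(self):
        self._pos += 1
        return self._values[self._pos] if self._pos < len(self._values) else None


print(list(iter_items(FakeItems(["a", "b", "c"]), limit=2)))  # ['a', 'b']
```

The limit argument plays the same role as "max" in the question: the loop stops after that many messages.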

Upvotes: 0

alexisdevarennes

Reputation: 5632

Below is a way to specify which file types you want.

List the file extensions you want in attachments_of_interest. An empty list saves every attachment.

from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

# Window around today: strictly after yesterday and strictly before tomorrow.
dateLow = date.today() - timedelta(days=1)
dateHigh = date.today() + timedelta(days=1)

max_n = 2500
attachments_of_interest = ['.csv']

for count, message in enumerate(messages):
    if count > max_n:
        break
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
        attachments = message.Attachments
        # Outlook collections are 1-based.
        for x in range(1, attachments.Count + 1):
            attachment = attachments.Item(x)
            attachment_fname = attachment.FileName
            # splitext keeps the leading dot, so this matches entries such as '.csv'
            file_ending = os.path.splitext(attachment_fname)[1].lower()
            if not attachments_of_interest or file_ending in attachments_of_interest:
                attachment.SaveAsFile(os.path.join(path, attachment_fname))
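
The extension check can also be pulled out into a small function that is easy to try without Outlook. This is only a sketch; wanted_attachment is a made-up helper name. It relies on os.path.splitext, which returns the extension with its leading dot, so the entries in the list should be written as '.csv' rather than 'csv'.

```python
import os


def wanted_attachment(filename, extensions):
    """Return True if filename has one of the given extensions.

    An empty `extensions` collection means "accept everything".
    The comparison is case-insensitive.
    """
    if not extensions:
        return True
    ext = os.path.splitext(filename)[1].lower()
    return ext in {e.lower() for e in extensions}


print(wanted_attachment("report.CSV", [".csv"]))      # True
print(wanted_attachment("report.csv.zip", [".csv"]))  # False
```

Only the last extension counts, so a file such as report.csv.zip is not treated as a .csv file.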

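As for speeding up, you could try a thread pool. Outlook COM objects are tied to the thread that created them, so the worker threads may need their own COM initialisation (pythoncom.CoInitialize()) and this may not work unchanged on every setup: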

from multiprocessing.pool import ThreadPool as Pool
from datetime import date, timedelta
import os
import win32com.client


path = os.path.expanduser("C:\\Users\\xxxx\\Documents\\Projects\\VBA Projects\\VLOOKUP Automation\\Vlookup File Location")
today = date.today()

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders("xxx").Folders.Item("Inbox")
messages = inbox.Items
subject = "xxx"

max_n = 2500
attachments_of_interest = ['.csv']
pool_size = 5

# define worker function before a Pool is instantiated
def worker(message):
    # Window around today: strictly after yesterday and strictly before tomorrow.
    dateLow = date.today() - timedelta(days=1)
    dateHigh = date.today() + timedelta(days=1)
    if subject in message.subject and message.senton.date() > dateLow and message.senton.date() < dateHigh:
        attachments = message.Attachments
        # Outlook collections are 1-based.
        for x in range(1, attachments.Count + 1):
            attachment = attachments.Item(x)
            attachment_fname = attachment.FileName
            # splitext keeps the leading dot, so this matches entries such as '.csv'
            file_ending = os.path.splitext(attachment_fname)[1].lower()
            if not attachments_of_interest or file_ending in attachments_of_interest:
                attachment.SaveAsFile(os.path.join(path, attachment_fname))

pool = Pool(pool_size)

for count, message in enumerate(messages):
    if count > max_n:
        break
    pool.apply_async(worker, (message,))

pool.close()
pool.join()
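
Another option that may help more than threads is to let Outlook do the date filtering, so that Python only sees the handful of messages in the window instead of the first 2500. The sketch below uses Items.Restrict and Items.Sort from the Outlook object model. The helper names build_sent_on_filter and recent_messages are made up for illustration, and the month/day/year date format is an assumption: Outlook reads the date string according to the machine's regional settings, so it may need adjusting.

```python
from datetime import date, timedelta


def build_sent_on_filter(start, end):
    """Build a filter string for items sent on or after `start` and before `end`.

    Assumes a month/day/year regional date format (adjust if yours differs).
    """
    fmt = "%m/%d/%Y"
    return (
        f"[SentOn] >= '{start.strftime(fmt)} 12:00 AM' "
        f"AND [SentOn] < '{end.strftime(fmt)} 12:00 AM'"
    )


def recent_messages(inbox, start, end):
    """Return the items of an Outlook folder sent in the window, newest first.

    `inbox` is expected to be an Outlook folder object such as the one
    obtained in the code above; this function is not called in the demo.
    """
    items = inbox.Items
    restricted = items.Restrict(build_sent_on_filter(start, end))
    restricted.Sort("[SentOn]", True)  # True = descending
    return restricted


today = date.today()
print(build_sent_on_filter(today, today + timedelta(days=1)))
```

The subject check and the attachment loop would then run over the result of recent_messages(inbox, ...) instead of over every item in the folder.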

Upvotes: 2
