Rahul Vaidya

Reputation: 319

How to parse email body from Outlook into a Python DataFrame

My objective is to parse email bodies from Outlook and store them in a pandas DataFrame, then use regex to extract specific values from that DataFrame and insert them into an Oracle database. I am done with the regex and the Oracle script, but I am not able to load the Outlook emails into a DataFrame. Can anyone please correct me? Below is the script.

import win32com.client
import pandas as pd
from bs4 import BeautifulSoup
from pprint import pprint
from datetime import datetime, timedelta

outlook = win32com.client.gencache.EnsureDispatch("Outlook.Application")
mapi = outlook.GetNamespace("MAPI")
inbox = mapi.Folders['[email protected]'].Folders['Inbox'].Folders['Important']
Mail_Messages = inbox.Items
Mail_Messages = Mail_Messages.Restrict("[Subject] = 'SGPSBSH Index Level*'")
received_dt = datetime.now() - timedelta(days=1)

for mail in Mail_Messages:
    receivedtime = mail.ReceivedTime.strftime('%Y-%m-%d %H:%M:%S')
    body = mail.HTMLBody
    html_body = BeautifulSoup(body, "lxml")
    print(Mail_Messages.body)

SAMPLE EMAIL (screenshot not included)

Upvotes: 1

Views: 1912

Answers (1)

Eugene Astafiev

Reputation: 49405

First of all, I've noticed the following line of code:

inbox = mapi.Folders['[email protected]'].Folders['Inbox'].Folders['Important']

Instead of indexing into the Folders collection by name, use the NameSpace.GetDefaultFolder method, which returns a Folder object that represents the default folder of the requested type for the current profile; for example, it returns the default Inbox folder for the user who is currently logged on.

If you need to get the Inbox folder for a specific store in Outlook you may consider using the Store.GetDefaultFolder method instead. This method is similar to the GetDefaultFolder method of the NameSpace object. The difference is that this method gets the default folder on the delivery store that is associated with the account, whereas NameSpace.GetDefaultFolder returns the default folder on the default store for the current profile.
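
In Python, the two lookups described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (default_inbox, inbox_for_store, connect) are hypothetical, the store is matched on its DisplayName as an assumption about how the mailbox is identified, and connect() only works on Windows with Outlook and pywin32 installed.

```python
OL_FOLDER_INBOX = 6  # numeric value of OlDefaultFolders.olFolderInbox


def default_inbox(namespace):
    """Inbox on the default store of the current profile."""
    return namespace.GetDefaultFolder(OL_FOLDER_INBOX)


def inbox_for_store(namespace, store_name):
    """Inbox on the store whose DisplayName equals store_name (hypothetical helper)."""
    for store in namespace.Stores:
        if store.DisplayName == store_name:
            return store.GetDefaultFolder(OL_FOLDER_INBOX)
    raise LookupError(f"no store named {store_name!r}")


def connect():
    """Return the MAPI namespace; requires Windows, Outlook, and pywin32."""
    # Imported lazily so the helpers above can be used without pywin32.
    import win32com.client

    outlook = win32com.client.Dispatch("Outlook.Application")
    return outlook.GetNamespace("MAPI")
```

With these helpers, the subfolder from the question would be reached with something like inbox_for_store(connect(), store_name).Folders["Important"], where store_name is the mailbox name as shown in Outlook.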

The Outlook object model supports three main ways of dealing with the message bodies:

  1. The Body property returns or sets a string representing the clear-text body of the Outlook item.

  2. The HTMLBody property of the MailItem class returns or sets a string representing the HTML body of the specified item. Setting the HTMLBody property will always update the Body property immediately. For example:

     Sub CreateHTMLMail() 
       'Creates a new e-mail item and modifies its properties. 
       Dim objMail As Outlook.MailItem 
       'Create e-mail item 
       Set objMail = Application.CreateItem(olMailItem) 
       With objMail 
        'Set body format to HTML 
        .BodyFormat = olFormatHTML 
        .HTMLBody = "<HTML><BODY>Enter the message <a href=""http://google.com"">text</a> here. </BODY></HTML>"
        .Display 
       End With 
     End Sub
    
  3. The Word object model can be used for dealing with message bodies. See Chapter 17: Working with Item Bodies for more information.
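
If you read HTMLBody rather than Body from Python, the markup has to be stripped before running a regex over the text. The sketch below uses the standard library's html.parser in place of the BeautifulSoup call from the question, purely to stay dependency-free; the names _TextExtractor and html_to_text are hypothetical, and joining text nodes with spaces is a simplification that can split a word wrapped in inline tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <style> and <script>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse whitespace left behind by the markup.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string such as mail.HTMLBody."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Inside the loop from the question this would be called as html_to_text(mail.HTMLBody).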

Which approach to use is up to you.
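
For the DataFrame itself, one possible approach is to collect one dict per message from the Body property and hand the list to pandas. The sketch below assumes that approach: messages_to_records and to_dataframe are hypothetical helper names, the column names are arbitrary, and the subject filter is done in Python with a prefix check rather than through Restrict.

```python
OL_MAIL = 43  # numeric value of OlObjectClass.olMail


def messages_to_records(items, subject_prefix=""):
    """Turn Outlook items into plain dicts, one per mail message."""
    records = []
    for item in items:
        # A folder can also hold meeting requests, reports, and so on,
        # which do not have the same properties as a MailItem.
        if getattr(item, "Class", None) != OL_MAIL:
            continue
        if not item.Subject.startswith(subject_prefix):
            continue
        records.append({
            "received": item.ReceivedTime.strftime("%Y-%m-%d %H:%M:%S"),
            "subject": item.Subject,
            "body": item.Body,  # clear-text body
        })
    return records


def to_dataframe(records):
    """Build a DataFrame with one row per message."""
    import pandas as pd  # imported lazily; only needed for this step

    return pd.DataFrame(records, columns=["received", "subject", "body"])
```

With the inbox object from the question, df = to_dataframe(messages_to_records(inbox.Items, "SGPSBSH Index Level")) gives one row per matching email, and your existing regex can then be applied to the body column, for example with df["body"].str.extract.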

Upvotes: 1
