Murilo Leite

Reputation: 31

How to get a table inside the body of a .msg file

I want to extract a table from the body of a .msg file with Python. I can read the body content, but I need the table on its own, for example as a DataFrame.

The code below prints the whole body, but I can't separate the table from the rest of it:

import os
import win32com.client

# "folder" instead of "dir", which shadows the Python built-in
folder = r"C:\Users\Murilo\Desktop\Emails\030"

# Create the MAPI namespace once instead of on every iteration
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

for name in os.listdir(folder):
    if name.endswith(".msg"):
        msg = outlook.OpenSharedItem(os.path.join(folder, name))
        print(msg.Body)

I need only the table that is in the body, not the whole body.

Upvotes: 2

Views: 7212

Answers (3)

Eugene Astafiev

Reputation: 49405

The Outlook object model provides three main ways for working with item bodies:

  1. Body.
  2. HTMLBody.
  3. The Word editor. The WordEditor property of the Inspector class returns an instance of the Word Document that represents the message body, so you can use the Word object model to do whatever you need with the message body. The Copy and Paste methods of the Document will do the trick.

See Chapter 17: Working with Item Bodies for more information.

I think the easiest and cleanest way is to use the Word object model. For more on working with the Word object model and using it to extract table content, see the post "How to read contents of an Table in MS-Word file Using Python?".
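As a rough sketch of the Word object model route, the helper below walks the Tables collection of a Word Document object and returns the cell text row by row. The function names are hypothetical, and it assumes a regular grid: Cell(row, column) raises an error for tables with merged cells, so those would need extra handling. Word ends each cell's Range.Text with a carriage return plus a bell character, which the helper strips.

```python
def clean_cell_text(text):
    """Strip the end-of-cell marker Word appends to each cell's Range.Text."""
    return text.rstrip("\r\x07").strip()


def word_tables_to_rows(document):
    """Return each table in a Word Document object as a list of rows of text.

    Assumes no merged cells; table.Cell(r, c) fails on irregular tables.
    """
    tables = []
    for table in document.Tables:
        rows = []
        for r in range(1, table.Rows.Count + 1):  # Word indexes from 1
            row = []
            for c in range(1, table.Columns.Count + 1):
                row.append(clean_cell_text(table.Cell(r, c).Range.Text))
            rows.append(row)
        tables.append(rows)
    return tables


# Possible usage on Windows with Outlook installed (not run here):
# import win32com.client
# outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# msg = outlook.OpenSharedItem(r"C:\path\to\message.msg")  # hypothetical path
# document = msg.GetInspector.WordEditor
# print(word_tables_to_rows(document))
```

Each inner list of rows can then be passed to pandas.DataFrame, using the first row as the column names if the table has a header row.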

Upvotes: 0

Dmitry Streblechenko

Reputation: 66255

If it is an HTML table, use MailItem.HTMLBody (instead of the plain text Body) and extract the table from HTML.
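One way to do the extraction step, sketched with only the standard library: the parser below collects every table in an HTML string as a list of rows of cell text. The class and function names are hypothetical, and the sketch assumes the tables are not nested inside one another, which is not always true of HTML produced by Outlook.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML string as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table seen so far
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def extract_tables(html):
    """Return all tables found in html; assumes tables are not nested."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

With the question's code, this would be called as extract_tables(msg.HTMLBody). If pandas and an HTML parser such as lxml are installed, pandas.read_html(io.StringIO(msg.HTMLBody)) is an alternative that returns a list of DataFrames directly.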

Upvotes: 2

toothflower

Reputation: 46

I would look at the extract_msg library. It parses a .msg file directly, without needing Outlook, and gives you access to the message body, from which you can then extract the table.

import extract_msg

msg = extract_msg.Message(fileLoc)  # fileLoc: path to the .msg file
msg_message = msg.body

content = 'Body: {}'.format(msg_message)

Upvotes: 0
