Reputation: 31
I want to extract a table from the body of a .msg file with Python. I can get the body content, but I can't separate the table from the rest of the body. I need the table on its own, for example as a dataframe.
import os
import win32com.client

msg_dir = r"C:\Users\Murilo\Desktop\Emails\030"
# Create the MAPI namespace once instead of once per file
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
for file in os.listdir(msg_dir):
    if file.endswith(".msg"):
        msg = outlook.OpenSharedItem(os.path.join(msg_dir, file))
        print(msg.Body)  # plain-text body; the table is not separated here
I need only the table contained in the body, not the whole body.
Upvotes: 2
Views: 7212
Reputation: 49405
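The Outlook object model provides three main ways of working with item bodies: the Body property (plain text), the HTMLBody property (HTML markup), and the Word editor, where Inspector.WordEditor returns a Word Document object representing the message body.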
See Chapter 17: Working with Item Bodies for more information.
But I think the easiest and cleanest way is to use the Word object model. For how to work with the Word object model and extract table content with it, see the post "How to read contents of an Table in MS-Word file Using Python?".
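A minimal sketch of the Word object model route, assuming Windows with Outlook installed and a table laid out as a regular grid (merged cells make `Cell(r, c)` raise). The helper names `clean_cell_text`, `read_word_table` and `tables_from_msg` are hypothetical, not part of any library; in some setups the inspector may have to be displayed before `WordEditor` becomes available.

```python
def clean_cell_text(text):
    """Strip the end-of-cell marker (CR + BEL) that Word appends to cell text."""
    return text.rstrip("\r\x07").strip()


def read_word_table(table):
    """Return a Word table as a list of rows, each a list of cell strings.

    Assumes a regular grid: merged cells make table.Cell(r, c) raise.
    """
    rows = []
    for r in range(1, table.Rows.Count + 1):          # Word collections are 1-based
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows


def tables_from_msg(path):
    """Open a .msg file through Outlook and read every table in its body."""
    import win32com.client  # imported here: requires Windows and Outlook

    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    msg = outlook.OpenSharedItem(path)
    doc = msg.GetInspector.WordEditor  # Word Document representing the body
    return [read_word_table(doc.Tables(i)) for i in range(1, doc.Tables.Count + 1)]
```

Each returned table is a plain list of rows, so if the first row holds the headers it can be turned into a dataframe with something like `pandas.DataFrame(rows[1:], columns=rows[0])`.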
Upvotes: 0
Reputation: 66255
If it is an HTML table, use MailItem.HTMLBody (instead of the plain text Body) and extract the table from HTML.
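One way to do the extraction step, sketched with only the standard library. `TableParser` and `extract_tables` are hypothetical names, and the sketch assumes the tables are not nested (Outlook-generated HTML sometimes nests tables for layout, which this does not handle). With the question's code, the input would be `msg.HTMLBody` instead of the sample string.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> as a list of rows, each a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []    # one list of rows per <table> seen
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_tables(html):
    """Return all tables found in an HTML string (no nested-table support)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.tables


sample = (
    "<html><body><p>Hello</p>"
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Apple</td><td>3</td></tr></table>"
    "<p>Bye</p></body></html>"
)
tables = extract_tables(sample)
print(tables[0])  # [['Item', 'Qty'], ['Apple', '3']]
```

Text outside the table ("Hello", "Bye") is ignored. If pandas is installed, `pandas.read_html` is an alternative that returns dataframes directly, at the cost of an extra parser dependency such as lxml.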
Upvotes: 2
Reputation: 46
I would look at the extract_msg library. It lets you open a .msg file directly in Python, without going through Outlook, and read its body, from which you can then extract the table.
import extract_msg

msg = extract_msg.Message(fileLoc)  # fileLoc: path to the .msg file
msg_message = msg.body  # plain-text body
content = 'Body: {}'.format(msg_message)
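Since the plain-text body loses the table markup, the HTML body is usually the better starting point. The sketch below assumes an extract_msg version that exposes the `htmlBody` property (returned as bytes, or None when the message has no HTML body); `decode_html_body` and `html_from_msg` are hypothetical helper names, and UTF-8 is an assumed default encoding.

```python
def decode_html_body(raw, encoding="utf-8"):
    """Turn the raw HTML body (bytes, str or None) into a str."""
    if raw is None:
        return ""
    if isinstance(raw, bytes):
        # Assumption: the body is UTF-8; undecodable bytes are replaced
        return raw.decode(encoding, errors="replace")
    return raw


def html_from_msg(path):
    """Read the HTML body of a .msg file without Outlook."""
    import extract_msg  # third-party package, imported here so it stays optional

    msg = extract_msg.Message(path)
    try:
        return decode_html_body(msg.htmlBody)
    finally:
        msg.close()
```

The resulting string can be handed to any HTML table parser to pull out the table rows.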
Upvotes: 0