Reputation: 15
I am writing a script that extracts particular fields (Subject, Date, Sender) from saved Outlook messages (.msg extension), and I want to write the data to a csv file, one line per message.
The script should go through all the files with the .msg extension in a folder and extract the data from each. This is what I have come up with so far.
This code creates the csv file, but it writes the same data from a single email X times instead of moving on to the next message.
import os
import glob
import csv
import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
files = glob.glob('PATH_TO_FILES\\*.msg')

for file in files:
    msg = outlook.OpenSharedItem(file)
    #print(file)
    #with open(file) as f:
    #msg=f.read()
    #print(msg)
    with open(r'Email.csv', mode='w') as file:
        fieldnames = ['Subject', 'Date', 'Sender']
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        writer.writeheader()
        #for f in os.listdir('.'):
        for f in files:
            #if not f.endswith('.msg'):
            #continue
            #msg = msg.Message(f)
            msg_sender = msg.SenderName
            msg_date = msg.SentOn
            msg_subj = msg.Subject
            #msg_message = msg.Body
            writer.writerow({'Subject': msg_subj, 'Date': msg_date, 'Sender': msg_sender})
Upvotes: 1
Views: 936
Reputation: 149175
It is a rather vicious mistake...
Just look at your structure:
for file in files:
    msg = outlook.OpenSharedItem(file)
    with open(r'Email.csv', mode='w') as file:
        for f in files:
            # process msg
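and follow what happens:
For each msg file, the outer loop loads that file into msg and then opens the csv file in 'w' mode, erasing any previous data. The inner loop then writes one row per file, but always from that same msg.
So you have 2 levels of loop over the msg files, and each iteration of the outer one resets the csv file. In the end, only the last iteration matters, and it processes the same last file n times.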
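The effect of reopening a file in 'w' mode inside an outer loop can be reproduced without Outlook. The following is only a minimal sketch of the same loop shape: the strings in `names` stand in for the messages, and `demo.csv` is a hypothetical temporary file, not anything from the question.

```python
import os
import tempfile

# Hypothetical stand-ins for the .msg files; no Outlook is involved here.
names = ["first", "second", "third"]
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

for current in names:                      # outer loop: one pass per "message"
    with open(path, mode="w") as out:      # 'w' truncates the file on every pass
        for _ in names:                    # inner loop never changes `current`
            out.write(current + "\n")

with open(path) as f:
    lines = f.read().splitlines()

print(lines)  # ['third', 'third', 'third']
```

Only the rows from the final outer pass survive, and all of them come from the same item.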
How to fix: just loop once over the files, after opening the csv file:
with open(r'Email.csv', mode='w') as file:
    for f in files:
        msg = outlook.OpenSharedItem(f)
        # process msg
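Putting the fix together with the field extraction from the question, a fuller version might look like the sketch below. The function names `export_messages` and `export_from_outlook` and the `open_item` parameter are illustrative choices, not part of the original code; passing the opener in as a callable simply lets the loop be tried without Outlook installed. The `newline=''` argument is what the csv module documentation recommends when opening a file for a writer, and the utf-8 encoding is an assumption.

```python
import csv
import glob


def export_messages(paths, open_item, csv_path="Email.csv"):
    """Write one csv row per message and return the number of rows written.

    open_item is any callable that takes a path and returns an object with
    Subject, SentOn and SenderName attributes (hypothetical parameter).
    """
    fieldnames = ["Subject", "Date", "Sender"]
    # Open the csv file once, before looping over the messages.
    with open(csv_path, mode="w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        count = 0
        for path in paths:
            msg = open_item(path)  # load a different message on each pass
            writer.writerow({
                "Subject": msg.Subject,
                "Date": msg.SentOn,
                "Sender": msg.SenderName,
            })
            count += 1
    return count


def export_from_outlook(pattern, csv_path="Email.csv"):
    """Wire the loop to Outlook, using the same calls as in the question."""
    import win32com.client  # imported here so the rest works without pywin32

    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    return export_messages(glob.glob(pattern), outlook.OpenSharedItem, csv_path)
```

With the question's setup, the call would be export_from_outlook('PATH_TO_FILES\\*.msg').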
Upvotes: 1