I am trying to explore the Enron email dataset in a Python Jupyter notebook, but I am getting an AttributeError. I want to read the emails and convert them to CSV format so that I can then apply ML for sentiment analysis.
import tarfile
import re
from datetime import datetime
from collections import namedtuple, Counter
import pandas as pd
import altair as alt
tar = tarfile.open(r"C:\Users\nikip\Documents\2021\Interview Preparation\sentiment analysis\enron_mail_20150507.tar.gz", "r")
items = tar.getmembers()
Email = namedtuple('Email', 'Date, From, To, Subject, Cc, Bcc, Message')
def get_msg(item_number):
    f = tar.extractfile(items[item_number])
    try:
        date = from_ = to = subject = cc = bcc = message = ''
        in_to = False
        in_message = False
        to = []
        message = []
        item = f.read().decode()
        item = item.replace('\r', '').replace('\t', '')
        lines = item.split('\n')
        for num, line in enumerate(lines):
            if line.startswith('Date:') and not date:
                date = datetime.strptime(' '.join(line.split('Date: ')[1].split()[:-2]), '%a, %d %b %Y %H:%M:%S')
            elif line.startswith('From:') and not from_:
                from_ = line.replace('From:', '').strip()
            elif line.startswith('To:') and not to:
                in_to = True
                to = line.replace('To:', '').replace(',', '').replace(',', '').split()
            elif line.startswith('Subject:') and not subject:
                in_to = False
                subject = line.replace('Subject:', '').strip()
            elif line.startswith('Cc:') and not cc:
                cc = line.replace('Cc:', '').replace(',', '').replace(',', '').split()
            elif line.startswith('Bcc:') and not bcc:
                bcc = line.replace('Bcc:', '').replace(',', '').replace(',', '').split()
            elif in_to:
                to.extend(line.replace(',', '').split())
            elif line.statswith('Subject:') and not subject:
                in_to = False
            elif line.startswith('X-FileName'):
                in_message = True
            elif in_message:
                message.append(line)
        to = '; '.join(to).strip()
        cc = '; '.join(cc).strip()
        bcc = '; '.join(bcc).strip()
        message = ' '.join(message).strip()
        email = Email(date, from_, to, subject, cc, bcc, message)
        return email
    except Exception as e:
        return e
msg = get_msg(3002)
msg.date
I get the following error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-11-e1439579a8e7> in <module>
----> 1 msg.To
AttributeError: 'AttributeError' object has no attribute 'To'
Can someone help? Thanks in advance.
Reputation: 311968
The problem is that you are returning an exception from your get_msg function, which broadly looks like this:
def get_msg(item_number):
    try:
        ...do some stuff...
    except Exception as e:
        return e
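It looks like you're triggering an AttributeError exception somewhere in your code, and you're returning that exception rather than an Email object. When you then access msg.To, the lookup happens on the returned exception, which is why the message reads 'AttributeError' object has no attribute 'To'.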
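Here is a minimal sketch of that behaviour. It uses a hypothetical stand-in function, parse_header, rather than your full get_msg, and assumes the underlying problem is a misspelled method name like the line.statswith call in your loop:

```python
from collections import namedtuple

Email = namedtuple('Email', 'Date, From, To')

def parse_header(line):
    # Hypothetical stand-in for get_msg; it contains the same kind of typo.
    try:
        if line.statswith('To:'):  # typo: should be startswith
            return Email('', '', line[3:].strip())
        return Email('', '', '')
    except Exception as e:
        return e  # the caller receives the exception instead of an Email

msg = parse_header('To: someone@example.com')
print(type(msg).__name__)  # AttributeError
print(msg)                 # the original error message, hidden until now
```

Printing the returned value instead of accessing a field on it is a quick way to see which error was actually swallowed.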
You almost never want an except clause that swallows all exceptions like that, because it hides errors in your code (as we see here). It is generally better practice to catch specific exceptions, or at least to log the error if your code will continue despite the exception.
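As a rough illustration of catching one specific exception and logging it, here is a sketch built around the date parsing from the question. The helper name parse_date and the choice to return None for a bad date are assumptions for the example, not part of your code:

```python
import logging
from datetime import datetime

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger(__name__)

def parse_date(line):
    # Hypothetical helper: parse a "Date:" header, dropping the two trailing
    # timezone fields in the same way as the code in the question.
    raw = ' '.join(line.split('Date: ')[1].split()[:-2])
    try:
        return datetime.strptime(raw, '%a, %d %b %Y %H:%M:%S')
    except ValueError:
        # Only a malformed date is expected here; any other error
        # (for example a typo in a method name) still propagates.
        log.warning('Could not parse date header: %r', line)
        return None

good = parse_date('Date: Mon, 14 May 2001 16:39:00 -0700 (PDT)')
bad = parse_date('Date: not a real date -0700 (PDT)')
```

With this shape, a bad header is recorded and skipped, while a genuine bug still stops the program with a traceback.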
As a first step, I would suggest removing the entire try/except block and getting your code working without it.
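A cut-down sketch of what that could look like, assuming the message text is already available as a string. The function name parse_headers, the reduced field list, and the sample message are made up for illustration:

```python
from collections import namedtuple

Email = namedtuple('Email', 'From, To, Subject')

def parse_headers(text):
    # No try/except: a typo such as line.statswith(...) would now raise
    # immediately, with a traceback pointing at the offending line.
    from_ = subject = ''
    to = []
    in_to = False
    for line in text.replace('\r', '').replace('\t', '').split('\n'):
        if line.startswith('From:') and not from_:
            from_ = line.replace('From:', '').strip()
        elif line.startswith('To:') and not to:
            in_to = True
            to = line.replace('To:', '').replace(',', '').split()
        elif line.startswith('Subject:') and not subject:
            in_to = False
            subject = line.replace('Subject:', '').strip()
        elif in_to:
            # Continuation line of a multi-line To: header.
            to.extend(line.replace(',', '').split())
    return Email(from_, '; '.join(to), subject)

sample = (
    'From: a@example.com\n'
    'To: b@example.com,\n'
    '\tc@example.com\n'
    'Subject: Hello\n'
)
msg = parse_headers(sample)
print(msg.To)
```

Note that namedtuple fields are case-sensitive, so with these field names it is msg.To, not msg.to.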
Upvotes: 1