Reputation: 16205
I am trying to hash out how I would create an e-mail parser. I understand technically how to do it, but I cannot figure out implementation details.
The flow: a user sends an e-mail to an address, the mail server receives it, and my app parses it based on subject and content, then drops it in a bucket (an e-mail account or a database) so I can act on it.
So do I use existing mail server software (like Zimbra, which we already have running), or do I create an app that listens on port 25 and does specifically what I need (meaning no mail server software running on this box, etc.)?
My goal here is to build myself a set of organization tools for personal use that run automatically based on what I e-mail myself.
Upvotes: 1
Views: 1018
Reputation: 5201
Writing something to listen on port 25 and act as an SMTP server will be involved and probably overkill for what you want.
I think there are two main options. The first is to leave your existing mail server in place, poll an account on that mail server over IMAP (or POP3) to retrieve the emails, and then process them using a script. Which language you use doesn't really matter: most have libraries for handling IMAP connections and parsing email.
Alternatively, you could look at a service like http://CloudMailin.com that does this for you. It receives the email and forwards it as an HTTP POST, in something like JSON format, to a web app that you create.
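For the web app side, something like the following might be enough to get started. It is only a rough sketch using Python's standard library: the JSON field names (`headers`, `subject`, `plain`) are assumptions for illustration, so check the service's documentation for the real payload layout. `summarize`, `InboundMailHandler` and `serve` are made-up names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload: bytes) -> dict:
    """Pull subject and plain-text body out of a JSON payload.

    The keys "headers", "subject" and "plain" are assumed, not confirmed.
    """
    data = json.loads(payload.decode("utf-8"))
    headers = data.get("headers", {})
    return {
        "subject": headers.get("subject", ""),
        "body": data.get("plain", ""),
    }


class InboundMailHandler(BaseHTTPRequestHandler):
    """Accepts one POST per incoming email and answers 200 or 400."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            mail = summarize(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not valid UTF-8 JSON, or not a JSON object
            self.send_response(400)
            self.end_headers()
            return
        print("received:", mail["subject"])  # replace with real processing
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the handler locally; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), InboundMailHandler).serve_forever()
```

Calling `serve()` starts the listener; the service would then need to be pointed at a publicly reachable URL that ends up at this handler.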
Upvotes: 1
Reputation: 1407
I would go for a Python script that polls the mailbox, run from a cron job. Python lets you access IMAP very easily and has powerful regular expression functions for parsing the email content.
Try something like:
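import email
import imaplib
import re

# Connect over SSL and open the folder to scan
M = imaplib.IMAP4_SSL('imap.gmail.com')
M.login('user', 'pass')
M.select('Imap_folder')

# Compile the pattern once, outside the loop
foo = re.compile(r"your regular expr here", re.MULTILINE)

typ, data = M.search(None, 'ALL')  # every message in the folder
for num in data[0].split():
    typ, msg_data = M.fetch(num, '(RFC822)')
    email_body = msg_data[0][1]  # raw message content, as bytes
    mail = email.message_from_bytes(email_body)  # parse into a mail object
    # Run the regex over the raw text; mail['Subject'] gives the subject header
    res = foo.search(email_body.decode('utf-8', errors='replace'))

M.logout()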
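
To get from there to the "bucket" idea in the question, you could match the subject against a small rule table. This is a minimal sketch, assuming the raw message bytes come from a fetch like the one above; the rule patterns, the bucket names and the `pick_bucket` helper are all made up for illustration.

```python
import re
from email import message_from_bytes, policy

# Hypothetical rules: (bucket name, pattern matched against the subject)
RULES = [
    ("todo", re.compile(r"^\s*todo\b", re.IGNORECASE)),
    ("notes", re.compile(r"^\s*notes?\b", re.IGNORECASE)),
]
DEFAULT_BUCKET = "inbox"


def pick_bucket(raw_message: bytes) -> tuple:
    """Return (bucket, plain-text body) for one raw RFC 822 message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    subject = msg["Subject"] or ""
    # get_body() finds the text/plain part, including inside multipart mail
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    for bucket, pattern in RULES:
        if pattern.search(subject):
            return bucket, body
    return DEFAULT_BUCKET, body
```

Inside the fetch loop you would call `pick_bucket(...)` on the raw message and then write the body to whatever store stands behind that bucket (a database table, another IMAP folder, and so on).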
Upvotes: 0