Jasmine

Reputation: 16205

Creating an e-mail parser as a service?

I am trying to hash out how I would create an e-mail parser. I understand technically how to do it, but I cannot figure out implementation details.

The flow would be: a user sends an e-mail to an address, the mail server receives it, and my app parses it based on subject and content and drops it in a bucket (an e-mail account or a database) so that I can then act on it.

So do I use existing mail server software (like Zimbra, which we already have running), or do I create an app that listens on port 25 and does specifically what I need (meaning no mail server software running on this box)?

My goal here is to create myself a series of organization tools for personal use in an automated way based upon what I e-mail myself.

Upvotes: 1

Views: 1018

Answers (2)

Steve Smith

Reputation: 5201

Writing something to listen on port 25 and act as an SMTP server will be involved and probably overkill for what you want.

I think there are two main options. The first is to leave your existing mail server in place, poll an account on that server over IMAP (or POP3) to retrieve the emails, and then process them with a script. Which language you use matters little, since most languages have libraries for handling IMAP connections and parsing email.

Alternatively, you could look at a service like http://CloudMailin.com that does this for you. It will receive the email and forward it via an HTTP POST, in a format such as JSON, to a web app that you create.
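
As a rough illustration of the web app side, here is a minimal sketch using only the Python standard library. It assumes the service POSTs a JSON object; the "subject" and "body" keys, the names summarize, InboundMailHandler and make_server, and the port are all hypothetical, so check the provider's documentation for the real payload layout.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload_bytes):
    """Pull the fields we care about out of a JSON POST body.

    The "subject" and "body" keys are assumptions for illustration only.
    """
    data = json.loads(payload_bytes.decode("utf-8"))
    return {"subject": data.get("subject", ""), "body": data.get("body", "")}


class InboundMailHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            message = summarize(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not valid UTF-8 JSON, or not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        # Act on the message here (store it, file it, and so on).
        print("received:", message["subject"])
        self.send_response(200)
        self.end_headers()


def make_server(port=8080):
    """Build the server without starting it."""
    return HTTPServer(("", port), InboundMailHandler)
```

To run it, call make_server().serve_forever() and point the service at that URL.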

Upvotes: 1

Davide Vernizzi

Reputation: 1407

I would go for a Python script which polls the mailbox (run from a cron job). Python lets you access IMAP very easily and has powerful regular expression functions to parse the email content.

Try something like:

import email
import imaplib
import re

# Compile the pattern once, outside the loop
pattern = re.compile(r"your regular expr here", re.MULTILINE)

M = imaplib.IMAP4_SSL('imap.gmail.com')
M.login('user', 'pass')
M.select('Imap_folder')

# 'ALL' matches every message in the selected folder
typ, data = M.search(None, 'ALL')

for num in data[0].split():
    typ, msg_data = M.fetch(num, '(RFC822)')
    raw = msg_data[0][1]  # raw message as bytes
    mail = email.message_from_bytes(raw)  # parse into a Message object
    # Decode before matching, since the pattern is a str pattern
    res = pattern.search(raw.decode('utf-8', errors='replace'))

M.logout()
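
Once you have the raw message, sorting it into a bucket by subject could look something like the sketch below. The bucket names, the RULES table and the route helper are hypothetical and only illustrate the idea; the parsing itself uses the standard library email package with policy.default.

```python
import re
from email import message_from_bytes, policy

# Hypothetical routing rules: bucket name -> pattern matched against the subject.
RULES = {
    "todo": re.compile(r"^\s*todo\b", re.IGNORECASE),
    "note": re.compile(r"^\s*note\b", re.IGNORECASE),
}


def plain_text_body(msg):
    """Return the text/plain part of the message, or '' if there is none."""
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


def route(raw_bytes):
    """Parse a raw message and return (bucket, subject, body)."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    subject = str(msg["Subject"] or "")
    for bucket, pattern in RULES.items():
        if pattern.search(subject):
            return bucket, subject, plain_text_body(msg)
    # Nothing matched: fall back to a default bucket.
    return "inbox", subject, plain_text_body(msg)


raw = (
    b"From: me@example.com\r\n"
    b"To: me@example.com\r\n"
    b"Subject: TODO: renew passport\r\n"
    b"\r\n"
    b"Before the end of March.\r\n"
)
bucket, subject, body = route(raw)
```

In the polling loop, you would pass the bytes returned by fetch to route and then store the result wherever that bucket lives (a database table, another IMAP folder, and so on).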

Upvotes: 0
