Duke Dougal

Reputation: 26336

Python 3.4 email ContentManager - how to use?

I may get slammed because this question is too broad, but I am going to ask anyway, because what else can I do? Digging through the Python source code should surely earn me enough "good effort" points to warrant some help?

I am trying to use Python 3.4's new email content manager http://docs.python.org/dev/library/email.contentmanager.html#content-manager-instances

It is my understanding that this should allow me to read an email message and then access all the header fields and the body as UTF-8, without going through the painful process of decoding from whatever weird encoding back into clean UTF-8. I understand it also handles parsing of date headers and email address headers, generally making life easier for reading emails in Python. Great stuff, very interesting.

However, I am a beginner programmer, and there are no examples in the current documentation that start from the very beginning. I need a simple example showing how to read an email file and, using the new email content manager, read back the header fields, address fields and body.

I have dug into the Python 3.4 source code and looked at the tests for the email content manager. I will admit to being sufficiently amateurish that I was too confused to glean enough from the tests to start writing my own simple example.

So, is anyone willing to help with a simple example of how to use the Python 3.4 email content manager to read the header fields and body and address fields of an email?

Thanks.

Upvotes: 8

Views: 2418

Answers (2)

cfi

Reputation: 11300

If you have an email in a file and want to read it into Python, the email.parser module is probably what you should look at first. Like Brandon, I don't quite see the need for using the contentmanager directly, but maybe your question is too broad and you need to help me understand it better.

Code could look like:

import email.parser

filename = 'your_file_here.email.txt'

# Parse the file into a message object
with open(filename, 'r') as fh:
    message = email.parser.Parser().parse(fh)

There are even convenience functions, and the one for your case would be:

import email
# message_from_file() expects an open file object, not a filename
with open('your_file_here.email.txt') as fh:
    message = email.message_from_file(fh)

Then check the docs on email.message to see how to access the message's content. You can check with is_multipart() if it's a single monolithic block of text, or a MIME message consisting of multiple parts. In the latter case, there's walk() to iterate over each part.
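As a rough sketch of the is_multipart() / walk() approach, something like the following should work. The sample message, its addresses and the leaf_parts() helper are made up for illustration and are not part of the email package; the message is parsed from a string here only to keep the example self-contained.

```python
import email

# A small two-part message, invented for this example.
RAW = """\
From: alice@example.com
To: bob@example.com
Subject: Hello
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset="utf-8"

Plain text body.
--BOUNDARY
Content-Type: text/html; charset="utf-8"

<p>HTML body.</p>
--BOUNDARY--
"""

message = email.message_from_string(RAW)


def leaf_parts(msg):
    """Return (content_type, text) for every non-container part (hypothetical helper)."""
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            # Containers only hold other parts; they have no text of their own.
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "ascii"
        parts.append((part.get_content_type(),
                      payload.decode(charset, errors="replace")))
    return parts


parts = leaf_parts(message)
for content_type, text in parts:
    print(content_type, repr(text))
```

Because walk() also yields the message itself, the same loop handles a plain single-part message without a special case.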

Upvotes: 0

Brandon Rhodes

Reputation: 89454

First: the “address fields” in an email are in fact simply headers whose names have been agreed upon in standards, like To and From. So all you need are the email headers and body and you are done.

Given a modern, contentmanager-powered EmailMessage instance, which is what Python 3.4 returns when you specify a policy (like email.policy.default) while reading in an email message, you can access its auto-decoded headers by treating it like a Python dictionary, and its body with the get_body() call. Here is an example script I wrote that does both maneuvers in a safe and standard way:

https://github.com/brandon-rhodes/fopnp/blob/m/py3/chapter12/display_email.py

Behind the scenes, the policy is what is really in charge of what happens to both headers and content — with the default policy automatically subjecting headers to the encoding and decoding functions in email.utils, and content to the logic you asked about that is inside of contentmanager.

But as the caller you usually will not need to know the behind-the-scenes magic, because headers will “just work” and content can be easily accessed through the methods illustrated in the above script.
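For readers who cannot follow the link, here is a minimal sketch of the same idea. It is not the linked script: the sample message, its addresses and the variable names are invented for illustration, and the message is read from bytes in memory rather than from a file.

```python
import email
from email import policy

# An invented message with an encoded Subject and a quoted-printable body.
RAW = (
    b"From: Alice Example <alice@example.com>\n"
    b"To: Bob <bob@example.com>, carol@example.com\n"
    b"Subject: =?utf-8?q?Caf=C3=A9_menu?=\n"
    b"Date: Mon, 03 Mar 2014 10:15:00 +0000\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Today: cr=C3=A8me br=C3=BBl=C3=A9e\n"
)

# Passing a policy is what selects the newer API.
msg = email.message_from_bytes(RAW, policy=policy.default)

# Headers behave like dictionary entries and arrive already decoded.
subject = msg["Subject"]
recipients = [address.addr_spec for address in msg["To"].addresses]
sent = msg["Date"].datetime

# get_body() picks the best matching part; get_content() returns decoded text.
body = msg.get_body(preferencelist=("plain", "html"))
text = body.get_content() if body is not None else ""

# For contrast: without a policy, the legacy API returns the raw header value.
legacy_subject = email.message_from_bytes(RAW)["Subject"]

print("Subject:", subject)
print("To:", recipients)
print("Date:", sent.isoformat())
print("Body:", text.strip())
print("Legacy subject:", legacy_subject)
```

To read from disk instead, email.message_from_binary_file() accepts a file opened in 'rb' mode and takes the same policy argument.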

Upvotes: 4
