Reputation: 51807
Say I have an email saved as sample.eml and I would like to get a list of all the recipients on that email. Let's say it looks like this:
From: [email protected]
To: Person Man <[email protected]>, Fredrick Douglas <[email protected]>
Cc: Guido <[email protected]>, FLUFL <[email protected]>
Bcc: [email protected], The Dude <[email protected]>
Subject: Testing email

This isn't a very fancy email, but I'm just trying to prove a point here, OK?
I can stick this in a Python script and parse the email:
from email.parser import BytesParser
from itertools import chain
msg = b'''
From: [email protected]
To: Person Man <[email protected]>, Fredrick Douglas <[email protected]>
Cc: Guido <[email protected]>, FLUFL <[email protected]>
Bcc: [email protected], The Dude <[email protected]>
Subject: Testing email

This isn't a very fancy email, but I'm just trying to prove a point here, OK?
'''.strip()
email = BytesParser().parsebytes(msg)
for recipient in chain(email.get_all('to'), email.get_all('cc'), email.get_all('bcc')):
    print('Recipient is:', repr(recipient))
I would expect to see something like:
Recipient is: 'Person Man <[email protected]>'
Recipient is: 'Fredrick Douglas <[email protected]>'
Recipient is: 'Guido <[email protected]>'
Recipient is: 'FLUFL <[email protected]>'
Recipient is: '[email protected]'
Recipient is: 'The Dude <[email protected]>'
Instead, I get this:
Recipient is: 'Person Man <[email protected]>, Fredrick Douglas <[email protected]>'
Recipient is: 'Guido <[email protected]>, FLUFL <[email protected]>'
Recipient is: '[email protected], The Dude <[email protected]>'
Is there a better way to do this?
Upvotes: 3
Views: 2262
Reputation: 51807
The best way I've found so far involves email.utils.getaddresses:
from email.utils import getaddresses

# `email` and `chain` come from the script in the question
for recipient in getaddresses(
    chain(email.get_all('to', []), email.get_all('cc', []), email.get_all('bcc', []))
):
    print('The recipient is:', recipient)
From the docs on getaddresses:
This method returns a list of 2-tuples of the form returned by parseaddr(). fieldvalues is a sequence of header field values as might be returned by Message.get_all.
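To make the shape of those 2-tuples concrete, here is a minimal self-contained sketch. The message and its example.com addresses are placeholders I made up for illustration, not the ones from the question.

```python
from email.parser import BytesParser
from email.utils import getaddresses

# Hypothetical message; the addresses are placeholders.
raw = (
    b"From: sender@example.com\r\n"
    b"To: Person Man <person@example.com>, Fredrick Douglas <fred@example.com>\r\n"
    b"Cc: guido@example.com\r\n"
    b"Subject: Testing email\r\n"
    b"\r\n"
    b"Body text.\r\n"
)
parsed = BytesParser().parsebytes(raw)

# get_all returns one string per header occurrence; getaddresses splits
# those strings into (display name, address) pairs.
headers = parsed.get_all('to', []) + parsed.get_all('cc', [])
pairs = getaddresses(headers)

for name, address in pairs:
    print(repr(name), repr(address))
```

A bare address with no display name comes back with an empty string as the first element of its tuple.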
get_all will return None if the header is absent, unless you pass in a default. chain would raise a TypeError when it reaches a None, so get_all('to', []) is a good idea.
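A quick way to see the difference, using a made-up message that has a To header but no Cc header (the address is a placeholder):

```python
from email.parser import BytesParser

# Hypothetical message with a To header and no Cc header.
parsed = BytesParser().parsebytes(b"To: person@example.com\r\n\r\nBody\r\n")

missing = parsed.get_all('cc')         # header absent, no default given
defaulted = parsed.get_all('cc', [])   # header absent, default supplied
present = parsed.get_all('to', [])     # header present, default ignored

print(missing)
print(defaulted)
print(present)
```

Only the first call hands back None; the other two always return something iterable, which is what chain needs.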
This approach has the added advantage of properly parsing some very terrible, but entirely valid, email addresses:
msg = b"""
From: [email protected]
To: Person Man <[email protected]>, Fredrick Douglas <[email protected]>
Cc: Guido <[email protected]>, FLUFL <[email protected]> ,"Abc\@def"@example.com ,"Fred Bloggs"@example.com ,"Joe\\Blow"@example.com ,"Abc@def"@example.com ,customer/[email protected] ,\[email protected] ,!def!xyz%[email protected] ,[email protected], much."more\ unusual"@example.com, very.unusual."@"[email protected], very."(),:;<>[]".VERY."very@\\"very"[email protected]
Subject: Testing email

This isn't a very fancy email, but I'm just trying to prove a point here, OK?
""".strip()
Just splitting on a comma wouldn't correctly handle:
very."(),:;<>[]".VERY."very@\\"very"[email protected]
That is an entirely valid email address.
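A simpler case shows the same failure. In this sketch the header value is hypothetical: a quoted display name that contains a comma, which a naive split cuts in half while getaddresses keeps it intact.

```python
from email.utils import getaddresses

# Hypothetical header value; the first display name has a comma inside quotes.
header = '"Man, Person" <person@example.com>, The Dude <dude@example.com>'

# Naive approach: split on every comma, quoted or not.
naive = [part.strip() for part in header.split(',')]

# getaddresses understands the quoting rules and yields two recipients.
parsed_pairs = getaddresses([header])

print(naive)
print(parsed_pairs)
```

The naive split produces three fragments for two recipients, and none of the first two is usable on its own.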
Upvotes: 4