Reputation: 40908
email.utils.parseaddr doesn't seem to handle names given in "lastname, firstname" format (a format that is common in email metadata).
Example:
>>> import email.utils
>>> email.utils.parseaddr('Joe A. Smith <[email protected]>') # OK
('Joe A. Smith', '[email protected]')
>>> email.utils.parseaddr('Smith, Joe A. <[email protected]>') # Fails
('', 'Smith')
Is this by design? The email package purports to follow RFC 2822. The spec defines the angle-bracketed part of the string as
angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
But it's unclear to me what can constitute "CFWS". Is the return value ('', 'Smith') compliant with the RFC?
Version info:
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=6, micro=6, releaselevel='final', serial=0)
Upvotes: 0
Views: 323
Reputation: 57590
As defined in section 3.2.3 of the RFC, CFWS is comments and folding whitespace, so it does not apply here. The definitions you want are scattered throughout the grammar:
name-addr = [display-name] angle-addr
display-name = phrase
phrase = 1*word / obs-phrase
word = atom / quoted-string
atom = [CFWS] 1*atext [CFWS]
atext = [a bunch of characters not including comma]
obs-phrase = word *(word / "." / CFWS)
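From this, we can see that 'Joe A. Smith <[email protected]>' is valid because Joe A. Smith is an obs-phrase, but 'Smith, Joe A. <[email protected]>' is not valid because commas aren't allowed in an atom or obs-phrase. An unquoted comma is instead the separator between addresses in a list, which is why the parser stops at Smith. To keep the comma in the display name, you must use a quoted-string: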
>>> email.utils.parseaddr('"Smith, Joe A." <[email protected]>')
('Smith, Joe A.', '[email protected]')
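If you are generating the header yourself, a minimal sketch of one way to get the quoting right is to let email.utils.formataddr build the string rather than concatenating it by hand. The address joe@example.com is a placeholder assumed for illustration, since the original address is not shown above. The getaddresses call is included only to show how the unquoted comma splits the input into two entries.

```python
from email.utils import formataddr, getaddresses, parseaddr

# Placeholder address (an assumption; not the address from the question).
ADDRESS = "joe@example.com"

# Unquoted: the comma is read as a separator between two mailboxes,
# so the display name is split at the comma.
unquoted = "Smith, Joe A. <" + ADDRESS + ">"
print(getaddresses([unquoted]))

# formataddr() wraps the display name in double quotes when it
# contains special characters such as "," or ".".
header_value = formataddr(("Smith, Joe A.", ADDRESS))
print(header_value)  # "Smith, Joe A." <joe@example.com>

# The quoted form round-trips through parseaddr().
print(parseaddr(header_value))  # ('Smith, Joe A.', 'joe@example.com')
```

This only helps when you control how the header is written. For headers that already arrive unquoted, the parser has no way to tell a "lastname, firstname" display name from a list of two addresses.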
Upvotes: 4