Reputation: 8125
I work on a server that processes email, and as part of that, we do some MIME parsing and encoding. I recently ran into a message that is otherwise valid but contains a Latin-1 character in a MIME header. Someone sent a message to multiple recipients, and one of the addresses contained a Latin-1 character. The SMTP envelope contains only the valid recipients, but the To line still contains the invalid address as a raw, improperly encoded string.
My impression was that this is illegal and that MIME headers are required to be 7-bit, so 8-bit values in MIME headers have to be encoded in the form:
=?charset?encoding?encoded text?=
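As a rough illustration of that form, here is a minimal Python sketch using the standard library's `email.header` module. The display name `Changé` is a made-up example. Note that encoded-words are meant for text such as a display name; RFC 2047 does not allow them inside the address itself, so this does not by itself make a non-ASCII mailbox valid.

```python
from email.header import Header, decode_header

# Hypothetical display name containing a Latin-1 character.
display_name = "Changé"

# Header.encode() emits an RFC 2047 encoded-word for non-ASCII text,
# i.e. something of the shape =?charset?encoding?encoded text?=
encoded = Header(display_name, charset="utf-8").encode()
print(encoded)

# The encoded form is plain 7-bit ASCII and decodes back to the original.
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)
print(decoded)
```

The choice of UTF-8 as the charset is an assumption for the example; `iso-8859-1` would work the same way for Latin-1 input.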
The header in question is something like this:
To: <changé[email protected]>, <[email protected]>
My question is: Is this valid MIME and I just don't know about it?
Upvotes: 1
Views: 1080
Reputation: 13397
Email addresses like
changé[email protected]
are perfectly legal if the characters are encoded in UTF-8 and if the server supports SMTPUTF8, an extension to SMTP. The server advertises support by responding to EHLO with the SMTPUTF8 keyword:
250-SMTPUTF8
The client utilizes the extension by adding the SMTPUTF8 parameter on the MAIL command:
MAIL FROM:<changé[email protected]> SMTPUTF8
Sadly, there is very little support for this extension at this time.
See RFC 6531 for more info: https://www.rfc-editor.org/rfc/rfc6531
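To make the EHLO check concrete, here is a small sketch that scans a server's EHLO reply for the SMTPUTF8 keyword. The function name `supports_smtputf8` and the sample reply are hypothetical, made up for illustration; a real client would read the reply from the connection.

```python
def supports_smtputf8(ehlo_response: str) -> bool:
    """Return True if an EHLO reply advertises the SMTPUTF8 keyword."""
    for line in ehlo_response.splitlines():
        # Reply lines look like "250-KEYWORD ..." or "250 KEYWORD" (final line).
        if len(line) < 4 or not line.startswith("250") or line[3] not in "- ":
            continue
        words = line[4:].split()
        # Extension keywords are case-insensitive.
        if words and words[0].upper() == "SMTPUTF8":
            return True
    return False


# Hypothetical EHLO reply from a server that supports the extension.
sample = "250-mail.example.org\r\n250-8BITMIME\r\n250-SMTPUTF8\r\n250 PIPELINING"
print(supports_smtputf8(sample))
```

If you are using Python's `smtplib`, calling `has_extn("smtputf8")` on the connection after `ehlo()` should give a similar answer without parsing the reply by hand.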
Upvotes: 0
Reputation: 798636
From RFC 2822, Internet Message Format, section 2.2, Header Fields:
Header fields are lines composed of a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of any US-ASCII characters, except for CR and LF. However, a field body may contain CRLF when used in header "folding" and "unfolding" as described in section 2.2.3. All field bodies MUST conform to the syntax described in sections 3 and 4 of this standard.
Therefore, any non-ASCII characters in a header field are illegal under RFC 2822.
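If you want to flag such messages on the server, a check along these lines might be enough. This is only a sketch: the function name `find_non_ascii` and the `example.org` address are invented for illustration, and it assumes you have the raw header line as bytes.

```python
def find_non_ascii(raw_header: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside the US-ASCII range."""
    return [(i, b) for i, b in enumerate(raw_header) if b > 127]


# Hypothetical To header carrying a raw Latin-1 byte (0xE9 for "é").
header = "To: <changé@example.org>".encode("latin-1")
print(find_non_ascii(header))

# A pure ASCII header produces an empty list.
print(find_non_ascii(b"To: <change@example.org>"))
```

An empty result means the line stays within the 7-bit range the quoted section requires; it does not check the rest of the address syntax.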
Upvotes: 3
Reputation: 60065
address = mailbox ; one addressee
mailbox = addr-spec ; simple address
addr-spec = local-part "@" domain ; global address
local-part = word *("." word) ; uninterpreted
word = atom / quoted-string
atom = 1*<any CHAR except specials, SPACE and CTLs>
CHAR = <any ASCII character> ; ( 0-177, 0.-127.)
Got it? Your option is "quoted-string": =?charset?encoding?encoded text?=
Upvotes: 1