Shawn D.

Reputation: 8125

Are binary characters legal in MIME headers?

I work on a server that processes email, and as part of that, we do some MIME parsing/encoding. I recently ran into a message that is otherwise valid but contains a Latin-1 character in a MIME header. Someone sent a message to multiple recipients, and one of the addresses they entered contained a Latin-1 character. The SMTP envelope contains only the valid recipients, but the To line still contains the invalid address as a raw, unencoded 8-bit string.

It was my impression that this is illegal, and that MIME headers are required to be 7-bit. 8-bit values in MIME headers have to be encoded in the form

=?charset?encoding?encoded text?=
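For illustration, here is a minimal sketch of producing and reading that form with Python's standard `email.header` module. The helper names `encode_header_text` and `decode_header_text` are made up for this example, and ISO-8859-1 is assumed as the charset because the character in question is Latin-1.

```python
from email.header import Header, decode_header, make_header


def encode_header_text(text: str, charset: str = "iso-8859-1") -> str:
    """Return `text` as an RFC 2047 encoded-word (hypothetical helper)."""
    # Header picks the transfer encoding (Q or B) for the given charset
    # and emits a 7-bit =?charset?encoding?encoded text?= string.
    return Header(text, charset).encode()


def decode_header_text(value: str) -> str:
    """Decode any encoded-words in `value` back to a Unicode string."""
    return str(make_header(decode_header(value)))


encoded = encode_header_text("changé")
print(encoded)                      # 7-bit safe header text
print(decode_header_text(encoded))  # round-trips to the original string
```

This only covers unstructured text such as a Subject or a display name; it says nothing about whether the form is permitted inside an address.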

The header in question is something like this:

To: <changé[email protected]>, <[email protected]>

My question is: Is this valid MIME and I just don't know about it?

Upvotes: 1

Views: 1080

Answers (3)

james.garriss

Reputation: 13397

Email addresses like

changé[email protected]

are perfectly legal if the characters are encoded in UTF-8 and if the server supports SMTPUTF8, an extension to SMTP. The server advertises support by responding to EHLO with the SMTPUTF8 keyword:

250-SMTPUTF8

The client utilizes the extension by adding the SMTPUTF8 parameter on the MAIL command:

MAIL FROM:<changé[email protected]> SMTPUTF8
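The negotiation above can be sketched roughly as follows. This is an illustration only: `supports_smtputf8` and `mail_from_command` are hypothetical helpers, the EHLO reply is a made-up sample, and a real client would also need to handle the rest of RFC 6531.

```python
def supports_smtputf8(ehlo_reply: str) -> bool:
    """Return True if a raw EHLO reply advertises the SMTPUTF8 keyword."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD [params]" or "250 KEYWORD [params]".
        if not line.startswith("250") or len(line) < 5:
            continue
        parts = line[4:].split()
        if parts and parts[0].upper() == "SMTPUTF8":
            return True
    return False


def mail_from_command(sender: str, use_smtputf8: bool) -> str:
    """Build the MAIL command, adding the SMTPUTF8 parameter when negotiated."""
    if use_smtputf8:
        return f"MAIL FROM:<{sender}> SMTPUTF8"
    # Without the extension the address has to be plain ASCII;
    # this raises UnicodeEncodeError otherwise.
    sender.encode("ascii")
    return f"MAIL FROM:<{sender}>"


# Made-up server reply, for illustration only.
sample_reply = "250-mail.example.com Hello\r\n250-8BITMIME\r\n250-SMTPUTF8\r\n250 HELP"
print(supports_smtputf8(sample_reply))
print(mail_from_command("changé@example.com", supports_smtputf8(sample_reply)))
```

If you use Python's `smtplib`, the equivalent check after `ehlo()` is `has_extn("smtputf8")`.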

Sadly, there is very little support for this extension at this time.

See RFC 6531 for more info: https://www.rfc-editor.org/rfc/rfc6531

Upvotes: 0

Ignacio Vazquez-Abrams

Reputation: 798636

From RFC 2822, Internet Message Format, section 2.2, Header Fields:

Header fields are lines composed of a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of any US-ASCII characters, except for CR and LF. However, a field body may contain CRLF when used in header "folding" and "unfolding" as described in section 2.2.3. All field bodies MUST conform to the syntax described in sections 3 and 4 of this standard.

Therefore, any non-ASCII characters are illegal.
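A server could flag such headers with a check along these lines. This is a rough sketch of the two rules quoted above only, assuming the header line has already been unfolded and stripped of its terminating CRLF; `check_header_line` is a hypothetical name, and it does not validate the field body syntax from sections 3 and 4.

```python
from typing import List


def check_header_line(raw: bytes) -> List[str]:
    """Return a list of problems found in one unfolded header line."""
    name, sep, body = raw.partition(b":")
    if not sep:
        return ["no colon separating field name from field body"]

    problems = []
    # Field name: printable US-ASCII (33..126); the colon is excluded
    # by construction because we split on the first one.
    if not name or any(b < 33 or b > 126 for b in name):
        problems.append("field name is not printable US-ASCII")
    # Field body: US-ASCII only, and no bare CR or LF.
    for b in sorted(set(body)):
        if b > 127:
            problems.append(f"field body contains non-ASCII byte 0x{b:02x}")
        elif b in (0x0D, 0x0A):
            problems.append(f"field body contains bare control byte 0x{b:02x}")
    return problems


# A Latin-1 encoded header similar to the one in the question
# (example.com is a placeholder domain).
print(check_header_line("To: <changé@example.com>".encode("latin-1")))
print(check_header_line(b"To: <user@example.com>"))
```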

Upvotes: 3

Andrey

Reputation: 60065

RFC 822 says:

 address     =  mailbox                      ; one addressee
 mailbox     =  addr-spec                    ; simple address
 addr-spec   =  local-part "@" domain        ; global address
 local-part  =  word *("." word)             ; uninterpreted
 word        =  atom / quoted-string     
 atom        =  1*<any CHAR except specials, SPACE and CTLs>
 CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)

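Got it? The local-part must be ASCII. Note that the =?charset?encoding?encoded text?= form (an RFC 2047 encoded-word) is not a "quoted-string" and is not allowed inside the addr-spec itself; it can only be used in the display name that precedes the <address>.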
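As a minimal sketch of how the =?charset?encoding?encoded text?= form is applied in practice, assuming Python's standard `email` package: `formataddr` encodes a non-ASCII display name as an encoded-word, but raises an error if the address itself is not ASCII. The wrapper `build_to_value` and the example.com addresses are made up for illustration.

```python
from email.header import decode_header, make_header
from email.utils import formataddr


def build_to_value(display_name: str, address: str) -> str:
    """Return a header-safe 'Name <address>' string (hypothetical wrapper)."""
    # formataddr encodes a non-ASCII display name as an encoded-word and
    # raises UnicodeEncodeError when the address contains non-ASCII text.
    return formataddr((display_name, address))


to_value = build_to_value("Changé", "change@example.com")
print(to_value)                                   # 7-bit safe header value
print(str(make_header(decode_header(to_value))))  # human-readable form

try:
    build_to_value("Changé", "changé@example.com")
except UnicodeEncodeError as exc:
    print("address rejected:", exc)
```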

Upvotes: 1
