Adrian Maire

Reputation: 14815

Is the charset component mandatory in the HTTP content-type header?

An HTTP request might have the Content-Type header:

GET / HTTP/1.1
...
Content-Type: text/xml; charset=utf-8
...

Are there circumstances where the charset component is mandatory? If so, when?

Examples of possible Content-Type headers, not necessarily correct:

Content-Type: text/xml
Content-Type: charset=utf-8
Content-Type: text/xml; charset=utf8
Content-Type:

Standard info:

EDIT NOTE: It seems this reference is obsolete; RFC 7231 is the correct version now, as suggested by @RobbyCornelissen.

The standard says rather little about this (or maybe I am looking in the wrong place): https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

14.17 Content-Type

The Content-Type entity-header field indicates the media type of the entity-body sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET.

   Content-Type   = "Content-Type" ":" media-type

Media types are defined in section 3.7. An example of the field is

   Content-Type: text/html; charset=ISO-8859-4

Further discussion of methods for identifying the media type of an entity is provided in section 7.2.1.

Upvotes: 6

Views: 3982

Answers (1)

CodeCaster

Reputation: 151586

See RFC 7231, Appendix B. Changes from RFC 2616:

The default charset of ISO-8859-1 for text media types has been removed; the default is now whatever the media type definition says. Likewise, special treatment of ISO-8859-1 has been removed from the Accept-Charset header field. (Section 3.1.1.3 and Section 5.3.3)

So it depends on the default character set / encoding for the given media type. You can look up the media type in the IANA registry, for example the application/xml media type, which links to RFC 7303 Section 3:

As many as three distinct sources of information about character encoding may be present for an XML MIME entity: a charset parameter, a BOM (see Section 3.3 below), and an XML encoding declaration (see Section 4.3.3 of [XML]). Ensuring consistency among these sources requires coordination between entity authors and MIME agents (that is, processes that package, transfer, deliver, and/or receive MIME entities).

The use of UTF-8, without a BOM, is RECOMMENDED for all XML MIME entities.

So no, it's not mandatory. If it is omitted, how the encoding is determined depends on the specific media type.
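As a rough illustration, a receiver could handle the optional parameter along these lines. This is only a sketch under assumptions: the function names (parse_content_type, guess_charset) and the XML_TYPES set are made up for this example, the parser splits naively on ";" and does not handle quoted values containing semicolons, and the order in which the sources are consulted is simplified. RFC 7303 defines the actual precedence between the charset parameter, the BOM and the XML encoding declaration.

```python
import codecs

# Hypothetical set of media types handled as XML in this sketch.
XML_TYPES = {"text/xml", "application/xml"}


def parse_content_type(value):
    """Split a Content-Type value into (media_type, charset).

    Either element is None when it is missing or malformed.
    Naive parser: quoted parameter values containing ';' are not supported.
    """
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    if media_type.count("/") != 1:
        # e.g. "charset=utf-8" or an empty header: no usable media type
        return None, None
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, val = part.split("=", 1)
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type, params.get("charset")


def guess_charset(header_value, body=b""):
    """Return a charset name, or None if the caller must decide another way."""
    media_type, charset = parse_content_type(header_value)
    if charset:
        return charset.lower()
    if media_type in XML_TYPES:
        # No charset parameter: look for a byte order mark in the body.
        if body.startswith(codecs.BOM_UTF8):
            return "utf-8"
        if body.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
            return "utf-16"
    # Nothing found: defer to the media type's own rules
    # (for XML, the parser reads the encoding declaration itself).
    return None
```

For example, guess_charset("text/xml; charset=utf-8") gives "utf-8", while guess_charset("text/xml") with a body that has no BOM gives None, leaving the decision to the XML parser.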

Upvotes: 6
