Reputation: 22055
I thought I should send "text/xml", but then I read that I should send "application/xml". Does it matter? Can someone explain the difference?
Upvotes: 146
Views: 162140
Reputation: 146
To complement what has been said above: even though the consensus is that both text/xml and application/xml are acceptable, there is one key difference in terms of SEO:
Files served as text/xml are indexable, because Googlebot views them as "text" documents, just like regular HTML pages, TXT files and PDF documents.
Files served as application/xml are not indexable, because they are not viewed as text documents and are instead "ignored" by Googlebot, at least as far as indexing goes.
Given that this question concerns XML sitemap files, which are an SEO feature, it is worth noting that XML sitemaps served as text/xml are indexed by Google, and therefore a search for inurl:sitemap.xml returns Google.com's own XML sitemap file.
This means that your competitors will be able to find your XML sitemap by searching for it on Google (for example, see How to find sitemap.xml path on websites?), and further reverse-engineer your website and content strategy.
If you serve the sitemap with an x-robots-tag: noindex HTTP header, GSC will warn you that your XML sitemap is excluded from the index by a noindex tag, which is usually what you want.
Unfortunately, I have been unable to find an official source for this, and my answer is purely based on my own tests and findings. However, it is worth noting that the HTML5Boilerplate Server Configs for both nginx and Apache set the default Content-Type for XML files to application/xml, although the rationale for this choice does not appear to be documented.
Also worth noting, the same goes for JSON files: application/json is the way to go if you don't want your JSON files indexed by Google!
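As a rough illustration of the two levers mentioned in this answer, here is a minimal Python sketch that builds response headers for a sitemap. The function name sitemap_headers and its allow_indexing parameter are made up for this example. Note that the indexing behaviour of application/xml is only this answer's own observation, so the sketch also sets X-Robots-Tag explicitly instead of relying on the media type alone.

```python
def sitemap_headers(allow_indexing: bool = False) -> dict:
    """Build HTTP response headers for a sitemap (hypothetical helper)."""
    # application/xml with an explicit charset, so clients do not have to guess.
    headers = {"Content-Type": "application/xml; charset=utf-8"}
    if not allow_indexing:
        # Explicitly ask crawlers not to index the sitemap file itself.
        headers["X-Robots-Tag"] = "noindex"
    return headers


if __name__ == "__main__":
    for name, value in sitemap_headers().items():
        print(f"{name}: {value}")
```

The returned dictionary can be passed to whatever mechanism your server or framework uses to set response headers.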
Upvotes: 0
Reputation: 155024
Other answers here address the general question of what the proper Content-Type for an XML response is, and conclude (as with What's the difference between text/xml vs application/xml for webservice response) that both text/xml and application/xml are permissible. However, none address whether there are any rules specific to sitemaps.
Answer: there aren't. The sitemap spec is https://www.sitemaps.org, and using Google site: searches you can confirm that it does not contain the words or phrases mime, mimetype, content-type, application/xml, or text/xml anywhere. In other words, it is entirely silent on the topic of what Content-Type should be used for serving sitemaps.
In the absence of any commentary in the sitemap spec directly addressing this question, we can safely assume that the same rules apply as when choosing the Content-Type of any other XML document, i.e. that it may be either text/xml or application/xml.
Upvotes: 4
Reputation: 655649
The difference between text/xml and application/xml is the default character encoding if the charset parameter is omitted:
Text/xml and application/xml behave differently when the charset parameter is not explicitly specified. If the default charset (i.e., US-ASCII) for text/xml is inconvenient for some reason (e.g., bad web servers), application/xml provides an alternative (see "Optional parameters" of application/xml registration in Section 3.2).
For text/xml:
Conformant with [RFC2046], if a text/xml entity is received with the charset parameter omitted, MIME processors and XML processors MUST use the default charset value of "us-ascii"[ASCII]. In cases where the XML MIME entity is transmitted via HTTP, the default charset value is still "us-ascii".
For application/xml:
If an application/xml entity is received where the charset parameter is omitted, no information is being provided about the charset by the MIME Content-Type header. Conforming XML processors MUST follow the requirements in section 4.3.3 of [XML] that directly address this contingency. However, MIME processors that are not XML processors SHOULD NOT assume a default charset if the charset parameter is omitted from an application/xml entity.
So if the charset parameter is omitted, the character encoding of text/xml is US-ASCII while with application/xml the character encoding can be specified in the document itself.
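The rules quoted above can be sketched in Python roughly as follows. This is a simplified illustration, not a full implementation of the RFC or of XML 1.0 section 4.3.3: the names effective_charset and declared_encoding are invented for the example, and the application/xml branch only looks at a byte order mark and the XML declaration before falling back to UTF-8.

```python
import re
from email.message import Message


def declared_encoding(body: bytes):
    """Return the encoding named in the XML declaration, or None."""
    if body.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        body = body[3:]
    m = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', body
    )
    return m.group(1).decode("ascii").lower() if m else None


def effective_charset(content_type: str, body: bytes) -> str:
    """Pick the charset a strict client would use (simplified sketch)."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if isinstance(charset, str) and charset:
        return charset.lower()          # an explicit charset parameter wins
    if msg.get_content_type() == "text/xml":
        return "us-ascii"               # text/xml default when charset is omitted
    # application/xml: fall back to what the document says about itself.
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    return declared_encoding(body) or "utf-8"


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'
    print(effective_charset("text/xml", doc))
    print(effective_charset("application/xml", doc))
```

With the same document bytes, the two media types give different answers as soon as the charset parameter is left out, which is the difference described above.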
Now a rule of thumb on the internet is: “Be strict with the output but be tolerant with the input.” That means make sure to meet the standards as much as possible when delivering data over the internet. But build in some mechanisms to overlook faults or to guess when receiving and interpreting data over the internet.
So in your case, just pick one of the two types (I recommend application/xml) and make sure to specify the character encoding properly (to play it safe, I recommend using the respective default character encoding, so for application/xml use UTF-8 or UTF-16).
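To make that recommendation concrete, here is a small Python sketch that produces a sitemap body together with matching headers. The helper name build_sitemap_response and the example URLs are assumptions for illustration only. The point is that the XML declaration, the actual byte encoding, and the charset parameter all agree on UTF-8.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_response(urls):
    """Return (headers, body) for a UTF-8 sitemap (hypothetical helper)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<urlset xmlns="{SITEMAP_NS}">',
    ]
    for url in urls:
        # escape() turns &, < and > into entities so the XML stays well-formed.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    body = "\n".join(lines).encode("utf-8")
    headers = {
        # The charset parameter matches the declaration and the real encoding.
        "Content-Type": "application/xml; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


if __name__ == "__main__":
    headers, body = build_sitemap_response(["https://example.com/?a=1&b=2"])
    print(headers)
    print(body.decode("utf-8"))
```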
Upvotes: 182
Reputation: 977
As a rule of thumb, the safest bet towards making your document be treated properly by all web servers, proxies, and client browsers, is probably the following:
In terms of the RFC 3023 spec, which some browsers fail to implement properly, the major difference in the content types is in how clients are supposed to treat the character encoding, as follows:
For application/xml, application/xml-dtd, application/xml-external-parsed-entity, or any one of the subtypes of application/xml such as application/atom+xml, application/rss+xml or application/rdf+xml, the character encoding is determined in this order:
For text/xml, text/xml-external-parsed-entity, or a subtype like text/foo+xml, the encoding attribute of the XML declaration within the document is ignored, and the character encoding is:
Most parsers don't implement the spec; they ignore the HTTP Content-Type and just use the encoding in the document. With so many ill-formed documents out there, that's unlikely to change any time soon.
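Python's standard library parser is one example of this behaviour, since it only ever sees the bytes and never the HTTP header. A minimal sketch follows. The wrapper parse_ignoring_header and the sample document are made up for illustration, and the content_type argument is deliberately unused to mirror what typical client code does.

```python
import xml.etree.ElementTree as ET


def parse_ignoring_header(content_type: str, body: bytes) -> ET.Element:
    """Parse XML the way most clients do: from the bytes alone."""
    # content_type is accepted but never consulted; the parser relies on
    # the encoding named in the document's own XML declaration.
    return ET.fromstring(body)


# 0xE9 is "é" in ISO-8859-1 and is not valid US-ASCII.
body = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>caf\xe9</name>'

# Hypothetical response header: text/xml with no charset parameter.
root = parse_ignoring_header("text/xml", body)
print(root.text)
```

A client that strictly applied the US-ASCII default for text/xml would have to reject byte 0xE9, whereas this parser follows the declaration in the document and decodes it as é.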
Upvotes: 27
Reputation: 8312
Both are fine.
text/xxx means that if the program does not understand xxx, it makes sense to show the file to the user as plain text. application/xxx means that it is pointless to show it.
Please note that these content types were originally defined for email attachments before they were later used on the web.
Upvotes: 9
Reputation: 944202
text/xml is for documents that would be meaningful to a human if presented as text without further processing; application/xml is for everything else.
Every XML entity is suitable for use with the application/xml media type without modification. But this does not exploit the fact that XML can be treated as plain text in many cases. MIME user agents (and web user agents) that do not have explicit support for application/xml will treat it as application/octet-stream, for example, by offering to save it to a file.
To indicate that an XML entity should be treated as plain text by default, use the text/xml media type. This restricts the encoding used in the XML entity to those that are compatible with the requirements for text media types as described in [RFC-2045] and [RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP).
— http://www.ietf.org/rfc/rfc2376.txt
Upvotes: 7