Reputation: 7443
I'm building a web service that emits streaming XML. So, the output will look (at a high level) like this:
<fragment1>
<!-- ... -->
</fragment1>
<fragment2>
<!-- ... -->
</fragment2>
...and so on. For a normal XML document, you'd use any one of these different MIME types:
However, those MIME types all assume that the response contains exactly one XML document/fragment. In my case, the response contains zero or more fragments. For this reason, it seems like The Wrong Thing to use one of those MIME types. A correct handler would treat the response as a single XML document and either (a) barf upon arriving at the second fragment, or (b) silently ignore every fragment after the first.
If that's The Wrong Thing, is one of these MIME types The Right Thing:
application/octet-stream
application/vnd.mycompany.com.description.streaming+xml
application/vnd.mycompany.com.description+streaming-xml
Or should I use a completely different one? Also, it would be great if the same "style" of MIME type could be applied to streaming JSON once that data format comes online.
EDIT: To give a little more flavor to the question and provide an example of a working implementation I'm trying to emulate, this API is modelled after the Twitter streaming API.
Upvotes: 1
Views: 2475
Reputation: 13197
There are multiple options depending on the semantic relations in the data, and their structure.
First option: if you have a (continuous) file that can easily be turned into a valid XML document by wrapping it in <elem>…</elem> tags, it should be application/xml-external-parsed-entity. This can be anything from simple text to comments, processing instructions, or a list of complex elements. However, you cannot insert the XML declaration (the charset has to be defined via MIME) or any DTD (so the meaning has to be provided by the enclosing document if you rely on the DTD), and you also cannot include any other external parsed entities unless you use XInclude.
I find this suitable for anything that can be described as arbitrary XML content/fragment. It is mostly intended to be used through external parsed entities in DTDs, but works equally well on its own. Use this if your fragments might not have a single root node. I can think of one caveat with this however: if the stream is infinite, the client will eventually have to terminate it somewhere, and since there is no external boundary specified, it may be terminated in the middle of an element, making it invalid according to its schema.
You may also use application/xml and write the start tag yourself, but some parsers may wait for the end of the document if they are configured to process it as a whole. With application/xml-external-parsed-entity, the best that can be done is to parse it as a stream of individual XML nodes.
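As a rough sketch of the "stream of individual XML nodes" approach, assuming a Python client: the standard library's xml.etree.ElementTree.XMLPullParser can be fed a synthetic start tag followed by the response chunks as they arrive. The iter_fragments name and the "stream" wrapper element are made up for illustration and are not part of any MIME type or specification.

```python
import xml.etree.ElementTree as ET


def iter_fragments(chunks, wrapper="stream"):
    """Yield each top-level element from a stream of concatenated XML fragments.

    `chunks` is any iterable of str or bytes pieces of the response body.
    `wrapper` is a hypothetical root element name used only on the client side.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    # Open a synthetic root so the fragments become children of one document.
    # It is never closed, because the stream may never end.
    parser.feed(f"<{wrapper}>")
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                # Depth 1 means we just closed a direct child of the wrapper,
                # i.e. one complete fragment.
                if depth == 1:
                    yield elem


if __name__ == "__main__":
    # Chunk boundaries deliberately fall in the middle of tags.
    pieces = ["<fragment1><a>1</a></fragm", "ent1>\n<fragment2>", "<b/></fragment2>"]
    for fragment in iter_fragments(pieces):
        print(fragment.tag)
```

If the connection is cut in the middle of an element, the incomplete fragment is simply never yielded. Note that in this sketch every parsed fragment stays attached to the synthetic root, so a long-running client would also need to detach processed elements to keep memory bounded.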
Second option: there is the range of multipart types. This way, you can wrap individual XML documents (application/xml or a more specific type) or fragments (application/xml-external-parsed-entity). Again, the choice of the inner type depends on whether the individual messages may be treated as standalone XML documents (for example application/svg+xml for "SVG video").
The choice of the subtype depends on the intended meaning of the whole sequence. A stream of grouped individual standalone files may use multipart/mixed (this is the most general of the types). If the XML data is interlinked in some way, you can use multipart/related and assign identifiers to the individual fragments. And lastly, multipart/x-mixed-replace is used if only the last part of the message represents the up-to-date content of the resource (to save individual requests).
For illustration:
If the response is a stream of text enriched with XHTML markup (converted from a Markdown stream for example), it should be a single application/xml-external-parsed-entity.
If the fragments are attachments, files constantly downloaded from websites or uploaded by users, it should be multipart/mixed.
If the fragments are nodes in a large or ever-growing graph of resources (not just XML), multipart/related should be used.
If the result is short-lived information, like the current status of some process or a continuous measurement of something, it should be multipart/x-mixed-replace.
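To make the multipart framing concrete, here is a minimal sketch in Python of how a server might emit one part per XML document. The multipart_stream name, the "frag-boundary" boundary string, and the utf-8 charset are assumptions for illustration. A real service would also have to pick a boundary that cannot occur in the payload and announce it in the response header, for example Content-Type: multipart/mixed; boundary="frag-boundary".

```python
def multipart_stream(fragments, boundary="frag-boundary", part_type="application/xml"):
    """Yield the bytes of a multipart body, one part per fragment.

    `fragments` is an iterable of XML strings; it may be a generator that
    produces documents over time. `boundary` and `part_type` are
    illustrative defaults, not values taken from any specification.
    """
    for fragment in fragments:
        # Each part: boundary line, part headers, blank line, body.
        yield (
            f"--{boundary}\r\n"
            f"Content-Type: {part_type}; charset=utf-8\r\n"
            "\r\n"
            f"{fragment}\r\n"
        ).encode("utf-8")
    # Closing delimiter; an endless stream would simply never reach it.
    yield f"--{boundary}--\r\n".encode("utf-8")


if __name__ == "__main__":
    docs = ["<fragment1/>", "<fragment2/>"]
    for block in multipart_stream(docs):
        print(block.decode("utf-8"), end="")
```

The same generator could serve multipart/related or multipart/x-mixed-replace by changing only the outer Content-Type header; the framing of the parts stays the same.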
Upvotes: 0
Reputation: 22471
Sounds like, apart from your streaming requirements, your content is actually a Multipart message with several application/xml parts. With this layout, application/json parts could also be mixed into your message.
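For illustration, a client could dispatch on each part's Content-Type to handle XML and JSON side by side. The following is only a sketch using Python's standard email parser on a complete, already-received body (it does not stream); decode_parts and the boundary value "b" are hypothetical names.

```python
import json
import xml.etree.ElementTree as ET
from email import message_from_bytes


def decode_parts(body: bytes, boundary: str):
    """Parse a complete multipart body and decode each part by its type."""
    # The email parser expects headers, so prepend the outer Content-Type
    # that would normally arrive as an HTTP response header.
    header = f'Content-Type: multipart/mixed; boundary="{boundary}"\r\n\r\n'
    message = message_from_bytes(header.encode("ascii") + body)
    decoded = []
    for part in message.get_payload():
        payload = part.get_payload(decode=True)  # raw bytes of this part
        kind = part.get_content_type()
        if kind == "application/xml":
            decoded.append(ET.fromstring(payload))
        elif kind == "application/json":
            decoded.append(json.loads(payload))
    return decoded


if __name__ == "__main__":
    sample = (
        b"--b\r\nContent-Type: application/xml\r\n\r\n<a x=\"1\"/>\r\n"
        b"--b\r\nContent-Type: application/json\r\n\r\n{\"k\": 2}\r\n"
        b"--b--\r\n"
    )
    for item in decode_parts(sample, "b"):
        print(item)
```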
If your individual XML fragments are part of larger documents, take a look at the (somewhat old and understated) XML Fragment Interchange W3C Candidate Recommendation. It defines a nice syntax to wrap fragment bodies together with contextual information about the original document.
Upvotes: 3