yannisf

Reputation: 6126

Parse a list of XML fragments with no root element from a stream input

Is it feasible in Java, using the SAX API, to parse a list of XML fragments with no root element from a stream input?

I tried parsing such input but got a

org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.

before the endDocument event was even fired.

I would rather not settle for obvious but clumsy solutions such as "prepend a custom root element" or "use buffered fragment parsing".

I am using the standard SAX API of Java 1.6. The SAX factory had setValidating(false) in case anyone wondered.
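For reference, the failure can be reproduced with a minimal sketch like the one below. The class name `RootlessRepro` and the helper `failsToParse` are made up for illustration, and the input is held in memory rather than read from a real stream. The exact exception message depends on the parser implementation, so the sketch only checks that a `SAXParseException` is thrown.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class RootlessRepro {

    /** Returns true if the parser rejects the input with a SAXParseException. */
    public static boolean failsToParse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Turning validation off does not turn off well-formedness checks.
        factory.setValidating(false);
        InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
        try {
            factory.newSAXParser().parse(in, new DefaultHandler());
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Two top-level elements: rejected.
        System.out.println(failsToParse("<a/><b/>"));
        // The same elements under a single root: accepted.
        System.out.println(failsToParse("<r><a/><b/></r>"));
    }
}
```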

Upvotes: 9

Views: 3714

Answers (2)

phreed

Reputation: 1859

It sounds like you are working with XMPP. If so, there are libraries for parsing streams of XML fragments. The W3C also published a draft XML Fragment Interchange specification, which aimed to standardize how XML fragments are exchanged, but it never gained wide adoption and is not actively maintained.

Upvotes: 0

npe

Reputation: 15699

First, and most important of all, the content you are parsing is not an XML document. From the XML Specification:

[Definition: There is exactly one element, called the root, or document element, no part of which appears in the content of any other element.]

Now, as to parsing this with SAX - in spite of what you said about clumsiness - I'd suggest the following approach:

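import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;

// Wrap the fragment stream in a synthetic <root> element so the parser sees a single document.
Enumeration<InputStream> streams = Collections.enumeration(
    Arrays.asList(new InputStream[] {
        new ByteArrayInputStream("<root>".getBytes()),
        yourXmlLikeStream, // the original stream of rootless fragments
        new ByteArrayInputStream("</root>".getBytes()),
    }));

SequenceInputStream seqStream = new SequenceInputStream(streams);

// Now pass `seqStream` to the SAX parser.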

SequenceInputStream is a convenient way to concatenate multiple input streams into a single stream. The streams are read in the order they are passed to the constructor (or, in this case, returned by the Enumeration).

Pass it to your SAX parser, and you are done.
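To make the whole round trip concrete, here is a minimal, self-contained sketch of the wrapping approach. The class name `FragmentParser`, the method `parseFragments`, and the wrapper name `root` are chosen for illustration only. It assumes the fragments are UTF-8 (or otherwise ASCII-compatible) and do not start with an XML declaration or DOCTYPE, since either would no longer be legal once it sits inside the wrapper element. The handler simply records the name of each element at depth 1, that is, each original top-level fragment.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class FragmentParser {

    // Illustrative wrapper name; pick one that cannot clash with real content.
    private static final String WRAPPER = "root";

    /** Returns the names of the top-level fragment elements, in document order. */
    public static List<String> parseFragments(InputStream fragments) throws Exception {
        // Surround the fragment stream with an opening and a closing wrapper tag.
        InputStream wrapped = new SequenceInputStream(Collections.enumeration(
                Arrays.<InputStream>asList(
                        new ByteArrayInputStream(("<" + WRAPPER + ">").getBytes("UTF-8")),
                        fragments,
                        new ByteArrayInputStream(("</" + WRAPPER + ">").getBytes("UTF-8")))));

        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Depth 0 is the synthetic wrapper; depth 1 is a real fragment root.
                if (depth == 1) {
                    names.add(qName);
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(wrapped, handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(
                "<a>1</a><b><c/></b><a>2</a>".getBytes("UTF-8"));
        System.out.println(parseFragments(in)); // prints [a, b, a]
    }
}
```

Tracking the depth in the handler keeps the synthetic wrapper out of the application logic: everything at depth 1 and below is treated exactly as if the fragments had been parsed on their own.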

Upvotes: 13
