Reputation: 322
I'm looking to use Java to parse an ongoing stream of event-driven XML generated by a remote device. Here's a simplified sample of two events:
<?xml version="1.0"?>
<Event> DeviceEventMsg
<Param1>SomeParmValue</Param1>
</Event>
<?xml version="1.0"?>
<Event> DeviceEventMsg
<Param1>SomeParmValue</Param1>
</Event>
It seems like SAX is better suited to this than DOM because it is an ongoing stream, though I'm not as familiar with SAX. Don't yell at me for the structure of the XML - I know it already and can't change it.
And yes, the device DOES send the XML declaration before every event. My first problem is that the second XML declaration is croaking the SAX parser.
Can anyone suggest a way to get around that?
The code I'm using so far, which is croaking on the second XML declaration, is:
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;

// HandlerBase and AttributeList are the deprecated SAX1 API;
// DefaultHandler and Attributes are their SAX2 replacements.
public class TestMe extends HandlerBase {
    public void startDocument() throws SAXException {
        System.out.println("got startDocument");
    }

    public void endDocument() throws SAXException {
        System.out.println("got endDocument");
    }

    public void startElement(String name, AttributeList attrs) throws SAXException {
        System.out.println("got startElement");
    }

    public void endElement(String name) throws SAXException {
        System.out.println("got endElement");
    }

    public void characters(char buf[], int offset, int len) throws SAXException {
        System.out.println("found characters");
    }

    public void processingInstruction(String target, String data) throws SAXException {
        System.out.println("got processingInstruction");
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = factory.newSAXParser();
            // using a file as test input for now
            saxParser.parse(new File("devmodule.xml"), new TestMe());
        } catch (Throwable err) {
            err.printStackTrace();
        }
    }
}
Upvotes: 2
Views: 1742
Reputation: 116572
One more suggestion, specifically regarding multiple XML declarations. Yes, this is ILLEGAL XML (a document may have only one XML declaration, at its very start), so proper parsers will barf on it in their default modes. But some parsers have alternate "multi-document" modes. For example, Woodstox has this, so you can check out:
http://www.cowtowncoder.com/blog/archives/2008/04/entry_66.html
Basically, you have to tell the parser (via the input factory) that the input is in the form of "multiple XML documents" (ParsingMode.PARSING_MODE_DOCUMENTS).
If so, it will accept multiple XML declarations, each one indicating the start of a new document.
Upvotes: 1
Reputation: 7144
RE: Simon's suggestion of catching the SAXException to determine when you've come to the end of one XML document and reached the start of another, I think this would be a problematic approach. If another error occurred (for whatever reason), you wouldn't be able to tell whether the exception had been thrown due to erroneous XML or because you'd reached the end of a document.
The problem is that the parser is meant for processing a single XML document, not a stream of several XML documents. I would suggest writing some code to manually split the incoming data stream into individual streams, each containing a single XML document, and then passing these streams to the XML parser one after another (which guarantees the order of your events).
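A minimal sketch of that splitting idea, under some assumptions taken from the sample in the question rather than from the real device: every event ends with a closing </Event> tag, that tag is the last thing on its line, and Event elements never nest. The class name EventStreamSplitter and the method parseAll are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EventStreamSplitter {

    // Assumed end-of-event marker, copied from the sample XML in the question.
    private static final String END_TAG = "</Event>";

    /**
     * Reads the stream line by line, cuts it after every line that contains
     * the closing Event tag, and hands each piece to the SAX parser as its
     * own document. Returns the number of documents parsed.
     */
    public static int parseAll(Reader in, DefaultHandler handler)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        BufferedReader reader = new BufferedReader(in);
        StringBuilder doc = new StringBuilder();
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            // Skip blank lines between documents so that the XML
            // declaration stays at the very start of each piece.
            if (doc.length() == 0 && line.trim().isEmpty()) {
                continue;
            }
            doc.append(line).append('\n');
            if (line.contains(END_TAG)) {
                parser.parse(new InputSource(new StringReader(doc.toString())), handler);
                doc.setLength(0);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String event = "<?xml version=\"1.0\"?>\n<Event> DeviceEventMsg\n"
                + "<Param1>SomeParmValue</Param1>\n</Event>\n";
        // Two events back to back, as the device would send them.
        int parsed = parseAll(new StringReader(event + event), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                System.out.println("got startElement " + qName);
            }
        });
        System.out.println("documents parsed: " + parsed);
    }
}
```

With this approach, a SAXException coming out of parseAll always means that one particular event was malformed, never that the next document has started. For a live device, the Reader would wrap the socket or serial input stream instead of a StringReader.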
Upvotes: 0
Reputation: 5925
If you add this:
catch(SAXException SaxErr){
System.out.println("ignore this error");
}
before the other catch, you will catch this particular error. You would then have to reopen the device or, for the static-file case, you may have to keep track of where you are in the file.
Or, at the end of the Event element, close the device/file and then reopen it for the next event.
Upvotes: 0
Reputation: 5925
If you print out the element name in startElement and endElement with System.out.println(), you will get something like this:
got startDocument
got startElement Event
found characters
found characters
got startElement Param1
found characters
got endElement Param1
found characters
got endElement Event
org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed. ...
So I think the second
<?xml version="1.0"?>
without getting an endDocument is causing a parser problem.
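A small sketch for reproducing that trace from an in-memory string instead of the device or a file. The class name SecondDeclarationRepro and the method trace are made up for illustration; the sketch assumes the JDK's default SAX parser and records where the parser gives up by using SAXParseException.getLineNumber().

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class SecondDeclarationRepro {

    /**
     * Parses the given text and returns a log of the element events seen,
     * followed by one "error at line ..." entry if the parser rejects the input.
     */
    public static List<String> trace(String xml)
            throws IOException, ParserConfigurationException, SAXException {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            // The events of the first document have already been delivered
            // by the time the second declaration is rejected.
            log.add("error at line " + e.getLineNumber() + ": " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) throws Exception {
        String event = "<?xml version=\"1.0\"?>\n<Event> DeviceEventMsg\n"
                + "<Param1>SomeParmValue</Param1>\n</Event>\n";
        for (String entry : trace(event + event)) {
            System.out.println(entry);
        }
    }
}
```

The events of the first Event element show up in the log before the error entry, so the first document is fully processed and only the rest of the input is lost.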
Upvotes: 0