Reputation: 93
I am parsing an XML file using the Axiom parser. If an XML element contains an HTML entity, the Axiom parser moves it to the beginning of the element's text, irrespective of its position.
For example:
<Root>
<P> This element contains α html entity. </P>
</Root>
OMXMLParserWrapperObj.getDocumentElement() returns the following output.
<Root>
<P>α This element contains html entity. </P>
</Root>
But the output should be the same as the input. Any suggestions on how to solve this?
I am using the below code:
try {
    InputStream in;
    OMElement rootOMElement;
    in = new FileInputStream(xmlFile);
    XMLStreamReader parser;
    // Create a standalone StAX reader and build the Axiom object model from it.
    StAXParserConfiguration standalone = StAXParserConfiguration.STANDALONE;
    parser = StAXUtils.createXMLStreamReader(standalone, in);
    OMXMLParserWrapper createStAXOMBuilder = OMXMLBuilderFactory.createStAXOMBuilder(parser);
    rootOMElement = createStAXOMBuilder.getDocumentElement();
    in.close();
}
catch (XMLStreamException | IOException e) {
    // Logger.log expects a String message; getStackTrace() returns an array and does not compile here.
    Logger.getAnonymousLogger().log(Level.SEVERE, e.getMessage(), e);
}
Upvotes: 0
Views: 162
Reputation: 9154
This is caused by a bug in the StAX parser in the JRE. When IS_COALESCING is enabled, it returns events in the wrong order. To work around this, build a new StAXParserConfiguration based on STANDALONE that also disables coalescing:
new StAXParserConfiguration() {
    public XMLInputFactory configure(XMLInputFactory factory, StAXDialect dialect) {
        StAXParserConfiguration.STANDALONE.configure(factory, dialect);
        StAXParserConfiguration.NON_COALESCING.configure(factory, dialect);
        return factory;
    }
    public String toString() {
        return "STANDALONE_NON_COALESCING";
    }
}
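To see what disabling coalescing means at the StAX level, here is a minimal sketch that uses only the JDK's javax.xml.stream API, not Axiom. The class name NonCoalescingDemo and the readText helper are hypothetical, and the sample document declares the alpha entity in an internal DTD subset, which is an assumption since the question does not show how the entity is declared. With IS_COALESCING set to false the parser may report the text of one element as several events, so the sketch appends them in the order they arrive.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class NonCoalescingDemo {

    // Reads all character data from the document with coalescing disabled,
    // appending each text-bearing event in the order the parser reports it.
    static String readText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE
                        || event == XMLStreamConstants.ENTITY_REFERENCE) {
                    // For an entity reference, getText() is the replacement text.
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical input: the entity is declared inline so the example is self-contained.
        String xml = "<!DOCTYPE Root [<!ENTITY alpha \"&#945;\">]>"
                + "<Root><P> This element contains &alpha; html entity. </P></Root>";
        System.out.println(readText(xml));
    }
}
```

In Axiom itself you would not set the property by hand; passing the configuration object from the answer to StAXUtils.createXMLStreamReader in place of STANDALONE should have the same effect on the underlying factory.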
Upvotes: 2
Reputation: 93
The issue has been confirmed by the Axiom team and will be resolved in the next release. For reference:
https://issues.apache.org/jira/browse/AXIOM-490
Upvotes: 1