V_Dev

Reputation: 93

Axiom parser entity issue

I am parsing an XML file using the Axiom parser. If an XML element contains an HTML entity, the Axiom parser moves the entity to the beginning of the element's text, irrespective of its original position.

For example:

<Root>
  <P> This element contains &alpha; html entity. </P>
</Root>

OMXMLParserWrapperObj.getDocumentElement() returns the following output.

<Root>
  <P>&alpha; This element contains html entity. </P>
</Root>

But the output should be the same as the input. Any suggestions on how to solve this?

I am using the below code:

 try {
  InputStream in;
  OMElement rootOMElement;
  in = new FileInputStream(xmlFile);
  XMLStreamReader parser;

  StAXParserConfiguration standalone = StAXParserConfiguration.STANDALONE;
  parser = StAXUtils.createXMLStreamReader(standalone, in);

  OMXMLParserWrapper createStAXOMBuilder = OMXMLBuilderFactory.createStAXOMBuilder(parser);
  rootOMElement = createStAXOMBuilder.getDocumentElement();
  in.close();
}
catch (XMLStreamException | IOException e) {
  // log(Level, String, Throwable) expects a String message, not a StackTraceElement[]
  Logger.getAnonymousLogger().log(Level.SEVERE, e.getMessage(), e);
}

Upvotes: 0

Views: 162

Answers (2)

Andreas Veithen

Reputation: 9154

This is caused by a bug in the StAX parser in the JRE. When IS_COALESCING is enabled, it returns events in the wrong order. To work around this, build a new StAXParserConfiguration based on STANDALONE that also disables coalescing:

new StAXParserConfiguration() {
    public XMLInputFactory configure(XMLInputFactory factory, StAXDialect dialect) {
        StAXParserConfiguration.STANDALONE.configure(factory, dialect);
        StAXParserConfiguration.NON_COALESCING.configure(factory, dialect);
        return factory;
    }

    public String toString() {
        return "STANDALONE_NON_COALESCING";
    }
}
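
To illustrate what disabling coalescing means at the plain StAX level, here is a minimal sketch that uses only the JDK and no Axiom classes. It assumes that turning off coalescing corresponds to setting the standard XMLInputFactory.IS_COALESCING property to false. The class name NonCoalescingDemo, the helper readText, and the inline DTD that declares the alpha entity are made up for the example and are not part of the original question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NonCoalescingDemo {

    // Hypothetical helper: concatenates all character data in document order.
    public static String readText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // With coalescing off, text around an entity may arrive as several
            // CHARACTERS events instead of one merged event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);

            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.CHARACTERS) {
                        // Append each chunk in the order the parser reports it.
                        text.append(reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
            return text.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the entity is declared in an internal DTD subset so that
        // the sample is self-contained.
        String xml = "<!DOCTYPE Root [<!ENTITY alpha \"&#945;\">]>"
                + "<Root><P> This element contains &alpha; html entity. </P></Root>";
        System.out.println(readText(xml));
    }
}
```

In the Axiom case, the custom configuration above would be passed to StAXUtils.createXMLStreamReader in place of StAXParserConfiguration.STANDALONE, as in the question's code.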

Upvotes: 2

V_Dev

Reputation: 93

The issue has been confirmed by the Axiom team and will be resolved in the next release. For reference:

https://issues.apache.org/jira/browse/AXIOM-490

Upvotes: 1
