Reputation: 7331
I am building a parser for VAST documents, which are XML documents with an official XSD available in several versions: https://github.com/InteractiveAdvertisingBureau/vast/tree/master
I need to unmarshal incoming XML, so I have generated the model classes with jaxb2-maven-plugin.
The incoming XML may or may not declare the namespace. My problem is that unmarshalling works when the namespace is present, but not when it is missing.
Following https://stackoverflow.com/a/8717287/3067542 and https://docs.oracle.com/javase/6/docs/api/javax/xml/bind/Unmarshaller.html#unmarshalByDeclaredType , I understand that there is a workaround: since I know the target class, I can force unmarshalling to that class, whether or not there is a namespace.
Here's my code (also available on GitHub):
JAXBContext jc = JAXBContext.newInstance(VAST.class);
Unmarshaller u = jc.createUnmarshaller();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xmlString)));
JAXBElement<VAST> foo = u.unmarshal( doc, VAST.class);
return new CustomVast(foo.getValue());
When running the test, I see that the inner classes are not populated.
Am I missing something? Is there an additional flag to set when generating the classes with jaxb2-maven-plugin so that it will work?
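A small probe can make the symptom easier to reason about. As far as I understand, unmarshalling by declared type only overrides the mapping of the root element, while nested elements are still matched by their qualified name. The sketch below uses only the JDK DOM API; the class name `NamespaceProbe` and the sample documents are made up for illustration and are not part of the project. It shows that a namespace-aware parser reports no namespace for any element of an undeclared document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
public class NamespaceProbe {

    /** Returns "localName -> namespaceURI" for the root element and each direct child element. */
    public static List<String> describe(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            List<String> out = new ArrayList<>();
            Element root = doc.getDocumentElement();
            out.add(root.getLocalName() + " -> " + root.getNamespaceURI());

            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    out.add(n.getLocalName() + " -> " + n.getNamespaceURI());
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints [VAST -> null, Ad -> null]
        System.out.println(describe("<VAST version=\"4.2\"><Ad id=\"1\"/></VAST>"));
        // prints [VAST -> http://www.iab.com/VAST, Ad -> http://www.iab.com/VAST]
        System.out.println(describe(
                "<VAST xmlns=\"http://www.iab.com/VAST\" version=\"4.2\"><Ad id=\"1\"/></VAST>"));
    }
}
```

If the generated classes expect elements in the http://www.iab.com/VAST namespace, the `null` namespace on `Ad` in the first case would be consistent with the root being unmarshalled while its children are left empty.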
Upvotes: 0
Views: 71
Reputation: 7331
The hack I've found is to preprocess the incoming String in a very basic (naive?) way:
public class XMLParserHelper {

    // Matches only the opening root tag, with or without attributes.
    // [^>]* stops at the first '>', so the match cannot run on to later tags on the same line.
    public static final Pattern VAST_ROOT_TAG_PATTERN = Pattern.compile("<VAST\\b[^>]*>");
    private static final String EXPECTED_NAMESPACE = "xmlns=\"http://www.iab.com/VAST\"";

    /**
     * If the incoming XML document doesn't have a namespace, we add it.
     *
     * @param xmlString the XML content to process
     * @return if the expected namespace is present, it will return the same String that was given in input. If the namespace is not present, it will return the same String given in input, <b>with a namespace added in the root element</b>.
     * @throws ElementNotFoundException if the root tag is not found (ie it's not a VAST document)
     */
    public String preprocessIfRequired(String xmlString) throws ElementNotFoundException {
        String vastRootTag = extractVastRootTag(xmlString);
        if (vastRootTag.contains(EXPECTED_NAMESPACE)) {
            // namespace is present, we can process as is
            return xmlString;
        } else {
            // insert the namespace right after the element name, so that "<VAST>" and "<VAST .../>" also work
            String newRootTag = "<VAST " + EXPECTED_NAMESPACE + vastRootTag.substring("<VAST".length());
            return xmlString.replace(vastRootTag, newRootTag);
        }
    }

    private String extractVastRootTag(String xmlString) throws ElementNotFoundException {
        Matcher matcher = VAST_ROOT_TAG_PATTERN.matcher(xmlString);
        if (!matcher.find()) {
            throw new ElementNotFoundException("root VAST tag not found in doc :\n" + xmlString);
        }
        return matcher.group();
    }
}
It works, but I can't say I'm happy with it. I hope there's another way, without pre-processing.
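One possible way to avoid touching the String at all is a SAX filter that reports a default namespace for every element that arrives without one. This is only a sketch under assumptions: the class name `DefaultNamespaceFilter` and the helper `reportedElements` are invented for illustration, and it has not been tested against the generated VAST classes. It uses JDK classes only and simply records what a downstream handler would see.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: elements without a namespace are reported as belonging to namespaceUri.
public class DefaultNamespaceFilter extends XMLFilterImpl {

    private final String namespaceUri;

    public DefaultNamespaceFilter(XMLReader parent, String namespaceUri) {
        super(parent);
        this.namespaceUri = namespaceUri;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // a namespace-aware parser reports "" for elements that are in no namespace
        super.startElement(uri.isEmpty() ? namespaceUri : uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri.isEmpty() ? namespaceUri : uri, localName, qName);
    }

    /** Demo helper: returns "{namespace}localName" for each element as seen after the filter. */
    public static List<String> reportedElements(String xml, String namespaceUri) {
        List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader filter =
                    new DefaultNamespaceFilter(spf.newSAXParser().getXMLReader(), namespaceUri);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    seen.add("{" + uri + "}" + localName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return seen;
    }
}
```

With JAXB, the filter would presumably be plugged in through a `javax.xml.transform.sax.SAXSource`, for example `u.unmarshal(new SAXSource(filter, new InputSource(new StringReader(xmlString))), VAST.class)`, so that the unmarshaller sees every element in the expected namespace whether or not the document declared it.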
Upvotes: 0
Reputation: 2838
This answer is clearly not optimized, but it should give you hints on how to make it work for both namespaced and un-namespaced XML, for version 4.2:
Here is the body of the parseXml method:
JAXBContext jc = JAXBContext.newInstance(VAST.class);
Unmarshaller u = jc.createUnmarshaller();
// should be optimized: the factory and the compiled stylesheet could be created once and reused
TransformerFactory tf = TransformerFactory.newInstance();
StringWriter sw = new StringWriter();
// load the stylesheet from the classpath; unlike new File(url.toURI()), this also works from inside a jar
try (InputStream xslt = VastParser.class.getClassLoader().getResourceAsStream("xslt/vast_4.2.xslt")) {
    Transformer t = tf.newTransformer(new StreamSource(xslt));
    // transform original XML with XSLT to always add the namespace in the parsing
    t.transform(new StreamSource(new StringReader(xmlString)), new StreamResult(sw));
}
// unmarshall transformed XML
JAXBElement<VAST> foo = u.unmarshal(new StreamSource(new StringReader(sw.toString())), VAST.class);
return new CustomVast(foo.getValue());
The src/main/resources/xslt/vast_4.2.xslt stylesheet is:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="@*|text()|comment()|processing-instruction()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- adds the xmlns part to the VAST element -->
    <xsl:template match="/VAST">
        <VAST xmlns="http://www.iab.com/VAST">
            <xsl:apply-templates select="@*|node()"/>
        </VAST>
    </xsl:template>
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
With that, both unit tests pass for the 4.2 part.
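As for the "should be optimized" comment, one common approach is to compile the stylesheet once into a `javax.xml.transform.Templates` object and create a cheap `Transformer` from it for each document. The sketch below is an assumption-laden illustration, not the code from the repository: the class name `VastNamespaceNormalizer` is invented, and the inline stylesheet is a variant that moves every element (not only the root) into the VAST namespace with `xsl:element`, which may be worth trying if nested elements still come back empty.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
public class VastNamespaceNormalizer {

    // Variant stylesheet (an assumption, not the file shown above):
    // every element is re-created with the same local name in the VAST namespace.
    private static final String XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"@*|text()|comment()|processing-instruction()\">"
          + "<xsl:copy/>"
          + "</xsl:template>"
          + "<xsl:template match=\"*\">"
          + "<xsl:element name=\"{local-name()}\" namespace=\"http://www.iab.com/VAST\">"
          + "<xsl:apply-templates select=\"@*|node()\"/>"
          + "</xsl:element>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Compiled once; Templates objects are safe to share between threads.
    private static final Templates TEMPLATES = compile();

    private static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSLT)));
        } catch (TransformerException e) {
            throw new IllegalStateException("invalid stylesheet", e);
        }
    }

    /** Returns the document with all elements placed in the VAST namespace. */
    public static String normalize(String xml) {
        try {
            // Transformer instances are not thread-safe, so a new one is created per call
            Transformer t = TEMPLATES.newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(sw));
            return sw.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("could not transform XML", e);
        }
    }
}
```

The result of `normalize(xmlString)` could then be unmarshalled exactly as in the method body above, through `new StreamSource(new StringReader(...))`.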
Upvotes: 1