Reputation: 1347
I have to transform XML to XHTML, but the XML declares a default namespace, xmlns='http://www.lotus.com/dxl', which is never used anywhere in the XML as far as I can tell, so the parser won't parse anything.
Is there a way to ignore namespaces? I am using the Oracle Java transformer (javax.xml.transform.Transformer and javax.xml.transform.TransformerFactory).
Or are there any better libraries?
Upvotes: 4
Views: 7273
Reputation: 31
You can't ignore namespaces easily, and it won't be pretty, but it is possible. Of course, tricking the right part inside the Transformer implementation into just outputting the prefixes without getting flustered is implementation dependent!
OK then, this works for me going from a Node to a StringWriter:
// Note: TransletOutputHandlerFactory, SerializationHandler, SerializerBase and
// NamespaceMappings are JDK-internal classes (com.sun.org.apache.* packages),
// so this only works with the Xalan-based transformer bundled with the JDK.
public static String nodeToString(Node node) throws TransformerException {
    StringWriter results = new StringWriter();
    Transformer transformer = createTransformer();
    transformer.transform(new DOMSource(node), new StreamResult(results) {
        @Override
        public Writer getWriter() {
            // Swap the transformer's output handler factory via reflection
            // just before the transformer asks for the writer.
            Field field = findFirstAssignable(transformer.getClass());
            try {
                field.setAccessible(true);
                field.set(transformer, new TransletOutputHandlerFactory(false) {
                    @Override
                    public SerializationHandler getSerializationHandler() throws
                            IOException, ParserConfigurationException {
                        SerializationHandler handler = super.getSerializationHandler();
                        SerializerBase base = (SerializerBase) handler.asDOMSerializer();
                        // Map every prefix to a "namespace" named after the prefix itself.
                        base.setNamespaceMappings(new NamespaceMappings() {
                            @Override
                            public String lookupNamespace(String prefix) {
                                return prefix;
                            }
                        });
                        return handler;
                    }
                });
            } catch (IllegalAccessException e) {
                throw new AssertionError("Must not happen", e);
            }
            return super.getWriter();
        }
    });
    return results.toString();
}

// Walks the class hierarchy and returns the first declared field that can
// hold a TransletOutputHandlerFactory.
private static <E> Field findFirstAssignable(Class<E> clazz) {
    return Stream.<Class<? super E>>iterate(clazz, Convert::iteration)
            .flatMap(Convert::classToFields)
            .filter(Convert::canAssign).findFirst().get();
}

private static <E> Class<? super E> iteration(Class<? super E> c) {
    return c == null ? null : c.getSuperclass();
}

private static boolean canAssign(Field f) {
    return f == null ||
            f.getType().isAssignableFrom(TransletOutputHandlerFactory.class);
}

private static <E> Stream<Field> classToFields(Class<? super E> c) {
    return c == null ? Stream.of((Field) null) :
            Arrays.stream(c.getDeclaredFields());
}
What this is doing is pretty much just customizing the mapping of namespaces to prefixes. Every prefix is mapped to a namespace identified by its prefix, so there shouldn't even be any conflicts. The rest of it is fighting the API.
To make the example complete, here are the methods converting to and from the XML as well:
public static Transformer createTransformer()
        throws TransformerFactoryConfigurationError,
               TransformerConfigurationException {
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.setOutputProperty(OutputKeys.INDENT, "no");
    return transformer;
}

public static ArrayList<Node> parseNodes(String uri, String expression)
        throws ParserConfigurationException, SAXException,
               IOException, XPathExpressionException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(false);
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(uri);
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile(expression);
    NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    ArrayList<Node> nodes = new ArrayList<>();
    for (int i = 0; i < nl.getLength(); i++) {
        nodes.add(nl.item(i));
    }
    return nodes;
}
Upvotes: 3
Reputation: 163262
No, you can't ignore namespaces.
If the namespace declaration xmlns='http://www.lotus.com/dxl' appears in the outermost element, then you can't say it "isn't used anywhere" - on the contrary, it is used everywhere! It effectively changes every element name in the document to a different name. There's no way you can ignore that.
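To see this effect concretely, here is a minimal sketch using only the standard JAXP DOM and XPath APIs. The sample document and its element names (document, item) are made up for illustration and are not taken from the question; the class and method names are likewise hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultNamespaceDemo {
    // Hypothetical sample input: the default namespace is declared once, on the root.
    static final String XML =
            "<document xmlns='http://www.lotus.com/dxl'><item>hello</item></document>";

    // Parse the string into a namespace-aware DOM.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Count the nodes selected by an XPath 1.0 expression.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // The child inherits the default namespace even though it declares nothing itself.
        System.out.println(doc.getDocumentElement().getFirstChild().getNamespaceURI());
        // An unprefixed name in XPath 1.0 means "no namespace", so this selects nothing.
        System.out.println(count(doc, "//item"));
        // Matching on the local name alone does find the element.
        System.out.println(count(doc, "//*[local-name()='item']"));
    }
}
```

The same thing happens to unprefixed names in XSLT 1.0 match patterns, which is why a stylesheet written without the namespace appears to match nothing.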
If you were using XSLT 2.0, then you would be able to say in your stylesheet xpath-default-namespace="http://www.lotus.com/dxl", which would pretty much do what you want: it says that any unprefixed name in a match pattern or XPath expression should be interpreted as referring to a name in namespace http://www.lotus.com/dxl. Sadly, you've chosen an XSLT processor that doesn't implement XSLT 2.0. So you'll have to do it the hard way (which is described in about 10,000 posts that you will find by searching for "XSLT default namespace").
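For reference, the usual XSLT 1.0 approach is to bind a prefix of your own choosing to the namespace in the stylesheet and use that prefix in every match pattern and XPath expression. The sketch below runs such a stylesheet through the standard javax.xml.transform API. The prefix dxl, the element names document and item, and the class and method names are assumptions made for illustration; a real stylesheet would use the element names of the actual input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DxlPrefixStylesheetDemo {
    // XSLT 1.0 stylesheet: the prefix "dxl" is bound to the source namespace,
    // and exclude-result-prefixes keeps that declaration out of the output.
    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:dxl='http://www.lotus.com/dxl'"
            + " exclude-result-prefixes='dxl'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='/dxl:document'>"
            + "<html><body><xsl:apply-templates select='dxl:item'/></body></html>"
            + "</xsl:template>"
            + "<xsl:template match='dxl:item'>"
            + "<p><xsl:value-of select='.'/></p>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Apply the stylesheet to an XML string and return the serialized result.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input using the default namespace from the question.
        String xml = "<document xmlns='http://www.lotus.com/dxl'><item>hello</item></document>";
        System.out.println(transform(xml));
    }
}
```

The prefix in the stylesheet does not have to match anything in the source document; only the namespace URI has to be the same.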
Upvotes: 4