Pali

Reputation: 1347

Java Transformer how to ignore namespaces

I have to transform XML to XHTML, but the XML defines a namespace, xmlns='http://www.lotus.com/dxl', which is never used anywhere in the XML, so the parser won't parse anything ...

Is there a way to ignore namespaces? I am using the Oracle Java transformer (javax.xml.transform.Transformer and javax.xml.transform.TransformerFactory).

Or are there any better libraries?

Upvotes: 4

Views: 7273

Answers (2)

Timo Kinnunen

Reputation: 31

You can't ignore namespaces easily, and it won't be pretty, but it is possible. Of course, tricking the right part inside the Transformer implementation into just outputting the prefixes without getting flustered is implementation-dependent!

OK then, this works for me going from a Node to a StringWriter:

public static String nodeToString(Node node) throws TransformerException {
  StringWriter results = new StringWriter();
  Transformer transformer = createTransformer();
  transformer.transform(new DOMSource(node), new StreamResult(results) {
    @Override 
    public Writer getWriter() {
      Field field = findFirstAssignable(transformer.getClass());
      try {
        field.setAccessible(true);
        field.set(transformer, new TransletOutputHandlerFactory(false) {
          @Override 
          public SerializationHandler getSerializationHandler() throws 
            IOException, ParserConfigurationException {

            SerializationHandler handler = super.getSerializationHandler();
            SerializerBase base = (SerializerBase) handler.asDOMSerializer();
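            // Resolve every prefix to itself, so the serializer treats
            // any prefix as if it were bound to a namespace.
            base.setNamespaceMappings(new NamespaceMappings() {
              @Override
              public String lookupNamespace(String prefix) {
                return prefix;
              }
            });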
            return handler;
          }
        });
      } catch(IllegalAccessException e) {
        throw new AssertionError("Must not happen", e);
      }
      return super.getWriter();
    }
  });
  return results.toString();
}
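// Walks the class hierarchy and returns the first declared field that can
// hold a TransletOutputHandlerFactory. Convert is the enclosing class.
private static <E> Field findFirstAssignable(Class<E> clazz) {
  return Stream.<Class<? super E>>iterate(clazz, Convert::iteration)
    .flatMap(Convert::classToFields)
    .filter(Convert::canAssign).findFirst().get();
}
// Next step of the walk: the superclass, or null once the top is passed.
private static <E> Class<? super E> iteration(Class<? super E> c) {
  return c == null ? null : c.getSuperclass();
}
// A null field also passes the filter; it acts as a sentinel that stops the
// otherwise infinite stream (findFirst() then fails with a NullPointerException).
private static boolean canAssign(Field f) {
  return f == null || 
    f.getType().isAssignableFrom(TransletOutputHandlerFactory.class);
}
// Declared fields of c, or a single null sentinel when c is null.
private static <E> Stream<Field> classToFields(Class<? super E> c) {
  return c == null ? Stream.of((Field) null) : 
    Arrays.stream(c.getDeclaredFields());
}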

What this does is pretty much just customize the mapping between prefixes and namespaces. Every prefix is mapped to a namespace identified by that same prefix, so there shouldn't even be any conflicts. The rest of it is fighting the API.

To make the example complete, here are the methods converting to and from the XML as well:

public static Transformer createTransformer() 
  throws TransformerFactoryConfigurationError, 
    TransformerConfigurationException {

  TransformerFactory factory = TransformerFactory.newInstance();
  Transformer transformer = factory.newTransformer();
  transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
  transformer.setOutputProperty(OutputKeys.INDENT, "no");
  return transformer;
}
public static ArrayList<Node> parseNodes(String uri, String expression)
  throws ParserConfigurationException, SAXException, 
    IOException, XPathExpressionException {

  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  factory.setNamespaceAware(false);
  DocumentBuilder builder = factory.newDocumentBuilder();
  Document doc = builder.parse(uri);
  XPathFactory xPathfactory = XPathFactory.newInstance();
  XPath xpath = xPathfactory.newXPath();
  XPathExpression expr = xpath.compile(expression);
  NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
  ArrayList<Node> nodes = new ArrayList<>();
  for(int i = 0; i < nl.getLength(); i++) {
    nodes.add(nl.item(i));
  }
  return nodes;
}
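
To see what the setNamespaceAware(false) call in parseNodes buys you, here is a minimal, self-contained sketch. The class name NamespaceUnawareXPathDemo and the sample document (database/note/item) are made up for illustration; only the namespace URI comes from the question. It counts the matches of an unprefixed XPath expression against the same XML, parsed once without and once with namespace awareness.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceUnawareXPathDemo {
    // Hypothetical sample input; only the namespace URI is taken from the question.
    static final String XML =
        "<database xmlns='http://www.lotus.com/dxl'>"
      + "<note><item>hello</item></note>"
      + "</database>";

    /** Parses XML with the given namespace setting and counts the nodes the expression selects. */
    static int count(String expression, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Namespace-unaware parse: the unprefixed name is compared with the plain tag name.
        System.out.println("unaware: " + count("//item", false));
        // Namespace-aware parse: item lives in the dxl namespace, so //item selects nothing.
        System.out.println("aware:   " + count("//item", true));
    }
}
```

With the JDK's built-in parser and XPath engine, the unaware parse should find the item element while the aware parse finds none. Other implementations may treat namespace-unaware DOM trees differently, so treat this as an observation about the default JDK stack rather than a guarantee.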

Upvotes: 3

Michael Kay

Reputation: 163262

No, you can't ignore namespaces.

If the namespace declaration xmlns='http://www.lotus.com/dxl' appears in the outermost element, then you can't say it "isn't used anywhere" - on the contrary, it is used everywhere! It effectively changes every element name in the document to a different name. There's no way you can ignore that.
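
A small sketch that makes this visible, assuming a made-up sample document (the element names database, note and item and the class name DefaultNamespaceDemo are hypothetical; only the namespace URI is from the question). It parses the XML with a namespace-aware DOM parser and reports the namespace of a nested element that carries no xmlns attribute of its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DefaultNamespaceDemo {
    // Hypothetical sample input; the declaration appears only on the outermost element.
    static final String XML =
        "<database xmlns='http://www.lotus.com/dxl'>"
      + "<note><item>hello</item></note>"
      + "</database>";

    /** Returns the namespace URI of the first element with the given local name. */
    static String namespaceOf(String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            // "*" matches any namespace, so the lookup itself does not assume one.
            Element element =
                (Element) doc.getElementsByTagNameNS("*", localName).item(0);
            return element.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // item has no xmlns attribute of its own, yet it inherits the default namespace.
        System.out.println(namespaceOf("item"));
    }
}
```

The nested item element reports http://www.lotus.com/dxl as its namespace, which is why an unprefixed item in a stylesheet pattern does not match it.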

If you were using XSLT 2.0, then you would be able to say in your stylesheet xpath-default-namespace="http://www.lotus.com/dxl" which would pretty much do what you want: it says that any unprefixed name in a match pattern or XPath expression should be interpreted as referring to a name in namespace http://www.lotus.com/dxl. Sadly, you've chosen an XSLT processor that doesn't implement XSLT 2.0. So you'll have to do it the hard way (which is described in about 10,000 posts that you will find by searching for "XSLT default namespace").
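
For reference, the XSLT 1.0 route usually amounts to binding a prefix to the namespace in the stylesheet and using that prefix in every match pattern and XPath expression. Below is a minimal sketch using the standard javax.xml.transform API; the sample document, the element names database/note/item, the prefix dxl and the class name DxlPrefixDemo are assumptions made for illustration, not taken from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class DxlPrefixDemo {
    // Hypothetical sample input; only the namespace URI is taken from the question.
    static final String XML =
        "<database xmlns='http://www.lotus.com/dxl'>"
      + "<note><item>hello</item></note>"
      + "</database>";

    // The prefix dxl is arbitrary; what matters is that it is bound to the
    // same URI as the default namespace of the source document.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:dxl='http://www.lotus.com/dxl'"
      + " exclude-result-prefixes='dxl'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/dxl:database'>"
      + "<ul><xsl:apply-templates select='dxl:note/dxl:item'/></ul>"
      + "</xsl:template>"
      + "<xsl:template match='dxl:item'>"
      + "<li><xsl:value-of select='.'/></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the serialized result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSLT));
    }
}
```

The exclude-result-prefixes attribute keeps the dxl declaration out of the output. For real XHTML output you would presumably also put the XHTML namespace on the literal result elements, which this sketch leaves out.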

Upvotes: 4
