JimS

Reputation: 1143

Using Saxon and XSLT to transform JDOM XML documents

I'm trying to convert some XML so that ISO 8879 entity strings appear in place of the corresponding characters. For example, the string 1234-5678 would become 1234&hyphen;5678. I've done this using character maps and the stylesheets found at http://www.w3.org/2003/entities/iso8879doc/overview.html.
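
To make the intended replacement concrete, here is a minimal Java sketch of what a character map does conceptually when output is written: each character is looked up and, if a mapping exists, the mapped string is emitted in its place. The class name `CharacterMapSketch`, the method `applyMap`, and the single U+2010 to `&hyphen;` entry are assumptions for illustration only; the real mapping lives in the imported iso8879map.xsl and is applied by the XSLT processor, not by code like this.

```java
import java.util.HashMap;
import java.util.Map;

public class CharacterMapSketch {
    // Hypothetical one-entry map; the real ISO 8879 maps contain many more entries.
    static final Map<Character, String> MAP = new HashMap<>();
    static {
        MAP.put('\u2010', "&hyphen;"); // U+2010 HYPHEN -> entity string (assumed entry)
    }

    // Replace every mapped character with its string; copy all other characters unchanged.
    static String applyMap(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = MAP.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(applyMap("1234\u20105678"));
    }
}
```

Note that the replacement string is written as-is, without escaping the ampersand, which is what distinguishes a character map from ordinary text output.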

The first part of my xslt looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:import href="iso8879map.xsl"/>  
    <xsl:output omit-xml-declaration = "yes" use-character-maps="iso8879"/>

When I run this stylesheet in Eclipse with the Saxon XSLT engine, it works fine and outputs an XML file with the hyphen entity string in place of the hyphen character. However, I need to automate this process, so I am using the JDOM package. Unfortunately, the characters are not being replaced during the transformation. The code that does the conversion looks a little like this:

// Use Saxon for XSLT 2.0 support
System.setProperty("javax.xml.transform.TransformerFactory",
    "net.sf.saxon.TransformerFactoryImpl");

SAXBuilder builder = new SAXBuilder();
builder.setExpandEntities(false);
XSLTransformer transformer = new XSLTransformer(styleSheet);

Document toTransform = builder.build(Fileref);              // parse the input
Document transformed = transformer.transform(toTransform);  // transform

I then write the document out to a file using the following method:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.jdom.Document;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public static void writeXMLDoc(File xmlDoc, Document jdomDoc) {
    Format format = Format.getPrettyFormat();
    format.setOmitDeclaration(true);
    format.setEncoding("ISO-8859-1");
    XMLOutputter outputter = new XMLOutputter(format);

    // Write through an OutputStream so the bytes really are ISO-8859-1;
    // a FileWriter would use the platform default encoding instead.
    try (OutputStream out = new FileOutputStream(xmlDoc)) {
        outputter.output(jdomDoc, out);
    } catch (IOException exp) {
        exp.printStackTrace();
    }
}

I've started debugging in Eclipse, and it looks like the hyphen character isn't being replaced during the XSLT transformation. I've tested this using the Saxon XSLT engine on its own and it does work, so the problem is likely something to do with using it from Java and JDOM. Can anybody help?

Many thanks.

Jim

Upvotes: 0

Views: 5733

Answers (1)

JimS

Reputation: 1143

The problem turned out to be that I wasn't using the JDOM wrapper class provided by Saxon. For reference, here's the working code, which transforms a JDOM document and returns the result as a new JDOM document:

File styleSheet = new File("filePath");

// Get a TransformerFactory (Saxon, for XSLT 2.0 support)
System.setProperty("javax.xml.transform.TransformerFactory",
                   "com.saxonica.config.ProfessionalTransformerFactory");
TransformerFactory tfactory = TransformerFactory.newInstance();
ProfessionalConfiguration config = (ProfessionalConfiguration)((TransformerFactoryImpl)tfactory).getConfiguration();

// Get a SAXBuilder 
SAXBuilder builder = new SAXBuilder(); 

//Build JDOM Document
Document toTransform = builder.build(inputFileHandle); 

//Give it a Saxon wrapper
DocumentWrapper docw = new DocumentWrapper(toTransform, inputFileHandle.getAbsolutePath(), config);

// Compile the stylesheet
Templates templates = tfactory.newTemplates(new StreamSource(styleSheet));
Transformer transformer = templates.newTransformer();

// Now do a transformation, serializing the result to an in-memory byte stream
ByteArrayOutputStream outStream = new ByteArrayOutputStream(1024);
transformer.transform(docw, new StreamResult(outStream));

// Parse the serialized result back into a new JDOM document
ByteArrayInputStream inStream = new ByteArrayInputStream(outStream.toByteArray());
Document transformed = builder.build(inStream);
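
A possible explanation for why this version behaves differently, offered as an assumption rather than something verified above: character maps, like the other xsl:output settings, take effect only when the result tree is serialized. The code above serializes to a StreamResult, whereas JDOM's XSLTransformer hands back a result tree without serializing it. The sketch below illustrates that general principle using only the JDK's built-in XSLT 1.0 processor. That processor has no character maps, so the `encoding` attribute of xsl:output serves as a stand-in for another serialization-time setting; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class SerializationDemo {
    // Identity stylesheet whose xsl:output asks for ASCII output and no XML declaration.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' encoding='US-ASCII'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Input containing U+2010 HYPHEN, which cannot be written directly in US-ASCII.
    static final String INPUT = "<a>1234\u20105678</a>";

    static Transformer newTransformer() throws Exception {
        return TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
    }

    // Serialized output: the xsl:output settings are applied by the serializer.
    static String transformToStream() throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        newTransformer().transform(
            new StreamSource(new StringReader(INPUT)), new StreamResult(out));
        return out.toString("US-ASCII");
    }

    // Tree output: no serializer runs, so the text node keeps the raw character.
    static String transformToTree() throws Exception {
        DOMResult result = new DOMResult();
        newTransformer().transform(
            new StreamSource(new StringReader(INPUT)), result);
        Document doc = (Document) result.getNode();
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stream: " + transformToStream());
        System.out.println("tree:   " + transformToTree());
    }
}
```

In the streamed result the hyphen should no longer appear as a raw character, while the tree result still contains U+2010 untouched. If the same holds for `use-character-maps` in Saxon, then any route that ends in a serialized stream would apply the map, and any route that ends in an in-memory tree would not.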

Upvotes: 2
