Reputation: 2459
I'm using Saxon HE 9.5.1.8 to transform an XML to another XML file.
My problem is that the XML content written by the Serializer() class of Saxon prints out several additional indents that I don't want to have in there. I'm assuming that this is "wrong" because I got the expected output when using the DomDestination() class (but then the outer XML document information is missing) or other XSL transformers like the one that is shipped with Visual Studio / .NET Framework.
This is the input XML:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>$44.95</price>
<publish_date>2000-10-01</publish_date>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>$5.95</price>
<publish_date>2000-12-16</publish_date>
</book>
This is the XLST file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="book">
<book>
<xsl:copy-of select="@*|book/@*" />
<xsl:for-each select="*">
<xsl:attribute name="{name()}">
<xsl:value-of select="text()"/>
</xsl:attribute>
</xsl:for-each>
</book>
</xsl:template>
</xsl:stylesheet>
That is the expected output:
<?xml version="1.0" encoding="utf-8"?>
<catalog>
<book id="bk101" author="Gambardella, Matthew" title="XML Developer's Guide" genre="Computer" price="$44.95" publish_date="2000-10-01" />
<book id="bk102" author="Ralls, Kim" title="Midnight Rain" genre="Fantasy" price="$5.95" publish_date="2000-12-16" />
</catalog>
And that is the output when using Saxon:
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book id="bk101"
author="Gambardella, Matthew"
title="XML Developer's Guide"
genre="Computer"
price="$44.95"
publish_date="2000-10-01"/>
<book id="bk102"
author="Ralls, Kim"
title="Midnight Rain"
genre="Fantasy"
price="$5.95"
publish_date="2000-12-16"/>
</catalog>
Does anybody know how to suppress or modify this behavior of Saxon? That is the C# code that is used to call the Saxon API:
public Stream Transform(string xmlFilePath, string xsltFilePath)
{
var result = new MemoryStream();
var xslt = new FileInfo(xsltFilePath);
var input = new FileInfo(xmlFilePath);
var processor = new Processor();
var compiler = processor.NewXsltCompiler();
var executable = compiler.Compile(new Uri(xslt.FullName));
var destination = new Serializer();
destination.SetOutputStream(result);
using(var inputStream = input.OpenRead())
{
var transformer = executable.Load();
transformer.SetInputStream(inputStream, new Uri(input.DirectoryName));
transformer.Run(destination);
}
result.Position = 0;
return result;
}
Upvotes: 2
Views: 837
Reputation: 163488
Your goal of having multiple processors produce output in the same format is hopelessly misguided. That's especially so if you choose indented output: the spec leaves it entirely to implementations how to do indentation, saying only that the goal is to make it human-readable. (And placing constraints on where extra whitespace can be inserted.)
I'm sorry you don't find Saxon's way of wrapping long attribute lists pleasing, but it is entirely within the letter and the spirit of the specification. Without it, if you have an element with eight namespace declarations, you can easily get a line that is 400 characters long, which I certainly don't regard as human-readable.
There are many reasons that comparing two XML documents lexically is never going to work. For example, the attributes can be in a different order. There are two ways of comparing XML: convert the documents into canonical form using a "Canonical XML" processor, or compare them at the tree level for example by using the XPath 2.0 deep-equal() function. Ideally (especially if you want to know where the differences are, rather than just whether differences exist), use a specialist XML comparison tool such as DeltaXML.
For what it's worth, when we do unit testing, we first attempt a lexical comparison of the results. If that fails, we parse both documents and compare them using saxon:deep-equal(), which is a modified form of the deep-equal() function that gives fine control over the comparison rules, e.g. handling of whitespace and handling of namespaces.
Upvotes: 2
Reputation: 167696
Try setting http://saxonica.com/documentation9.5/extensions/output-extras/line-length.html to a very large value to avoid that attributes are put on a new line: <xsl:output xmlns:saxon="http://saxon.sf.net/" saxon:line-length="1000"/>
.
Upvotes: 3