ursitesion

Reputation: 988

XSLT transformation of ISO-8859-1

Can we run an XSLT transformation with ISO-8859-1 encoding instead of UTF-8?

I have no issues when using UTF-8. The code below works fine:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:this="http://this.com"
xmlns:wd="urn:com.workday.report/abcd_services" version="2.0">

    <xsl:output method="text" indent="yes" encoding="UTF-8"/>  

The code below, however, gives an error:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:this="http://this.com"
xmlns:wd="urn:com.workday.report/INT1204_GE_Capital_Fleet_Services" version="2.0">

    <xsl:output method="text" indent="yes" encoding="ISO-8859-1"/>    

Upvotes: 1

Views: 3901

Answers (1)

Abel

Reputation: 57159

Every processor I know of supports ISO-8859-1, US-ASCII (compulsory), CP1252 and usually many variants thereof, because from a processor's perspective these are just one-byte encodings which, apart from needing a translation table, are trivial to implement.

That leaves us with the error which you, unfortunately, haven't shown. So let's go over a few options:

<?xml version="1.0" encoding="ISO-8859-1"?>

You wrote this as the prolog of your stylesheet. While in itself not illegal, it serves no purpose and it won't have any effect on how the processor processes any input XML or output. However, it does severely restrict the characters you are allowed to use.

Suppose you saved your original stylesheet as UTF-8 with a BOM and then, using some non-XML-aware editor, changed it to ISO-8859-1. The result is illegal and you will receive something like: "F [Xerces] The processing instruction target matching "[xX][mM][lL]" is not allowed.", or "Content not allowed before prolog".
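A minimal Python sketch of why a leftover BOM can trigger such errors, assuming the parser takes the ISO-8859-1 declaration at face value and decodes the file byte by byte (real parsers may detect the BOM first and report a different error). The `prolog` string is only an illustrative stand-in for the stylesheet's first line:

```python
import codecs

# Illustrative stand-in for the first line of the stylesheet.
prolog = '<?xml version="1.0" encoding="ISO-8859-1"?>'

# An editor that saves as "UTF-8 with BOM" prepends the bytes EF BB BF.
saved_bytes = codecs.BOM_UTF8 + prolog.encode("utf-8")

# Decoding those bytes as ISO-8859-1 turns the BOM into three characters.
as_latin1 = saved_bytes.decode("iso-8859-1")

print(repr(as_latin1[:3]))            # the stray characters before the prolog
print(as_latin1.startswith("<?xml"))  # False: the declaration is no longer first
```

Because an XML declaration must be the very first thing in the file, anything in front of it makes the document ill-formed.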

As a general rule, just leave your stylesheet in the best encoding available for your task, which will typically be UTF-8, because any tool using XML is required to be able to process that, and since this is a stylesheet, any XSLT processor will be able to process that.

<xsl:output method="text" indent="yes" encoding="ISO-8859-1"/> 

This you wrote in the stylesheet itself. If the method were set to xml or html, it would hardly ever cause an error, because any character the encoding cannot represent would be escaped as a numeric character reference: let's say you have "ٺٻټٽ", it would become &#x067A;&#x067B;&#x067C;&#x067D; (or the decimal equivalent), because these characters are not available in ISO-8859-1.
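The same fallback can be imitated outside XSLT. The following Python sketch is only an analogy for what an xml or html serializer does, not the processor's actual code; it uses the standard `xmlcharrefreplace` error handler, which emits decimal references:

```python
# The four Arabic letters used as the example (U+067A to U+067D).
text = "\u067a\u067b\u067c\u067d"

# Characters missing from ISO-8859-1 are replaced by numeric character references.
escaped = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(escaped.decode("ascii"))  # &#1658;&#1659;&#1660;&#1661;
```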

But you set the method to text. Nothing wrong with that per se, but the characters available to that encoding are severely limited. Suppose you have your current ISO-8859-1 stylesheet correctly encoded (i.e., the stylesheet compiles), but you have something like this:

<!-- not allowed with your text output -->
<xsl:value-of select="'&#x100;&#x101;'" />

which is the equivalent of this:

<!-- won't compile -->
<xsl:value-of select="'Āā'" />

Now in the first case, this will throw an error. For instance, my own processor Exselt will throw:

Serialization Exception: A character 'Ā' cannot be represented in the used encoding in a context where character references are not allowed.

And Saxon will throw:

Output character not available in this encoding (decimal 256)
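To find such characters before running the transformation, you could check your data with a small script. This is a rough Python sketch; `unencodable` is a hypothetical helper name, and it only mirrors the strict behaviour of the text method, where no character references are available as a fallback:

```python
def unencodable(text, encoding="iso-8859-1"):
    """Return (character, decimal code point) pairs that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)  # strict mode: raises if the character is missing
        except UnicodeEncodeError:
            bad.append((ch, ord(ch)))
    return bad

# U+00E9 (e with acute) exists in ISO-8859-1; U+0100 and U+0101 do not.
print(unencodable("caf\u00e9 \u0100\u0101"))  # [('Ā', 256), ('ā', 257)]
```

The decimal value 256 is the same code point that the Saxon message reports.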

My guess is that one of the above scenarios applies to you. If you explicitly want to use a lesser encoding, then make sure that you are not doing anything illegal. If this doesn't help, please update your question (which I recommend you do anyway) with the exact error, the processor you use and how to reproduce it.

Upvotes: 2
