Drazen Grabovac

Reputation: 153

XSLT transformation loses special characters

I'm doing XSLT transformations and something is wrong with the encoding: I'm losing Croatian special characters after the transformation. I'm using javax.xml.transform.Transformer and setting the encoding like this:

transformer.setOutputProperty( OutputKeys.ENCODING, "UTF-8");

We are using WebSphere 8, and the following JVM arguments are defined:

-Dclient.encoding.override=UTF-8
-Dfile.encoding=UTF-8

Also, the transformation is defined as follows:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:msg="http://b25/ics/ed/CC305A" xmlns:ct="http://b25/ics/complexTypes">
<xsl:output encoding="UTF-8" indent="yes" method="xml" />
...

How can I solve this problem?

Upvotes: 0

Views: 902

Answers (1)

Michael Kay

Reputation: 163262

The loss (or miscoding) is happening either before the data gets into the XSLT engine or after it leaves it. (Character encoding problems almost invariably arise on the boundaries between software products, when the supplier of the data thinks it is in one encoding and the receiver believes it is in a different one.) So the first step in resolving the problem is to find out which is the case. It's easy enough to find out precisely what's in the input: use something like <xsl:comment><xsl:value-of select="string-to-codepoints(.)"/></xsl:comment>, which will tell you the integer Unicode codepoints you have supplied to the transformation (note that string-to-codepoints() needs an XSLT 2.0 processor). To find out precisely what's in the output, you need to look at the serialized output of the XSLT engine in a hex editor.
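If it is easier to check both boundaries from the Java side, something along these lines might help. It is only a sketch under assumptions: the class name EncodingCheck and its helper methods are made up for illustration, the transformer is an identity transform rather than your stylesheet, and the sample input is a hard-coded string of Croatian letters. The point is that the source and result are passed as byte streams, so no Reader or Writer can silently apply a different encoding, and that the input code points and output bytes are printed so you can compare them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original question or answer.
final class EncodingCheck {

    /** Lists the Unicode code points of a string, e.g. "U+010D U+0107". */
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    /** Formats raw bytes as space-separated hex, like a hex editor would show them. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    /**
     * Runs an identity transformation (assumption: stands in for the real
     * stylesheet) from a byte stream to a byte stream, so the XML parser and
     * the serializer are the only components that deal with the encoding.
     */
    static byte[] transform(byte[] xmlBytes) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new StreamSource(new ByteArrayInputStream(xmlBytes)),
                        new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Croatian letters written as Unicode escapes so that the encoding of
        // this source file cannot affect the test data.
        String text = "\u010D\u0107\u017E\u0161\u0111";
        byte[] input = ("<a>" + text + "</a>").getBytes(StandardCharsets.UTF_8);

        System.out.println("input code points: " + codePoints(text));
        System.out.println("output bytes:      " + hex(transform(input)));
    }
}
```

In correct UTF-8 output each of these letters is two bytes (for example U+010D is C4 8D). If the dump shows a single 3F (a question mark) or four bytes per letter instead, the damage is happening on the output side or through double encoding; if the code points are already wrong, look at how the input is read, for instance a Reader or String.getBytes() call that relies on the platform default encoding.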

Upvotes: 1
