plat.geo

Reputation: 35

CSV to XML transformation fails in Saxon when special characters are present

I am evaluating Andrew Welch's CSV to XML-Converter in XSLT 2.0:

It works for me so far if I convert a CSV with no special characters like this:

ID,    Title, Type
152733,Test1,Type1
152757,Test3,Type2
152759,Test4,Type2

But if I try to convert a CSV with a German "Umlaut" like this:

ID,    Title,Type
152733,Test1,Type1
152757,Test3,Type2
152759,Täst4,Type2

the output is "Cannot locate : test12.csv".

So it seems that fn:unparsed-text-available fails when the file contains non-ASCII characters. Any idea how to fix this?

Saxon version is Saxon-HE 9.7.0.1.

Upvotes: 2

Views: 264

Answers (1)

Tomalak

Reputation: 338316

Pass the file encoding to unparsed-text().

I'm making an educated guess(*) here:

<xsl:variable name="csv" select="unparsed-text($pathToCSV, 'Windows-1252')" />

(*) When no encoding is passed and none can be detected from the file, unparsed-text() assumes UTF-8. If reading the file fails, it most likely isn't valid UTF-8 but is in a legacy (i.e. single-byte) encoding. German umlauts suggest the file was created in a typical "Western Europe" configuration, where Windows-1252 or ISO-8859-1 is the default legacy encoding. The same applies to unparsed-text-available(), which also accepts the encoding as an optional second argument.
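Since the question tests the file with unparsed-text-available() before reading it, both calls probably need the same encoding. A minimal sketch, assuming the file really is Windows-1252; the parameter names, the "main" template, and the lines element are made up for illustration and are not taken from the original converter:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameters; adjust the path and the encoding to your file -->
  <xsl:param name="pathToCSV" select="'test12.csv'"/>
  <xsl:param name="encoding" select="'Windows-1252'"/>

  <xsl:template name="main">
    <xsl:choose>
      <!-- Pass the same encoding to the availability check and to the read -->
      <xsl:when test="unparsed-text-available($pathToCSV, $encoding)">
        <xsl:variable name="csv" select="unparsed-text($pathToCSV, $encoding)"/>
        <!-- Report the number of non-blank lines as a quick sanity check -->
        <lines count="{count(tokenize($csv, '\r?\n')[normalize-space()])}"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">
          <xsl:text>Cannot read </xsl:text>
          <xsl:value-of select="$pathToCSV"/>
          <xsl:text> as </xsl:text>
          <xsl:value-of select="$encoding"/>
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this can be started from the named template (-it:main). If the umlauts come out garbled instead of triggering the error branch, the file is in a different single-byte encoding and the encoding parameter needs to change.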

Upvotes: 3
