Reputation: 513
I have an HTML file which I transform with XSL into another HTML file (just tweaking its structure so it looks good in most email clients).
The HTML is received from another system and I can't modify how it is generated.
My problem is with tags containing &nbsp; inside. The XSL transformation is ignoring it.
HTML input:
<span style="font-family: 'HelveticaNeue LT 45 Lt', serif; font-size: 12px; color:#000000">
IMPORTANT: The loan is repayable by 10 payments.&nbsp;The first Direct Debit payment will be collected along with other payments that are already due&nbsp;on...
</span>
HTML output:
<p class="bodytext" align="justify" style="font-size:14px; font-weight:200; font-align:justify;">
IMPORTANT: The loan is repayable by 10 payments.The first Direct Debit payment will be collectedalong with any other payments that are already dueon...
</p>
Both spaces are missing and the text is concatenated: payments.The and dueon.
The XSL is a bit complex; here is an excerpt:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:character-map name="escape">
<xsl:output-character character="€" string="&amp;euro;"/>
<xsl:output-character character="&#160;" string="&amp;nbsp;"/>
</xsl:character-map>
<xsl:output method="html" indent="yes" use-character-maps="escape"/>
<xsl:template match="body">
<html>
<head>
<meta name="generator" content="HTML EMail optimization by" />
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
...
...
<xsl:template name="TextTemplate">
<xsl:if test="not(starts-with(.,'XSLTButton'))">
<xsl:value-of select="text()"/>
</xsl:if>
</xsl:template>
I'm using Saxon 9.1.0.8.
After some googling I've tried xsl:character-map, xsl:preserve-space, and changing the encoding, but nothing worked.
The only thing that worked was adding an [<!ENTITY nbsp "&#160;">] declaration to the DOCTYPE in the INPUT HTML, but I don't want to create an additional step in the process just to add this bit.
Please help. What should I add so that XSL/Saxon stops ignoring &nbsp;?
Upvotes: 1
Views: 677
Reputation: 25034
DTD-aware XML parsers require that entities to which the document refers be declared. XSLT processing requires that entity references be expanded, so conforming XSLT processors normally use conforming DTD-aware XML parsers for their front end. If you continue feeding the processor input which uses an undeclared entity, you are going to continue to get unsatisfactory results.
If the input already has a document type declaration with a reference to an appropriate DTD, then you should try using a DTD-aware parser. If not, you can inject such a document type declaration, or you can run the HTML through Tidy or some similar processor which assumes the HTML DTD and expands all entity references.
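As a rough illustration of the "inject a document type declaration" option, the input might look like the sketch below once an internal DTD subset has been added. The html root element and the body wrapper are assumptions (the question shows only the span), and the entity is declared with a numeric character reference so the declaration itself does not depend on the file's encoding.

```xml
<!-- Assumed wrapper: the question shows only the span, not the full document. -->
<!DOCTYPE html [
  <!-- Declares nbsp as U+00A0 so the parser can expand the reference. -->
  <!ENTITY nbsp "&#160;">
]>
<html>
  <body>
    <span style="font-family: 'HelveticaNeue LT 45 Lt', serif; font-size: 12px; color:#000000">
      IMPORTANT: The loan is repayable by 10 payments.&nbsp;The first Direct Debit payment will be collected along with other payments that are already due&nbsp;on...
    </span>
  </body>
</html>
```

With the entity declared, the parser should hand the XSLT processor a U+00A0 character in place of each reference, which a character map in the stylesheet can then serialize back as &nbsp; if the output needs the named form.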
Upvotes: 2