Reputation: 31
Is it possible for XSLT to preserve anchors and other embedded HTML tags within XML?
Background: I am trying to convert an HTML document into XML with an XSL stylesheet using XSLT. The original HTML document had content interspersed with anchor tags (e.g. Some hyperlinks here and there). I've copied that content into my XML, but the XSLT output lacks anchor tags.
Example XML:
<?xml version="1.0" ?>
<observations>
<observation><a href="http://jwz.org">Hyperlinks</a> disappear.</observation>
</observations>
Example XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/html">
<xsl:output method="html" indent="yes" encoding="UTF-8"/>
<xsl:template match="/observations">
<html>
<body>
<xsl:value-of select="observation"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Output:
<html xmlns="http://www.w3.org/1999/html">
<body>Hyperlinks disappear.</body>
</html>
I've read a few similar questions on Stack Overflow and checked out the identity transform page on Wikipedia. I started to get some interesting results using xsl:copy-of, but I don't understand enough about XSLT to get all of the words and tags embedded within each XML element to appear in the resulting HTML. Any help would be appreciated.
Upvotes: 3
Views: 1423
Reputation: 22617
Write a separate template that matches <a> elements and copies their attributes and content.
What is wrong with your approach? In your code,
<xsl:value-of select="observation"/>
simply sends the string value of the <observation> element to the output. The string value of an element is the concatenation of all the text nodes it contains, at any depth. You need not only those text nodes, but also the <a> elements themselves.
By default, an XSLT processor "skips" element nodes that no template matches: a built-in template processes their children without copying the element itself. So, if no template matches <a>, its tags are dropped and only its text content is output.
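For reference, the built-in rules mentioned above behave roughly as if the following templates were present in every XSLT 1.0 stylesheet. This is a sketch of the defaults written out explicitly, not something you need to add yourself:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root and element nodes: output nothing for the node itself,
       just continue with its children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: output their string value -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: output nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Applied on its own to the example XML, a stylesheet like this drops every tag and keeps only the text, which is why the anchor needs a template of its own.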
Stylesheet
Note: This stylesheet still relies on the default behaviour of the XSLT processor to some extent. The order of events will resemble the following:
The template with match="/observations" is matched. It adds <html> and <body> to the output. Then, a template rule must be found for the content of <observations>. A built-in template matches <observation>, outputs nothing for it, and looks for templates to process its content. For the <a> element, the corresponding template is matched, which copies the element and its attributes. Finally, a built-in template copies the text nodes inside <observation> and <a>.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/observations">
<html>
<body>
<xsl:apply-templates/>
</body>
</html>
</xsl:template>
<xsl:template match="a">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
HTML Output
<html>
<body><a href="http://jwz.org">Hyperlinks</a> disappear.
</body>
</html>
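If the observations may contain other embedded tags besides <a> (the question mentions "other embedded HTML tags"), one possible variant is to copy everything inside each <observation> verbatim with xsl:copy-of instead of writing one template per tag. The following is a minimal sketch under that assumption; wrapping each observation in a <p> element is my own choice for illustration and is not part of the original markup:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes" encoding="UTF-8"/>
  <!-- Drop whitespace-only text nodes directly inside observations -->
  <xsl:strip-space elements="observations"/>

  <xsl:template match="/observations">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Deep-copy all children of each observation: text nodes and
       embedded elements together with their attributes.
       The surrounding <p> is an assumption made for this example. -->
  <xsl:template match="observation">
    <p>
      <xsl:copy-of select="node()"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that xsl:copy-of copies the embedded markup as-is, so you cannot rename or filter individual tags along the way. If you need that, the template-per-element approach shown above gives you more control.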
Upvotes: 2