Reputation: 522
I have a simple XML workflow which I need to convert to HTML, but the XML contains literal bullet characters that I'd like to strip out. For example, each list item in the original XML begins with a bullet (•).
I want to delete those bullet characters and render the items as an HTML list using <ul> and <li> elements instead.
Here is the XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:template match="/">
<xsl:apply-templates select="RULEBOOK"/>
</xsl:template>
<xsl:template match="text()[starts-with(., '• ')]">
<xsl:value-of select="substring-after(., '• ')"/>
</xsl:template>
<xsl:template match="bullets">
<li><xsl:apply-templates/></li>
</xsl:template>
<xsl:template match="ul">
<ul><xsl:apply-templates/></ul>
</xsl:template>
</xsl:stylesheet>
Here is the XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<RULEBOOK>
<SECTION>
<Subsection>
<text>The game of golf should be played in the correct spirit and to understand this you should read the Etiquette Section in the Rules of Golf. In particular: </text>
<ul>
<bullets>• show consideration to other players </bullets>
<bullets>• play at a good pace and be ready to invite faster moving groups to play through, and </bullets>
<bullets>• take care of the course by smoothing bunkers, replacing divots and repairing ball marks on the greens. </bullets>
</ul>
<text>Before starting your round you are advised to: </text>
<ul>
<bullets>• read the Local Rules on the score card and the notice board </bullets>
<bullets>• put an identification mark on your ball; many golfers play the same brand of ball and if you can’t identify your ball, it is considered lost (Rules <a>12-2</a> and <a>27-1</a>) </bullets>
<bullets>• count your clubs; you are allowed a maximum of 14 clubs (Rule <a>4-4</a>).
</bullets>
</ul>
<text>During the round: </text>
<ul>
<bullets>• don’t ask for advice from anyone except your partner (i.e., a player on your side) or your caddies; don’t give advice to anyone except your partner; you may ask for information on the Rules, distances and the position of hazards, the flagstick, etc. (Rule <a>8-1</a>) </bullets>
<bullets>• don’t play any practice shots during play of a hole (Rule <a>7-2</a>) </bullets>
<bullets>• don’t use any artificial devices or unusual equipment, unless specifically authorized by Local Rule (Rule <a>14-3</a>). </bullets>
</ul>
<text>At the end of your round: </text>
<ul>
<bullets>• in match play, ensure the result of the match is posted </bullets>
<bullets>• in stroke play, ensure that your score card is completed properly (including being signed by you and your marker) and return it to the Committee as soon as possible (Rule <a>6-6</a>). </bullets>
</ul>
</Subsection>
</SECTION></RULEBOOK>
Upvotes: 1
Views: 865
Reputation: 23637
You can add this template:
<xsl:template match="text()[starts-with(., '• ')]">
<xsl:value-of select="substring-after(., '• ')"/>
</xsl:template>
which matches any text node that starts with a bullet followed by a space, and keeps only the substring after it.
@keshlam suggested a solution using translate(), which is better since it doesn't depend on the space after the bullet and doesn't fail if there are any characters before the bullet (but it will remove bullets anywhere in the text, not just at the start):
<xsl:template match="text()[contains(., '•')]">
<xsl:value-of select="normalize-space(translate(., '•',''))"/>
</xsl:template>
The normalize-space() function trims leading and trailing whitespace and collapses each internal run of spaces, tabs, or newlines into a single space.
This works in XSLT 1.0 processors such as Xalan or Saxon 6.
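One side effect worth noting: because normalize-space() also strips trailing whitespace, a text node that is followed by a child element (for example "(Rules " in front of an <a>) loses the space before that element. If that matters, a variant along the following lines might help. It is only a sketch, not one of the suggestions above: it matches text nodes whose first non-whitespace character is a bullet, removes just that first bullet, and leaves all other whitespace untouched.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <!-- Sketch: match text whose first non-whitespace character is a bullet -->
  <xsl:template match="text()[starts-with(normalize-space(.), '•')]">
    <!-- substring-after() drops everything up to and including the first bullet;
         trailing whitespace in the text node is kept as-is -->
    <xsl:value-of select="substring-after(., '•')"/>
  </xsl:template>
  <xsl:template match="bullets">
    <li><xsl:apply-templates/></li>
  </xsl:template>
  <xsl:template match="ul">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
</xsl:stylesheet>
```

The space that followed the bullet stays at the start of each <li>, which browsers normally ignore when rendering. Bullets that appear later in the same text node are left alone.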
UPDATE
Here is a full stylesheet (actually the same one you posted, with the last template above included):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:template match="/">
<xsl:apply-templates select="RULEBOOK"/>
</xsl:template>
<xsl:template match="text()[contains(., '•')]">
<xsl:value-of select="normalize-space(translate(., '•',''))"/>
</xsl:template>
<xsl:template match="bullets">
<li><xsl:apply-templates/></li>
</xsl:template>
<xsl:template match="ul">
<ul><xsl:apply-templates/></ul>
</xsl:template>
</xsl:stylesheet>
It works with your source when it is copied from the page and pasted into a new file. If it doesn't work with your original file, even though the contents look identical, your original file may be saved in a different encoding.
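If encoding is the suspect, one way to take the stylesheet's own encoding out of the picture is to write the bullet as a numeric character reference instead of a literal character. The sketch below assumes the character in your source is U+2022 (BULLET), which is &#8226; in decimal; if your file actually uses a different bullet-like character, the code point would need to change accordingly.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:apply-templates select="RULEBOOK"/>
  </xsl:template>
  <!-- &#8226; is U+2022 BULLET (assumed to be the character used in the source);
       the XML parser resolves the reference before the XPath is evaluated -->
  <xsl:template match="text()[contains(., '&#8226;')]">
    <xsl:value-of select="normalize-space(translate(., '&#8226;', ''))"/>
  </xsl:template>
  <xsl:template match="bullets">
    <li><xsl:apply-templates/></li>
  </xsl:template>
  <xsl:template match="ul">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
</xsl:stylesheet>
```

This only protects the stylesheet side. The source XML still has to be decoded correctly by the parser, so the encoding it declares must match how the file was actually saved.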
Upvotes: 1