Reputation: 9888
While writing a stylesheet to convert old LoC transcriptions of books, which used a very outdated SGML DTD for formatting, I've hit a roadblock in the following situation:
In the converted XML files, there are some lines of text like the following:
<p> Text on left <hsep></hsep> Text on right </p>
hsep essentially pushes the remaining text to be right-justified. Unfortunately, I don't know of any way to convert this to HTML by just converting tags, as HTML has nothing like hsep short of dubious CSS hacks. I think it would be more useful to be able to convert this to something like:
<p> Text on left <span class="right">Text on right</span> </p>
However, I'm not sure how to do this, as it would require that, in the <p> element, I determine whether there's an <hsep> and then create a tag surrounding the remaining text based on it being there, while also applying templates to any elements that might be there. I don't think cases where I have something like
<p> Text a <em> Text b <hsep></hsep> Text c </em> </p>
are common or even present, so I don't think that will pose a problem, but there may be situations like:
<p> <em> Text a Text b <hsep></hsep> Text c </em> </p>
I can think of complicated, horrible ways of doing this involving regexes, but I'm hoping there's a non-horrible way.
Upvotes: 2
Views: 325
Reputation:
create a tag surrounding the remaining text based on it being there, while also applying templates to any elements that might be there
I think that, for better forward processing, you could use this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy the current node, process its attributes and first child,
       then move on to the next sibling. -->
  <xsl:template match="node()|@*" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()[1]|@*"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::node()[1]"/>
  </xsl:template>
  <!-- Replace hsep with a span; the rest of the sibling chain
       is emitted inside it. -->
  <xsl:template match="hsep">
    <span class="right">
      <xsl:apply-templates select="following-sibling::node()[1]"/>
    </span>
  </xsl:template>
</xsl:stylesheet>
With Dimitre's input:
<html>
<p> Text a <em> Text b <hsep></hsep> Text c </em> </p>
<p> <em> Text a Text b <hsep></hsep> Text c </em> </p>
</html>
Output:
<html>
<p> Text a <em> Text b <span class="right"> Text c </span></em></p>
<p><em> Text a Text b <span class="right"> Text c </span></em></p>
</html>
Note: Without a mode, you can declare a rule once for elements, whether they precede or follow hsep.
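To illustrate the note, here is a minimal sketch that extends the stylesheet above with one extra rule. The em-to-i mapping is a hypothetical example, not something from the question. The point is only that a single rule in the default mode covers elements on either side of hsep, provided the rule continues the sibling chain itself.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Sibling-by-sibling identity rule, as in the stylesheet above. -->
  <xsl:template match="node()|@*" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()[1]|@*"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::node()[1]"/>
  </xsl:template>
  <!-- hsep becomes a span that receives the rest of the sibling chain. -->
  <xsl:template match="hsep">
    <span class="right">
      <xsl:apply-templates select="following-sibling::node()[1]"/>
    </span>
  </xsl:template>
  <!-- Hypothetical conversion rule: em becomes i. It is declared once and
       applies both before and after hsep. Like the identity rule, it must
       hand over to the next sibling, or the chain stops here. -->
  <xsl:template match="em">
    <i>
      <xsl:apply-templates select="node()[1]|@*"/>
    </i>
    <xsl:apply-templates select="following-sibling::node()[1]"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to something like <p> Text a <em>b</em> <hsep></hsep> <em>c</em> </p>, this should turn both em elements into i, with the second one ending up inside the span.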
Upvotes: 2
Reputation: 243549
This transformation:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Identity rule: copy every node and attribute as-is. -->
  <xsl:template match="node()|@*" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Replace hsep with a span wrapping all of its following siblings. -->
  <xsl:template match="hsep">
    <span class="right">
      <xsl:apply-templates mode="copy"
           select="following-sibling::node()"/>
    </span>
  </xsl:template>
  <!-- Default mode: suppress those siblings so they are not
       output a second time after the span. -->
  <xsl:template match="node()[preceding-sibling::hsep]"/>
  <!-- "copy" mode (used only from the hsep template): copy them. -->
  <xsl:template mode="copy"
       match="node()[preceding-sibling::hsep]">
    <xsl:call-template name="identity"/>
  </xsl:template>
</xsl:stylesheet>
when applied to this document:
<html>
<p> Text a <em> Text b <hsep></hsep> Text c </em> </p>
<p> <em> Text a Text b <hsep></hsep> Text c </em> </p>
</html>
produces the wanted, correct result:
<html>
<p> Text a <em> Text b <span class="right"> Text c </span></em></p>
<p><em> Text a Text b <span class="right"> Text c </span></em></p>
</html>
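For comparison, if an XSLT 2.0 processor is available (an assumption; the transformation above only needs XSLT 1.0), the same wrapping can be sketched with xsl:for-each-group and group-starting-with. This is an alternative technique, not part of the transformation above. Note that if a parent contains several hsep elements, each one starts its own span here.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity rule: copy everything that has no more specific rule. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Any element with an hsep child: split its children into groups,
       each new group starting at an hsep. -->
  <xsl:template match="*[hsep]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each-group select="node()" group-starting-with="hsep">
        <xsl:choose>
          <!-- The group begins with hsep: drop the hsep itself and
               wrap the rest of the group in the span. -->
          <xsl:when test="self::hsep">
            <span class="right">
              <xsl:apply-templates select="current-group()[position() gt 1]"/>
            </span>
          </xsl:when>
          <!-- Leading group before the first hsep: process unchanged. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the children are processed in the default mode on both sides of hsep, any additional conversion rules need to be declared only once.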
Upvotes: 1