Reputation: 25
Got stuck, need advice please. Not an expert in XSLT I'm afraid.
I have this sample XML:
<?xml version="1.0" encoding="UTF-8"?>
<a>
<b key="x"/>
<b key="y"> text></b>
<b key="z"><p>This contains HTML Code.<br/><br/>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. <br/><br/>At vero eos et accusam et justo duo dolores et ea rebum. <br/>///<br/>///Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.///<br/> Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. <br/>At vero eos et accusam et justo duo dolores et ea rebum. <br/>Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.</p></b>
</a>
I need to output the contents of "/a[1]/b[@key='z']" and transform all <p></p> and all <br/> blocks into UNIX line breaks. At the same time, the output must contain no /, |, \ or &#x9; (horizontal tab) characters: all /, | and \ characters shall be replaced by hyphens (-), and all horizontal tab characters shall be removed altogether.
This is my current XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<xsl:variable name="foo" select="/a[1]/b[@key='z']" />
<xsl:apply-templates select="$foo" />
</xsl:template>
<xsl:template match="/a[1]/b[@key='z']">
<xsl:value-of select="translate(.,'/\|&#x9;', '---')"/>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="br">
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
The output I'm getting contains both results, one with the substitutions and one with the line breaks inserted:
<?xml version="1.0" encoding="UTF-8"?>This contains HTML Code.Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. ------Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.--- Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
This contains HTML Code.
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
///
///Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.///
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
Please, how can I insert the line breaks and replace the unwanted characters?
Upvotes: 2
Views: 72
Reputation: 57149
You can fix this by matching the text nodes explicitly and doing the translation there. Each text node is then translated exactly once, which meets your requirement.
In the code below I also simplified your template matching b: the root template already makes the selection, so there is no other place where unwanted b nodes could be applied.
I also added a match for p, which needs to process its children (see code). You may want to split that one up, as you may need two newlines for a p instead.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/">
<xsl:variable name="foo" select="/a[1]/b[@key='z']" />
<xsl:apply-templates select="$foo" />
</xsl:template>
<!-- this currently does nothing and can be removed,
unless in your code something more is done here -->
<xsl:template match="b">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="br | p">
<xsl:text>
</xsl:text>
<xsl:apply-templates />
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="translate(.,'/\|&#9;', '---')"/>
</xsl:template>
</xsl:stylesheet>
This gives:
This contains HTML Code.
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
---
---Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.---
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
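If you do split the br | p template, it could look roughly like the sketch below. This is only an illustration under assumptions of mine: that a paragraph should end with a blank line (two line breaks after its content), and that you want plain text output, hence the xsl:output declaration, which is not part of the stylesheet above. The b template is left to the built-in rules.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- assumption: plain text output, so no XML declaration is written -->
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <xsl:apply-templates select="a/b[@key='z']"/>
  </xsl:template>

  <!-- one line break per br -->
  <xsl:template match="br">
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- assumption: a paragraph is followed by two line breaks -->
  <xsl:template match="p">
    <xsl:apply-templates/>
    <xsl:text>&#10;&#10;</xsl:text>
  </xsl:template>

  <!-- every text node is translated exactly once -->
  <xsl:template match="text()">
    <xsl:value-of select="translate(., '/\|&#9;', '---')"/>
  </xsl:template>
</xsl:stylesheet>
```

Adjust the number of &#10; references in the p template to whatever paragraph spacing you actually need.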
Upvotes: 1
Reputation: 116959
Try it this way?
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="/a">
<xsl:apply-templates select="b[@key='z']"/>
</xsl:template>
<xsl:template match="br">
<xsl:text>&#10;</xsl:text>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="translate(., '/|\&#9;', '---')"/>
</xsl:template>
</xsl:stylesheet>
The result, when applied to your example, will be:
This contains HTML Code.
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
---
---Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.---
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
At vero eos et accusam et justo duo dolores et ea rebum.
Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
Note: an XML document can have only one root element, so the predicate in /a[1] is redundant. So is the $foo variable, which is used only once.
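In case the translate() call looks cryptic, here is a minimal sketch of how it behaves, using a made-up input string rather than your data. Each character of the second argument is replaced by the character at the same position in the third argument, and a character with no counterpart there is removed. The tab is written as the character reference &#9; because a literal tab inside an attribute value is normalized to a space by the XML parser.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- hypothetical input string: x/y|z, a tab, then w -->
    <!-- '/', '\' and '|' map to '-'; the tab is the 4th character of the
         second argument and has no 4th counterpart, so it is dropped -->
    <xsl:value-of select="translate('x/y|z&#9;w', '/\|&#9;', '---')"/>
    <!-- writes: x-y-zw -->
  </xsl:template>
</xsl:stylesheet>
```

This sketch runs against any input document, since the template ignores the source tree.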
Upvotes: 1