Yalmar

Reputation: 435

XSLT Concatenate text

I have a large number of HTML files like the following:

<html>
  <head>
    <title>t</title>
  </head>
  <body>
    <div class="a">
      <div class="b" type="t1">
        b11<div class="x">x</div>
        b12<div class="y">y</div>b13
      </div>
      <div class="c">c</div>
    </div>
    <div class="b" type="t2" region="r">b21
      <div class="x">x</div>b22
      <div class="y">y</div>
      b23
    </div>
  </body>
</html>

At present the text of each div class="b" is split across the beginning, middle and end of the element. I want to consolidate that text so that it appears once at the beginning, before the child elements. The file I want to obtain looks like this:

<html>
  <head>
    <title>t</title>
  </head>
  <body>
    <div class="a">
      <div class="b" type="t1">b11 b12 b13
        <div class="x">x</div>
        <div class="y">y</div>
      </div>
      <div class="c">c</div>
    </div>
    <div class="b" type="t2" region="r">b21 b22 b23
      <div class="x">x</div>
      <div class="y">y</div>
    </div>
  </body>
</html>

I run the following bash script a.sh:

xsltproc a.xslt a.html > b.html

where a.xslt is the following:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="//div[@class='b']">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
   <xsl:for-each select="text()">
    <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
    <xsl:value-of select="normalize-space(.)"/>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>

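Unfortunately the output is not what I want. The text fragments stay where they were, and the concatenated text is appended after each div instead of inside it: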

<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
    <title>t</title>
  </head>
  <body>
    <div class="a">
      <div class="b" type="t1">b11
        <div class="x">x</div>
        b12
        <div class="y">y</div>
        b13</div>
      b11 b12 b13
      <div class="c">c</div>
    </div>
    <div class="b" region="r" type="t2">b21
      <div class="x">x</div>
      b22
      <div class="y">y</div>
      b23</div>
    <p>b21 b22 b23</p>
  </body>
</html>

Do you have any advice on how to proceed, please?

Upvotes: 0

Views: 100

Answers (1)

michael.hor257k

Reputation: 116959

Would this work for you?

<xsl:template match="div[@class='b']">
    <xsl:copy>
        <xsl:apply-templates select="@*"/>
        <xsl:for-each select="text()">
            <xsl:if test="position() &gt; 1">
                <xsl:text> </xsl:text>
            </xsl:if>
            <xsl:value-of select="normalize-space(.)"/>
        </xsl:for-each>
        <xsl:apply-templates select="*"/>
    </xsl:copy>
</xsl:template>
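
For context, the template in the question fails for two reasons visible in its output: `select="node()|@*"` inside `xsl:copy` re-emits the text nodes at their original positions, and the `xsl:for-each` sits after `</xsl:copy>`, so the joined text becomes a sibling of the div rather than its first child. Below is a minimal sketch of the complete stylesheet with the template above dropped in, assuming the rest of a.xslt stays as in the question (the comments are added for illustration):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <!-- Drop whitespace-only text nodes so they do not count in position() -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for div class="b": attributes, then joined text, then children -->
  <xsl:template match="div[@class='b']">
    <xsl:copy>
      <!-- 1. Attributes must be copied before any child content is written -->
      <xsl:apply-templates select="@*"/>
      <!-- 2. Direct text children, trimmed and joined with single spaces -->
      <xsl:for-each select="text()">
        <xsl:if test="position() &gt; 1">
          <xsl:text> </xsl:text>
        </xsl:if>
        <xsl:value-of select="normalize-space(.)"/>
      </xsl:for-each>
      <!-- 3. Child elements only, so the text nodes are not emitted twice -->
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

It can be run the same way as in the question (`xsltproc a.xslt a.html > b.html`). Note that `select="*"` skips any comments or processing instructions inside the matched div; if those matter, `select="node()[not(self::text())]"` should keep them. Exact indentation of the result depends on the serializer.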

Upvotes: 1
