lambshaanxy

Reputation: 23072

Trim whitespace from parent element only

I'd like to trim the leading whitespace inside p tags in XML, so this:

<p>  Hey, <em>italics</em> and <em>italics</em>!</p>

Becomes this:

<p>Hey, <em>italics</em> and <em>italics</em>!</p>

(Trimming trailing whitespace won't hurt, but it's not mandatory.)

Now, I know normalize-space() is supposed to do this, but if I try to apply it to the text nodes...

<xsl:template match="text()">
  <xsl:text>[</xsl:text>
  <xsl:value-of select="normalize-space(.)"/>
  <xsl:text>]</xsl:text>
</xsl:template>

...it's applied to each text node (in brackets) individually and sucks them dry:

[Hey,]<em>[italics]</em>[and]<em>[italics]</em>[!]

My XSLT looks basically like this:

<xsl:template match="p">
    <xsl:apply-templates/>
</xsl:template>

So is there any way I can let apply-templates complete and then run normalize-space on the output, which should do the right thing?

Upvotes: 2

Views: 1419

Answers (3)

LarsH

Reputation: 28004

I would do something like this:

<xsl:template match="p">
    <xsl:apply-templates/>
</xsl:template>

<!-- strip leading whitespace -->
<xsl:template match="p/node()[1][self::text()]">
  <xsl:call-template name="left-trim">
     <xsl:with-param name="s" select="."/>
  </xsl:call-template>
</xsl:template>

This strips leading whitespace from the first child node of a <p> element, if that node is a text node. It does not strip whitespace from the first text node child when that text node is not the first child node. E.g. in

<p><em>Hey</em> there</p>

I intentionally avoid stripping the space from the front of 'there', because that would make the words run together when rendered in a browser. If you did want to strip that space, change the match pattern to

match="p/text()[1]"

If you also want to strip trailing whitespace, as your title possibly implies, add these two templates:

<!-- strip trailing whitespace -->
<xsl:template match="p/node()[last()][self::text()]">
  <xsl:call-template name="right-trim">
     <xsl:with-param name="s" select="."/>
  </xsl:call-template>
</xsl:template>

<!-- strip leading/trailing whitespace on sole text node
     (normalize-space also collapses internal runs of whitespace) -->
<xsl:template match="p/node()[position() = 1 and
                              position() = last()][self::text()]"
              priority="2">
   <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

The definitions of the left-trim and right-trim templates are at Trim Template for XSLT (untested). They might be slow for documents with lots of <p>s. If you can use XSLT 2.0, you can replace the call-templates with

  <xsl:value-of select="replace(.,'^\s+','')" />

and

  <xsl:value-of select="replace(.,'\s+$','')" />

(Thanks to Priscilla Walmsley.)
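
The linked left-trim and right-trim definitions are not reproduced here. As a rough idea of what such templates can look like in XSLT 1.0, here is a minimal sketch. It is an assumption about one possible implementation, not the code from the linked page; only the template names and the parameter name s are taken from the calls above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical left-trim: recursively drops leading whitespace -->
  <xsl:template name="left-trim">
    <xsl:param name="s"/>
    <xsl:choose>
      <!-- empty string: output nothing and stop recursing -->
      <xsl:when test="$s = ''"/>
      <!-- first character is whitespace: recurse on the remainder -->
      <xsl:when test="normalize-space(substring($s, 1, 1)) = ''">
        <xsl:call-template name="left-trim">
          <xsl:with-param name="s" select="substring($s, 2)"/>
        </xsl:call-template>
      </xsl:when>
      <!-- first character is not whitespace: emit the string as is -->
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical right-trim: recursively drops trailing whitespace -->
  <xsl:template name="right-trim">
    <xsl:param name="s"/>
    <xsl:choose>
      <xsl:when test="$s = ''"/>
      <!-- last character is whitespace: recurse on everything before it -->
      <xsl:when test="normalize-space(substring($s, string-length($s))) = ''">
        <xsl:call-template name="right-trim">
          <xsl:with-param name="s"
                          select="substring($s, 1, string-length($s) - 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The two named templates can be pasted into the main stylesheet as top-level elements or pulled in with xsl:include. Each call recurses once per whitespace character removed, so very long runs of whitespace mean deep recursion.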

Upvotes: 4

user357812


This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="p//text()[1][generate-id()=
                                      generate-id(ancestor::p[1]
                                                  /descendant::text()[1])]">
        <xsl:variable name="vFirstNotSpace"
                      select="substring(normalize-space(),1,1)"/>
        <xsl:value-of select="concat($vFirstNotSpace,
                                     substring-after(.,$vFirstNotSpace))"/>
    </xsl:template>
</xsl:stylesheet>

Output:

<p>Hey, <em>italics</em> and <em>italics</em>!</p>

Edit 2: Better expression (now only three function calls).

Edit 3: Matching the first descendant text node (not just the first node if it's a text node). Thanks to @Dimitre's comment.

Now, with this input:

<p><b>  Hey, </b><em>italics</em> and <em>italics</em>!</p>

Output:

<p><b>Hey, </b><em>italics</em> and <em>italics</em>!</p>
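
To make the template body above easier to follow, here is a small standalone sketch that evaluates the same expression on a literal string. The variable name vText and the literal value are assumptions for illustration (the value is the first text node of the question's sample); the rest mirrors the expression in the answer.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- hypothetical sample value: the first text node of the question's <p> -->
    <xsl:variable name="vText" select="'  Hey, '"/>
    <!-- normalize-space() removes the leading whitespace, so its first
         character is the first non-space character of the original: 'H' -->
    <xsl:variable name="vFirstNotSpace"
                  select="substring(normalize-space($vText),1,1)"/>
    <!-- substring-after() keeps what follows the first 'H', i.e. 'ey, ';
         putting 'H' back in front gives 'Hey, ' -->
    <xsl:value-of select="concat($vFirstNotSpace,
                                 substring-after($vText,$vFirstNotSpace))"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML input, this writes "Hey, " with the trailing space intact, which shows that only the leading whitespace is removed.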

Upvotes: 5

Dimitre Novatchev

Reputation: 243579

You want:

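 <xsl:template match="text()">
  <!-- wrap in brackets so normalize-space() cannot strip the edges completely -->
  <xsl:variable name="vWrapped"
       select="normalize-space(concat('[',.,']'))"/>
  <!-- drop the leading '[' and the trailing ']' -->
  <xsl:value-of select=
   "substring($vWrapped, 2, string-length($vWrapped) - 2)"/>
 </xsl:template>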

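This wraps the string in "[]", then performs normalize-space(), then finally removes the wrapping characters. Note that whitespace at either end of a text node is collapsed to a single space rather than removed entirely, and internal runs of whitespace are collapsed too.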

Upvotes: 2
