embden

Reputation: 161

XSLT: strip tags, preserve whitespace

Using XSLT, I'm trying to strip out all tags within a particular node, while preserving the whitespace between those tags.

I'm given XML like this:

<text>
<s id="s2"> The patient is a <p id="p22">56-year-old</p> <p id="p28">Caucasian</p> <p id="p30">male</p></s></text>

I'd like to strip out all of the <s> and <p> tags so that I just have the English sentence within the <text> node.

I've tried the following template. It does remove all of the tags, but it also removes the spaces between the <p> tags when nothing but whitespace separates them. For example, I end up with: " The patient is a 56-year-oldCaucasianmale"

<xsl:template name="strip-tags">
    <xsl:param name="text"/>
    <xsl:choose>
        <xsl:when test="contains($text, '&lt;')">
            <xsl:value-of select="substring-before($text, '&lt;')"/>
            <xsl:call-template name="strip-tags">
                <xsl:with-param name="text" select="substring-after($text, '&gt;')"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$text"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

Any thoughts? Thanks!

Upvotes: 2

Views: 2246

Answers (1)

Ian Roberts

Reputation: 122364

What you describe, the text content with tags removed and whitespace preserved, is precisely the definition of the "string value" of an element node: the concatenation of all of its descendant text nodes in document order. So you could simply use

<xsl:value-of select="$text" />

(assuming $text contains the <text> element node). This also assumes that you don't have

<xsl:strip-space elements="*"/>

in your stylesheet, as that would strip the whitespace-only text nodes between each closing </p> tag and the following opening <p> tag.
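
As a minimal sketch of the above, here is a complete XSLT 1.0 stylesheet that outputs the string value of the <text> element. It assumes <text> is the document element of the input (hence the match pattern "/text", which is my assumption and not something stated in the question) and that the XML parser hands whitespace-only text nodes to the XSLT processor unchanged.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text rather than XML, so no markup is written. -->
  <xsl:output method="text"/>

  <!-- Deliberately no xsl:strip-space declaration: the whitespace-only
       text nodes between the p elements must survive. -->

  <!-- Assumption: text is the document element of the input. -->
  <xsl:template match="/text">
    <!-- The string value of an element is the concatenation of all
         its descendant text nodes, in document order. -->
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample input, this should give " The patient is a 56-year-old Caucasian male", preceded by the line break that follows the opening <text> tag. If the stylesheet does need <xsl:strip-space elements="*"/> for other reasons, adding <xsl:preserve-space elements="s"/> alongside it should keep the whitespace-only text nodes whose parent is an <s> element, since the more specific name takes precedence over the wildcard.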

Upvotes: 1
