Remove string after second separator in a node

I would like to remove the part of the string in a node that comes after the second separator ( | ):

<A>
<B>First | Second | Third</B>
</A>
<A>
<B>Apple | Orange | Bananas | Kiwi</B>
</A>
<A>
<B>Example</B>
</A>

Output:

<A>
<B>First | Second</B>
</A>
<A>
<B>Apple | Orange</B>
</A>
<A>
<B>Example</B>
</A>

My first idea was to use regex:

<xsl:template match="B">
    <xsl:value-of select="replace(., '\|([^|]*)$', '')" />
</xsl:template>

...but it's not really working. Maybe there is a better way to do this?

Upvotes: 1

Views: 232

Answers (2)

zx485

Reputation: 29042

To remove the rest of the string after the first separator |, you can of course use a RegEx:

<xsl:template match="B">
  <xsl:copy>
    <xsl:value-of select="replace(., '(.*?)\s?\|.*?$', '$1')" />
  </xsl:copy>
</xsl:template>

Its output is

<root>
    <A>
        <B>First</B>
    </A>
    <A>
        <B>Apple</B>
    </A>
    <A>
        <B>Example</B>
    </A>
</root>

If, on the other hand, you want to get the output you gave in your question, you can use a variant of the above RegEx:

replace(., '(.*?)\|(.*?)\s?\|.*?$', '$1|$2')

Its output is

<root>
    <A>
        <B>First | Second</B>
    </A>
    <A>
        <B>Apple | Orange</B>
    </A>
    <A>
        <B>Example</B>
    </A>
</root>
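
For completeness, the second expression could be dropped into a full stylesheet roughly as follows. This is only a minimal sketch: it assumes an XSLT 2.0 processor (needed for replace()) and adds an identity template to copy everything else, neither of which is spelled out in the answer above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Identity template (an assumption): copies all other nodes and attributes unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep the text up to the second '|'.
       Values with fewer than two separators do not match the pattern
       and are therefore output unchanged. -->
  <xsl:template match="B">
    <xsl:copy>
      <xsl:value-of select="replace(., '(.*?)\|(.*?)\s?\|.*?$', '$1|$2')"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Note that the B template copies only the element and its string value, so any attributes or child elements of B would be lost. That is fine for the sample input, but it may need adjusting for richer documents.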

Upvotes: 1

Dimitre Novatchev

Reputation: 243549

This transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="B/text()">
    <xsl:value-of select="string-join(tokenize(., ' \| ')[position() lt 3], ' | ')"/>
  </xsl:template>
</xsl:stylesheet>

When applied to the provided XML (a severely malformed fragment, now fixed):

<t>
    <A>
        <B>First | Second | Third</B>
    </A>
    <A>
        <B>Apple | Orange | Bananas | Kiwi</B>
    </A>
    <A>
        <B>Example</B>
    </A>
</t>

produces the wanted, correct result:

<t>
   <A>
      <B>First | Second</B>
   </A>
   <A>
      <B>Apple | Orange</B>
   </A>
   <A>
      <B>Example</B>
   </A>
</t>
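
If the number of tokens to keep may vary, or the spacing around the separator is not always exactly one space, the same tokenize() idea can be generalized. The following is a sketch under those assumptions; the parameter name keep is hypothetical and not part of the original answer.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical parameter: how many leading tokens to keep (defaults to 2) -->
  <xsl:param name="keep" select="2"/>

  <!-- Identity template: copies all nodes and attributes unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Split on '|' with any amount of surrounding whitespace,
       keep the first $keep tokens, and rejoin them with ' | ' -->
  <xsl:template match="B/text()">
    <xsl:value-of select="string-join(subsequence(tokenize(., '\s*\|\s*'), 1, $keep), ' | ')"/>
  </xsl:template>
</xsl:stylesheet>
```

With the default value of 2, this should give the same result as the transformation above for the sample document. Whitespace at the very start or end of the text is not trimmed, so wrapping the context item in normalize-space() may be worthwhile if that matters.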

Update:

In the question there is a conflict between the description and the provided wanted result.

The solution above produces the provided wanted result.

If really only the first token is needed, then the solution is even simpler:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="B/text()">
    <xsl:value-of select="tokenize(., ' \| ')[1]"/>
  </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the same XML document (above), the correct (for this interpretation of the question) result is produced:

<t>
   <A>
      <B>First</B>
   </A>
   <A>
      <B>Apple</B>
   </A>
   <A>
      <B>Example</B>
   </A>
</t>

Upvotes: 1
