Björn

Reputation: 1

XSLT problem: regular expressions for attribute values

Consider the following XML:

<?xml-stylesheet type="text/xsl" href="eclas.xsl"?>

  <collection>
    <record>
      <datafield tag="150">
        <subfield code="a">Abandon des études</subfield><!--accepted FR-->
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="150">
        <subfield code="a">Student drop-out</subfield><!--accepted EN-->
        <subfield code="9">eng</subfield>
      </datafield>
      <datafield tag="450">
        <subfield code="a">Décrochage scolaire</subfield><!-- NOT accepted term FR-->
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="450">
        <subfield code="a">Abandon scolaire</subfield><!-- NOT accepted term FR-->
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="450">
        <subfield code="a">Abandon de la scolarité</subfield><!-- NOT preferred term FR-->
        <subfield code="9">fre</subfield>
      </datafield>
    </record>
    <record>
      <datafield tag="151">
        <subfield code="a">Egypte</subfield>
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="151">
        <subfield code="a">Egypt</subfield>
        <subfield code="9">eng</subfield>
      </datafield>
      <datafield tag="451">
        <subfield code="a">République arabe d&apos;Egypte</subfield>
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="451">
        <subfield code="a">République arabe unie</subfield>
        <subfield code="9">fre</subfield>
      </datafield>
      <datafield tag="451">
        <subfield code="a">United Arab Republic</subfield>
        <subfield code="9">eng</subfield>
      </datafield>
    </record>
</collection>

It's a sample from a large thesaurus. I need help with a regular expression that selects datafield elements whose tag is either 150 or 151 (and, in a second case, either 450 or 451).

Here's the XSLT code I have trouble with:

<xsl:for-each select="datafield[contains(@tag, '150|151' )]">
...
</xsl:for-each>

I'm trying to loop over the datafield elements whose tag attribute is either 150 or 151. My regular expression does not seem to work. I have tried several things, to no avail.

Upvotes: 0

Views: 256

Answers (3)

Valdi_Bo

Reputation: 31011

You want to match any of the following four strings: 150, 151, 450 and 451. Note that:

  • the first char is either 1 or 4,
  • the second char is always 5,
  • and the last char is either 0 or 1.

So the regex matching all of them is ^[14]5[01]$.

I added the ^ and $ anchors so that the pattern does not match one of these strings inside a longer text (e.g. 31508).

So in XSLT 2.0 you can write:

<xsl:for-each select="datafield[matches(@tag, '^[14]5[01]$')]">
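The pattern above matches all four tags at once. If the 150/151 fields and the 450/451 fields need separate loops, as the question seems to ask, the character class can be split in two. This is only a minimal sketch: it assumes an XSLT 2.0 processor, a record element as the context node (which the relative datafield path in the question implies), and that printing subfield a is what each loop should do.

```xml
<!-- Loop 1: tags 150 or 151 (marked "accepted" in the sample comments) -->
<xsl:for-each select="datafield[matches(@tag, '^15[01]$')]">
  <!-- illustrative body: output the term text held in subfield a -->
  <xsl:value-of select="subfield[@code = 'a']"/>
</xsl:for-each>

<!-- Loop 2: tags 450 or 451 (marked "NOT accepted" in the sample comments) -->
<xsl:for-each select="datafield[matches(@tag, '^45[01]$')]">
  <xsl:value-of select="subfield[@code = 'a']"/>
</xsl:for-each>
```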

Upvotes: 0

JGNI

Reputation: 4013

contains() takes a plain string, not a regex, as its second parameter, so your code is looking for the literal string 150|151. You can't use regular expressions in XSLT 1.0. However, you can combine several contains() tests with or in one predicate, or branch with xsl:choose (which is an instruction, not a function). See this question for more info.
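Since the tag values here are fixed three-character codes, one way to stay within XSLT 1.0 is to compare them exactly instead of using contains(). A minimal sketch, assuming the context node is a record element and that subfield a is the value of interest:

```xml
<!-- XSLT 1.0: exact equality tests joined with "or"; no regex required -->
<xsl:for-each select="datafield[@tag = '150' or @tag = '151']">
  <!-- illustrative body: output the term text held in subfield a -->
  <xsl:value-of select="subfield[@code = 'a']"/>
</xsl:for-each>
```

Unlike contains(@tag, '150'), an equality test cannot accidentally pick up a longer tag such as 1500.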

Upvotes: 1

Tim C

Reputation: 70648

The contains function does not take a regular expression as the second argument, just a simple string which it checks is in the first string or not. You should be using matches...

<xsl:for-each select="datafield[matches(@tag, '^150$|^151$')]">

Or slightly better...

<xsl:for-each select="datafield[matches(@tag, '^(150|151)$')]">

Note the ^ and $ anchors, which prevent "1500" from being matched, for example.

However, matches is available in XSLT 2.0 only, and your use of <?xml-stylesheet suggests you are doing the transformation in the browser, which in practice means XSLT 1.0 only. If this is the case, you can still use contains with a little extra effort:

<xsl:for-each select="datafield[contains('|150|151|', concat('|', @tag, '|') )]">

Again, the extra use of | is to prevent 1500 being picked up, for example.
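To show the delimiter trick in context, here is a sketch of a complete XSLT 1.0 stylesheet that could serve as the eclas.xsl referenced by the sample. The HTML layout and the "Accepted" / "Not accepted" labels are assumptions made for illustration (the labels are taken from the comments in the sample XML); only the two predicates come from the approach above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Entry point: wrap all records in a minimal HTML page -->
  <xsl:template match="/collection">
    <html>
      <body>
        <xsl:apply-templates select="record"/>
      </body>
    </html>
  </xsl:template>

  <!-- One block per record, with the two groups of datafields listed separately -->
  <xsl:template match="record">
    <div>
      <p>Accepted:</p>
      <ul>
        <!-- '|' delimiters on both sides force a whole-value match: 150 or 151 -->
        <xsl:for-each select="datafield[contains('|150|151|', concat('|', @tag, '|'))]">
          <li><xsl:value-of select="subfield[@code = 'a']"/></li>
        </xsl:for-each>
      </ul>
      <p>Not accepted:</p>
      <ul>
        <!-- same technique for 450 or 451 -->
        <xsl:for-each select="datafield[contains('|450|451|', concat('|', @tag, '|'))]">
          <li><xsl:value-of select="subfield[@code = 'a']"/></li>
        </xsl:for-each>
      </ul>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The trick relies on the tag values never containing the | character themselves, which holds for the numeric tags in the sample.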

Upvotes: 0
