user1300244

Reputation: 343

XPath Selecting Nodes Until Condition

I have an HTML/XML document similar to the following. There can be one or more 'tr' elements of the same colour before the pattern switches to the other colour, and this repeats arbitrarily. This is an example:

<tr class='red'></tr>
<tr class='blue'></tr>
<tr class='red'></tr>
<tr class='red'></tr>
<tr class='red'></tr>
<tr class='blue'></tr>
<tr class='blue'></tr>
<tr class='red'></tr>
<tr class='red'></tr>
<tr class='blue'></tr>

What I am looking for is an XPath (1.0) expression which, starting from the first 'tr' in any colour 'block' (note that there is no markup indicating these blocks, only changes in colour), selects the subsequent 'tr's within that block only.

I have tried the following expression

./following-sibling::tr[@class=preceding-sibling::tr[1]/@class]

but this also selects the second+ 'tr's of subsequent blocks. I feel like I'm close to what I need, but can't quite manage it.

Thanks in advance.

Edit: The desired output is a nodeset containing the subsequent 'tr's within the block (and only that block).

Upvotes: 3

Views: 1339

Answers (2)

Dimitre Novatchev

Reputation: 243459

This XPath 1.0 expression selects the first "block" of blue tr elements:

      (/*/tr[@class='blue'][1] | /*/tr[@class='blue'][1]/following-sibling::tr)
        [count(. | /*/tr[@class='blue'][1]
                          /following-sibling::tr
                                    [not(@class='blue')][1]
                                       /preceding-sibling::*
               )
        =
         count(/*/tr[@class='blue'][1]
                          /following-sibling::tr
                                    [not(@class='blue')][1]
                                       /preceding-sibling::*
         )
         ]

Explanation:

This uses the well-known Kayessian formula for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

This XPath expression selects exactly the nodes that belong to both the node-set $ns1 and the node-set $ns2.
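To see why the count comparison amounts to an intersection, here is a minimal sketch in Python rather than XPath. The function name `intersect` is made up for illustration, and plain integers stand in for the nine tr elements of the sample document used in the verification further down (rows 1-4 red, 5-6 blue, 7-8 red, 9 blue).

```python
def intersect(ns1, ns2):
    """Emulate $ns1[count(.|$ns2) = count($ns2)] on lists of hashable 'nodes'."""
    ns2_set = set(ns2)
    # Adding x to ns2 leaves the size unchanged only when x is already in ns2,
    # so the size test keeps exactly the members of ns1 that are also in ns2.
    return [x for x in ns1 if len({x} | ns2_set) == len(ns2_set)]


# ns1: the first blue row (5) and all of its following siblings.
ns1 = [5, 6, 7, 8, 9]
# ns2: all preceding siblings of the first non-blue row after row 5 (row 7).
ns2 = [1, 2, 3, 4, 5, 6]

print(intersect(ns1, ns2))
```

With these inputs the result is the two rows of the first blue block. If `ns2` is empty, nothing passes the size test, which matches how the XPath behaves on an empty node-set.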

In this particular case we substitute $ns1 and $ns2 with specific XPath expressions: $ns1 is the first blue tr together with all of its following siblings, and $ns2 is the set of all preceding siblings of the first non-blue tr that follows the first blue tr. The intersection of these two node-sets is exactly the wanted first block of blue trs. Note that if no non-blue tr follows the first blue block, $ns2 is empty and the expression selects nothing.

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy-of select=
  "(/*/tr[@class='blue'][1] | /*/tr[@class='blue'][1]/following-sibling::tr)
            [count(. | /*/tr[@class='blue'][1]
                              /following-sibling::tr
                                        [not(@class='blue')][1]
                                           /preceding-sibling::*
                   )
            =
             count(/*/tr[@class='blue'][1]
                              /following-sibling::tr
                                        [not(@class='blue')][1]
                                           /preceding-sibling::*
                 )
             ]
  "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<t>
    <tr class='red'></tr>
    <tr class='red'></tr>
    <tr class='red'></tr>
    <tr class='red'></tr>
    <tr class='blue'></tr>
    <tr class='blue'></tr>
    <tr class='red'></tr>
    <tr class='red'></tr>
    <tr class='blue'></tr>
</t>

the XPath expression is evaluated and the selected nodes are copied to the output:

<tr class="blue"/>
<tr class="blue"/>

Upvotes: 3

Michael Kay

Reputation: 163262

If you have a variable $v bound to the starting node then I think it can be done (with horrendous inefficiency) like this:

$v/following-sibling::tr[@class = $v/@class and count(preceding-sibling::tr[not(@class=$v/@class)]) = count($v/preceding-sibling::tr[not(@class=$v/@class)])]

If your API doesn't give you the opportunity to bind a variable, then I don't think it can be done, though I'm willing to be proved wrong.
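To make the logic of that predicate concrete, here is a rough Python emulation using only the standard library. It is a sketch, not an XPath evaluator: the `<table>` wrapper element and the function name `rest_of_block` are assumptions added for illustration, and `start` plays the role of $v. The rows are the ones from the question.

```python
import xml.etree.ElementTree as ET

# The question's ten rows, wrapped in a hypothetical root element.
DOC = """<table>
<tr class='red'/><tr class='blue'/>
<tr class='red'/><tr class='red'/><tr class='red'/>
<tr class='blue'/><tr class='blue'/>
<tr class='red'/><tr class='red'/><tr class='blue'/>
</table>"""


def rest_of_block(rows, start):
    """Return 0-based indices of the rows after `start` in the same colour block."""
    colour = rows[start].get("class")

    def others_before(i):
        # count(preceding-sibling::tr[not(@class = $v/@class)])
        return sum(1 for r in rows[:i] if r.get("class") != colour)

    target = others_before(start)
    # Keep a following row when it has $v's class and the same number of
    # differently-coloured rows before it, i.e. no colour change in between.
    return [i for i in range(start + 1, len(rows))
            if rows[i].get("class") == colour and others_before(i) == target]


rows = ET.fromstring(DOC).findall("tr")
print(rest_of_block(rows, 2))
```

Recounting the preceding rows for every candidate is what makes this approach quadratic in the number of siblings.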

You haven't said what your constraints are, but XPath 1.0 doesn't seem a good choice of technology for this particular problem.

Even in XPath 2.0 it's not particularly nice. You really need recursion, and that implies using XQuery or XSLT rather than pure XPath.
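As an illustration of the recursive shape, here is a minimal sketch in Python standing in for XQuery or XSLT. The names `block_end` and `following_in_block` are hypothetical, and the rows are reduced to a list of class strings taken from the question's example.

```python
def block_end(classes, i, colour):
    """Return the index just past the run of `colour` that begins at index i."""
    if i >= len(classes) or classes[i] != colour:
        return i  # colour changed or rows ran out: the block ends here
    return block_end(classes, i + 1, colour)


def following_in_block(classes, start):
    """0-based indices of the rows after `start` that share its colour block."""
    end = block_end(classes, start + 1, classes[start])
    return list(range(start + 1, end))


rows = ["red", "blue", "red", "red", "red",
        "blue", "blue", "red", "red", "blue"]
print(following_in_block(rows, 2))
```

The recursion stops at the first colour change, so each row is visited at most once. In XSLT 1.0 the equivalent would typically be a named template that calls itself on the next sibling.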

Upvotes: 0
