Grego

Reputation: 2250

XPath to query multiple selectors

I want to get the values and attributes of an element matched by a selector, and then the attributes and values of its children, based on a query.

Allow me to give an example.

This is the structure:

<div class='message'>
   <div>
   <a href='http://www.whatever.com'>Text</a>
   </div>

   <div>
    <img src='image_link.jpg' />
   </div>

</div>

<div class='message'>
   <div>
   <a href='http://www.whatever2.com'>Text2</a>
   </div>

   <div>
    <img src='image_link2.jpg' />
   </div>

</div>

So I would like to write one query that matches all of these at once.

Something like this:

 //$dom is the DomDocument() set up after loaded HTML with $dom->loadHTML($html);
$dom_xpath = new DOMXpath($dom);
$elements = $dom_xpath->query('//div[@class="message"], //div[@class="message"] //a, //div[@class="message"] //img');

foreach($elements as $ele){
   echo $ele[0]->getAttribute('class'); //it should return 'message'
   echo $ele[1]->getAttribute('href'); //it should return 'http://www.whatever.com' in the 1st loop, and 'http://www.whatever2.com' in the second loop
   echo $ele[2]->getAttribute('src'); //it should return image_link.jpg in the 1st loop and 'image_link2.jpg' in the second loop
}

Is there a way to do this with multiple XPath selectors in one query, as in the example, so that I avoid running separate queries each time and save some CPU?

Upvotes: 4

Views: 9938

Answers (2)

Dimitre Novatchev

Reputation: 243479

Use:

(//div[@class='message'])[$k]//@*

This selects all attributes (three, in this example) that belong to the $k-th div in the document whose class attribute has the string value "message", or to any of that div's descendants.

You can evaluate N such XPath expressions -- for $k from 1 to N, where N is the total count of //div[@class='message'].

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:for-each select="//div[@class='message']">
    <xsl:variable name="vPos" select="position()"/>

    <xsl:apply-templates select=
    "(//div[@class='message'])[0+$vPos]//@*"/>
 ================
  </xsl:for-each>
 </xsl:template>

 <xsl:template match="@*">
  <xsl:value-of select=
  "concat('name = ', name(), ' value = ', ., '&#xA;')"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (wrapped in a single top element to make it well-formed):

<html>
    <div class='message'>
        <div>
            <a href='http://www.whatever.com'>Text</a>
        </div>
        <div>
            <img src='image_link.jpg' />
        </div>
    </div>
    <div class='message'>
        <div>
            <a href='http://www.whatever2.com'>Text2</a>
        </div>
        <div>
            <img src='image_link2.jpg' />
        </div>
    </div>
</html>

The XPath expression is evaluated twice and the selected attributes are formatted and output:

name = class value = message
name = href value = http://www.whatever.com
name = src value = image_link.jpg

 ================
name = class value = message
name = href value = http://www.whatever2.com
name = src value = image_link2.jpg

 ================
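
For the PHP setup in the question, the per-index evaluation described above might look roughly like the following. This is a minimal sketch, not a definitive implementation: the variable names ($html, $n, $groups) are made up for illustration, the sample markup is copied from the question, and it assumes PHP's bundled DOM extension is available.

```php
<?php
// Sample markup from the question, wrapped so that it forms one document.
$html = <<<HTML
<html><body>
<div class='message'>
   <div><a href='http://www.whatever.com'>Text</a></div>
   <div><img src='image_link.jpg' /></div>
</div>
<div class='message'>
   <div><a href='http://www.whatever2.com'>Text2</a></div>
   <div><img src='image_link2.jpg' /></div>
</div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// N = total number of message divs. evaluate() returns a float for count().
$n = (int) $xpath->evaluate("count(//div[@class='message'])");

// One XPath evaluation per message div, for $k from 1 to N.
$groups = [];
for ($k = 1; $k <= $n; $k++) {
    $expr = sprintf("(//div[@class='message'])[%d]//@*", $k);
    $attrs = [];
    foreach ($xpath->query($expr) as $attr) {
        // Each selected node is an attribute: keep its name and value.
        $attrs[$attr->nodeName] = $attr->nodeValue;
    }
    $groups[] = $attrs;
}

print_r($groups);
```

Each entry of $groups then holds the class, href and src values for one message div. Note that this runs N + 1 queries in total, so it groups the results but does not reduce the number of queries.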

Upvotes: 3

Wayne

Reputation: 60414

Use the union operator (|) in a single expression like this:

//div[@class="message"]|//div[@class="message"]//a|//div[@class="message"]//img

Note that this returns a flattened result set (so to speak). In other words, you can't access the elements in groups of three as your example does. Instead, you iterate over everything the expression matched, in document order. For this reason, it may be simpler to iterate over the nodes returned by //div[@class="message"] and use DOM methods to access their descendants (the a and img elements).
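
To make the difference concrete, here is a rough PHP sketch of both options: the single union query, and the loop over the message divs. The variable names ($flat, $rows) and the choice of getElementsByTagName() as the DOM method are assumptions for illustration only; the markup is the sample from the question.

```php
<?php
// Sample markup from the question, wrapped so that it forms one document.
$html = <<<HTML
<html><body>
<div class='message'>
   <div><a href='http://www.whatever.com'>Text</a></div>
   <div><img src='image_link.jpg' /></div>
</div>
<div class='message'>
   <div><a href='http://www.whatever2.com'>Text2</a></div>
   <div><img src='image_link2.jpg' /></div>
</div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Option 1: one union query. The nodes come back as one flat list,
// not grouped per message, so the caller has to tell them apart.
$union = '//div[@class="message"]'
       . '|//div[@class="message"]//a'
       . '|//div[@class="message"]//img';
$flat = [];
foreach ($xpath->query($union) as $node) {
    $flat[] = $node->nodeName;
}

// Option 2: one query for the containers, then plain DOM methods
// on each container to reach its a and img descendants.
$rows = [];
foreach ($xpath->query('//div[@class="message"]') as $message) {
    $a   = $message->getElementsByTagName('a')->item(0);
    $img = $message->getElementsByTagName('img')->item(0);
    $rows[] = [
        'class' => $message->getAttribute('class'),
        'href'  => $a ? $a->getAttribute('href') : null,
        'src'   => $img ? $img->getAttribute('src') : null,
    ];
}

print_r($flat);
print_r($rows);
```

The second option keeps the grouping the question asks for while still running only one XPath query over the whole document.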

Upvotes: 8
