Reputation: 913
I'm working with a DOM that repeats the same HTML block 100+ times. Each block looks like this:
<div class="intro">
<div class="header">
<h1 class="product-code"> <span class="code">ZY001</span> <span class="intro">ZY001 Title/Intro</span> </h1>
</div>
<div>
<table>
<tbody>
<tr>
<td>Available</td>
<td> S </td>
<td> M </td>
<td> XL </td>
</tr>
I was previously using this XPath query to get back ALL the node values, that is, the td cells following the Available cell in each of the 100+ blocks:
//div[@class='intro']/div/table/tbody/tr/td[contains(text(),'Available')]/following-sibling::td
object(DOMNodeList)[595] public 'length' => int 591
Now I need to target a specific product-code / code value and retrieve only the td values for that particular code.
Because the div that contains the unique identifier (ZY001 in the example above) is not a direct ancestor of the table, my thinking is that I have to do a reverse XPath query.
Here's one of my attempts:
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../div[@class='intro']/div/table/tbody/tr/td[contains(text(),'Available')]/following-sibling::td
Since I am matching /span[contains(@class, 'code') and text() = 'ZY001'] and then traversing the DOM backwards twice using /../../, I was hoping to get back the div[@class='intro'] with the text ZY001 immediately above it, or rather a public 'length' => int 1.
But all my attempts so far have returned 0 results. Not false, which would indicate an invalid XPath, but 0.
How can I modify my XPath query to get back the single instance, among the many <div class="intro"> blocks, that contains the <h1 class="product-code">/<span class="code"> text value ZY001?
Upvotes: 2
Views: 2329
Reputation: 5078
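You can use any of the XPaths below:
//div[@class='intro' and .//h1[@class='product-code']/span[@class='code' and text()='ZY001']]//tbody/tr[td[text()='Available']]/td[2]
//div[@class='intro' and .//span[@class='code' and text()='ZY001']]//tbody/tr[td[text()='Available']]/td[2]
//div[@class='intro' and .//span[@class='code' and text()='ZY001']]//tr[td[text()='Available']]/td[2]
The leading . inside the predicate matters: a predicate path that starts with // is evaluated from the document root, so it would match every div[@class='intro'] as soon as ZY001 appears anywhere in the document. Starting with .// restricts the test to the descendants of the current div.
Change td[2] to td[3] and td[4] to get the 3rd and 4th td respectively.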
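The "filter the container by its own code, then descend to the cells" idea can be sketched outside of XPath as well. The snippet below is only an illustration in Python's standard-library ElementTree, which supports a subset of XPath and has no nested path predicates, so the filter is written as a loop. The two-product sample, `make_doc`, and `sizes_for` are hypothetical names made up for this sketch, not part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample block modelled on the markup in the question.
BLOCK = """
<div class="intro">
  <div class="header">
    <h1 class="product-code"><span class="code">{code}</span> <span class="intro">{code} Title/Intro</span></h1>
  </div>
  <div>
    <table><tbody><tr><td>Available</td>{cells}</tr></tbody></table>
  </div>
</div>
"""


def make_doc():
    """Build a small document with two product blocks (assumed data)."""
    a = BLOCK.format(code="ZY001", cells="<td> S </td><td> M </td><td> XL </td>")
    b = BLOCK.format(code="ZY002", cells="<td> L </td>")
    return ET.fromstring("<body>" + a + b + "</body>")


def sizes_for(root, code):
    """Return the trimmed cells after 'Available' for the block with this code."""
    sizes = []
    for block in root.iterfind(".//div[@class='intro']"):
        # Relative lookup: only this block's own code span is considered.
        span = block.find("./div/h1[@class='product-code']/span[@class='code']")
        if span is None or span.text != code:
            continue
        for row in block.iterfind("./div/table/tbody/tr"):
            cells = row.findall("td")
            if cells and cells[0].text.strip() == "Available":
                sizes.extend(td.text.strip() for td in cells[1:])
    return sizes


root = make_doc()
print(sizes_for(root, "ZY001"))
print(sizes_for(root, "ZY002"))
```

The lookup for the code span starts at `block`, which plays the same role as the `.//` prefix in the predicate: each container is tested against its own descendants only.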
Upvotes: 1
Reputation: 375
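Use
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../../div/table/tbody
instead of
//h1[@class='product-code']/span[contains(@class, 'code') and text() = 'ZY001']/../../div[@class='intro']/div/table/tbody
The span's parent is the h1 and its grandparent is div.header, so /../../ only reaches div.header, which has no div[@class='intro'] child. A third /.. reaches div[@class='intro'] itself, and /div/table/tbody then resolves from there.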
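To see the difference between two and three parent steps, here is a rough check using Python's standard-library ElementTree as a stand-in for the XPath engine in the question. ElementTree only implements a subset of XPath, so `contains(...)` and `and` are replaced by the chained predicates `[@class='code'][.='ZY001']`. The sample document and the names `DOC`, `SPAN`, and `count` are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two blocks shaped like the one in the question.
DOC = ET.fromstring("""
<body>
  <div class="intro">
    <div class="header"><h1 class="product-code"><span class="code">ZY001</span></h1></div>
    <div><table><tbody><tr><td>Available</td><td>S</td><td>M</td></tr></tbody></table></div>
  </div>
  <div class="intro">
    <div class="header"><h1 class="product-code"><span class="code">ZY002</span></h1></div>
    <div><table><tbody><tr><td>Available</td><td>L</td></tr></tbody></table></div>
  </div>
</body>
""")

# Locate the span holding the product code (ElementTree-compatible form).
SPAN = ".//h1[@class='product-code']/span[@class='code'][.='ZY001']"


def count(path):
    """Number of elements matched by an ElementTree path on the sample."""
    return len(DOC.findall(path))


# Two steps up lands on div.header, which has no div[@class='intro'] child.
two_up = count(SPAN + "/../../div[@class='intro']/div/table/tbody")
# Three steps up lands on div.intro, so the table path resolves.
three_up = count(SPAN + "/../../../div/table/tbody")
cells = [td.text for td in DOC.findall(SPAN + "/../../../div/table/tbody/tr/td")]

print(two_up, three_up)
print(cells)
```

Only the ZY001 block is matched by the three-step version, because the path starts at that block's span and never leaves its container.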
Upvotes: 1