Reputation: 34062
I have the following HTML:
$page = '<html>
<head>
<title>Page</title>
</head>
<body>
<div>
<div>
<div>
</div>
<div class="this one">
<h2>Ignore</h2>
<p>Text</p>
<h2>Header 1</h2>
<ul><li>List Value 1</li></ul>
<h2>Header 2</h2>
<ul><li>List Value 2</li></ul>
<h2>Ignore</h2>
<ul><li>List Value 3</li></ul>
<h2>Header 3</h2>
<ul>
<li>List Value A</li>
<li>List Value B</li>
<li>List Value C</li>
</ul>
<h2>Ignore</h2>
<p>Text</p>
</div>
</div>
</div>
</body>
</html>';
I am trying to get the li list for Header 3 only, and the following code doesn't work:
$doc = new DOMDocument();
$doc->loadHTML($page);
$xpath = new DOMXPath($doc);
// This query returns nothing; it is the expression I need help with.
$nodes = $xpath->query("//div[@class='this one']/h2[.='Header 3']/ul/li");
foreach ($nodes as $node) {
    echo $node->nodeValue . "<br />";
}
I am expecting the output:
List Value A<br />
List Value B<br />
List Value C<br />
Upvotes: 1
Views: 251
Reputation: 120714
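Your query returns nothing because the <ul> is not a child of the <h2>; it is the sibling that follows it. This is the expression that you want: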
//div[@class = 'this one']/h2[text() = 'Header 3']/following-sibling::ul[1]/li
Broken down a bit:
//div[@class = 'this one']
- Match all <div>s in the document with the specified class attribute value
…/h2[text() = 'Header 3']
- Match all <h2>s that are children of those <div>s and have the specified text content
…/following-sibling::ul
- Use the following-sibling axis to match <ul>s that appear after those <h2>s under the same parent
…[1]
- Keep only the first <ul> that follows each matched <h2> (remembering that indexes are 1-based in XPath expressions)
…/li
- And match all of the <li>s which are children of that <ul>
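To see the expression in action, here is a minimal sketch that runs it against a trimmed-down copy of the markup from the question. The $query and $values variable names and the libxml_use_internal_errors() call are my own additions for illustration, not part of the original code.

```php
<?php
// Trimmed-down copy of the question's markup (only the relevant <div>).
$page = '<html><body>
<div class="this one">
<h2>Header 2</h2>
<ul><li>List Value 2</li></ul>
<h2>Ignore</h2>
<ul><li>List Value 3</li></ul>
<h2>Header 3</h2>
<ul>
<li>List Value A</li>
<li>List Value B</li>
<li>List Value C</li>
</ul>
<h2>Ignore</h2>
<p>Text</p>
</div>
</body></html>';

// Collect parser warnings internally instead of printing them (an optional addition).
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($page);
$xpath = new DOMXPath($doc);

// The <ul> is a sibling of the <h2>, so step sideways with following-sibling
// and keep only the first <ul> after the matching header.
$query = "//div[@class = 'this one']/h2[text() = 'Header 3']/following-sibling::ul[1]/li";

$values = [];
foreach ($xpath->query($query) as $node) {
    $values[] = trim($node->nodeValue);
}

// One list item per line, each followed by <br />.
echo implode("<br />\n", $values) . "<br />\n";
```

Note that following-sibling::ul[1] picks the nearest <ul> after the header even if other elements sit in between, so a header with no list of its own would pick up the next header's list.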
Upvotes: 3