Reputation: 11
I have the following content:
<div class="item">
<a href="ONE">
<img src="TWO">
</a>
</div>
I want to use XPath to pull out "ONE" and "TWO" from there.
The code I have right now is:
$html = file_get_contents($_POST['url']);
$document = new DOMDocument();
$document->loadHTML($html);
$selector = new DOMXPath($document);
$query = '//div[@class="item"]';
$anchors = $selector->query($query);
foreach ($anchors as $node) {
    // print ONE;
    // print TWO;
}
Upvotes: 0
Views: 26
Reputation: 158060
Here is an example that selects both attributes with a single XPath union (|) query:
$html = <<<EOF
<div class="item">
<a href="ONE">
<img src="TWO">
</a>
</div>
EOF;
$doc = new DOMDocument();
$doc->loadHTML($html);
$selector = new DOMXPath($doc);
$links = $selector->query(
    '//div[@class="item"]//@href | //div[@class="item"]//@src'
);
foreach ($links as $link) {
    echo $link->nodeValue . PHP_EOL;
}
If you want to group the results by <div class="item">, select the divs first and then run a relative query (note the leading dot) with each div passed as the context node:
foreach ($selector->query('//div[@class="item"]') as $div) {
    foreach ($selector->query('.//@href | .//@src', $div) as $link) {
        echo $link->nodeValue . PHP_EOL;
    }
}
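The loops above print every matching attribute value without saying which attribute it came from. If you need "ONE" and "TWO" as separate, named values per item, one possible variation is sketched below. It assumes each item contains at most one <a> and one <img>; the $items array and its 'href'/'src' keys are names made up for this illustration, and the libxml_use_internal_errors() call is an optional addition for pages with invalid markup.

```php
<?php
// Sample markup from the question; in practice this would be the fetched page.
$html = <<<EOF
<div class="item">
<a href="ONE">
<img src="TWO">
</a>
</div>
EOF;

// Optional (assumption): collect libxml parse warnings instead of printing them,
// since real-world HTML is frequently not well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hypothetical result structure: one ['href' => ..., 'src' => ...] entry per item div.
$items = [];
foreach ($xpath->query('//div[@class="item"]') as $div) {
    // evaluate() with string() returns a plain PHP string: the value of the
    // first matching attribute, or an empty string if nothing matches.
    $items[] = [
        'href' => $xpath->evaluate('string(.//a/@href)', $div),
        'src'  => $xpath->evaluate('string(.//img/@src)', $div),
    ];
}

foreach ($items as $item) {
    echo $item['href'] . ' ' . $item['src'] . PHP_EOL;
}
```

For the sample markup this prints "ONE TWO". Using evaluate() rather than query() here avoids a second inner loop, at the cost of only seeing the first link and the first image in each item.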
Upvotes: 1