Reputation: 1749
I need to scrape this HTML page:
http://www1.usl3.toscana.it/default.asp?page=ps&ospedale=3
using PHP and XPath, to get the value 7 shown next to the string "CODICE GIALLO".
(Note: you may see a different value if you browse the page, because it changes dynamically; that doesn't matter.)
I'm using this PHP code sample to print the value:
<?php
ini_set('display_errors', 'On');
error_reporting(E_ALL);
$url = 'http://www1.usl3.toscana.it/default.asp?page=ps&ospedale=3';
$xpath_for_parsing = '/html/body/div/div[2]/table[2]/tbody/tr[1]/td/table/tbody/tr[3]/td[2]/table/tbody/tr[4]/td[2]/table/tbody/tr[2]/td[2]/b';
// Set cURL parameters; pay attention to the proxy configuration.
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_PROXY, '');
$data = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
$colorWaitingNumber = $xpath->query($xpath_for_parsing);
$theValue = 'N.D.';
foreach ($colorWaitingNumber as $node) {
    $theValue = $node->nodeValue;
}
print $theValue;
?>
This way I get "N.D." as output, not "7" as I expected.
After reading "Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?", I saw that the problem could be the <tbody> tag, so I removed it from my original XPath and tried my code with:
$xpath_for_parsing = '/html/body/div/div[2]/table[2]/tr[1]/td/table/tr[3]/td[2]/table/tr[4]/td[2]/table/tr[2]/td[2]/b';
but the result is still "N.D." instead of "7".
Using
$xpath_for_parsing = '/html/body/div/div[2]/table[2]/tr[1]/td/table/tr[3]/td[2]/table/tr[4]/td[2]/table';
the result is "Codice GIALLO 7".
How can I get only the value "7"?
Any suggestions or examples?
Upvotes: 0
Views: 159
Reputation: 52665
This one should do the trick:
//td[.="Codice GIALLO"]/following-sibling::td/b
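As a rough sketch of how that expression could be plugged into the question's code: the snippet below runs it against a small inline table rather than the live page. The markup in $html is a hypothetical simplification (the real page nests several tables and may pad the cell text with whitespace), so treat it as an illustration of the sibling-axis idea, not a verified match for the actual site.

```php
<?php
// Hypothetical, simplified markup standing in for the real page (assumption).
$html = <<<'MARKUP'
<html><body>
<table>
  <tr><td>Codice ROSSO</td><td><b>1</b></td></tr>
  <tr><td>Codice GIALLO</td><td><b>7</b></td></tr>
</table>
</body></html>
MARKUP;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Find the label cell by its text, then step to the sibling cell's <b> element.
$nodes = $xpath->query('//td[.="Codice GIALLO"]/following-sibling::td/b');

$theValue = 'N.D.';
if ($nodes !== false && $nodes->length > 0) {
    $theValue = trim($nodes->item(0)->nodeValue);
}

print $theValue;
```

Because the query anchors on the label text instead of an absolute path, it does not depend on whether <tbody> elements are present. If the real cell contains extra whitespace around the label, replacing `.` with `normalize-space(.)` in the predicate may be needed.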
Upvotes: 1