Reputation: 1749
I need to scrape this HTML page:
http://www.asl1.liguria.it/templateProntoSoccorso.asp
using PHP and XPath, to get values such as the 2 in
Codice bianco: 2
(NOTE: you may see different values if you browse the page yourself. That doesn't matter; they change dynamically.)
I can't get the XPath for those values with Firebug in Mozilla Firefox, as I usually do. Any suggestions?
Thank you in advance!
UPDATE
<?php
ini_set('display_errors', 1);
$url = 'http://www.asl1.liguria.it/templateProntoSoccorso.asp';
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_PROXY, '');
$data = curl_exec($ch);
curl_close($ch);
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);
$Number = $xpath->query('/html/body/table/tbody/tr/td[2]/table[2]/tbody/tr/td[3]/table/tbody/tr[2]/td[1]/table/tbody/tr/td/div[1]/div[3]/div[2]');
foreach( $Number as $node )
{
echo "Number: " .$node->nodeValue;
echo '<br>';
echo '<br>';
}
?>
Upvotes: 0
Views: 345
Reputation: 1749
I've solved it. Here is the working code:
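<?php
ini_set('display_errors', 1);
$url = 'http://www.asl1.liguria.it/templateProntoSoccorso.asp';

// Fetch the page with cURL and return the body as a string.
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_PROXY, '');
$data = curl_exec($ch);
curl_close($ch);

// Stop early if the download failed, so loadHTML() is not given false.
if ($data === false) {
    die('Could not download the page');
}

// Parse the HTML; @ silences warnings about malformed markup.
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);

// Match the container by class instead of by an absolute path,
// then take the first text node of its first child div.
$Number = $xpath->query('(//div[@class="datiOspedaleCodici"]/div[1]/text())[1]');
foreach ($Number as $node) {
    echo "Number: " . $node->nodeValue;
    echo '<br>';
    echo '<br>';
}
?>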
which prints the matched text, for example:
Codice bianco: 2
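A possible reason the absolute path copied from Firebug matched nothing: browsers add a tbody element to their DOM for tables whose source omits it, while DOMDocument::loadHTML() (which uses libxml2) does not. I have not verified that this is what happens on the page above, so treat it as an assumption. The sketch below uses made-up sample markup (the $html string and the variable names are hypothetical) to compare a browser-style path with the class-based one:

```php
<?php
// Hypothetical markup: a table written without <tbody> in the source.
$html = '<html><body><table><tr><td>'
      . '<div class="datiOspedaleCodici"><div>Codice bianco: 2</div></div>'
      . '</td></tr></table></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Browser-style path: contains the tbody step a browser would show.
$browserPath = $xpath->query('/html/body/table/tbody/tr/td/div/div');

// Class-based path: does not depend on the surrounding table layout.
$classPath = $xpath->query('(//div[@class="datiOspedaleCodici"]/div[1]/text())[1]');

echo 'Browser-style path matches: ' . $browserPath->length . "\n";
echo 'Class-based path matches: ' . $classPath->length . "\n";
echo 'Value: ' . $classPath->item(0)->nodeValue . "\n";
```

If the page's real source does contain tbody tags, the absolute path may be failing for a different reason; the class-based query avoids the question either way.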
Upvotes: 0
Reputation: 52665
This should work:
Value from the first element:
substring-after(//div[@class="datiOspedaleCodici"]/div[1]/text(), ":")
From the second:
substring-after(//div[@class="datiOspedaleCodici"]/div[2]/text(), ":")
...and so on.
Just increase the index in /div[x] to get the next value.
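These expressions return a string rather than a node list, so in PHP they fit DOMXPath::evaluate() better than DOMXPath::query(). Here is a minimal sketch of that, run against made-up sample markup: the $html string and the "Codice verde: 5" entry are hypothetical, and the normalize-space() wrapper is my addition to trim the space after the colon.

```php
<?php
// Hypothetical sample of the markup the expressions above assume.
$html = '<div class="datiOspedaleCodici">'
      . '<div>Codice bianco: 2</div>'
      . '<div>Codice verde: 5</div>'
      . '</div>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$values = [];
for ($i = 1; $i <= 2; $i++) {
    // Same expression as above, with the index substituted and whitespace trimmed.
    $expr = 'normalize-space(substring-after('
          . '//div[@class="datiOspedaleCodici"]/div[' . $i . ']/text(), ":"))';
    // evaluate() returns a PHP string for string-valued XPath expressions.
    $values[] = $xpath->evaluate($expr);
}

print_r($values);
```

On the real page the loop bound would be the number of div elements inside the container, which could be read with evaluate('count(//div[@class="datiOspedaleCodici"]/div)').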
Upvotes: 1