Reputation: 433
I am trying to extract the total yearly solar irradiation, and some other values, from a table that I fetch with curl from the European PVGIS service.
The table I get is:
<table class=data_table border="1" width="300" >
<tr> <td> Jan </td><td align="right">2.27</td><td align="right">70.3</td><td align="right">2.86</td><td align="right">88.5</td></tr>
<tr> <td> Feb </td><td align="right">2.79</td><td align="right">78.0</td><td align="right">3.56</td><td align="right">99.7</td></tr>
<tr> <td> Mar </td><td align="right">3.59</td><td align="right">111</td><td align="right">4.74</td><td align="right">147</td></tr>
<tr> <td> Apr </td><td align="right">4.23</td><td align="right">127</td><td align="right">5.68</td><td align="right">171</td></tr>
<tr> <td> May </td><td align="right">4.46</td><td align="right">138</td><td align="right">6.13</td><td align="right">190</td></tr>
<tr> <td> Jun </td><td align="right">4.53</td><td align="right">136</td><td align="right">6.38</td><td align="right">191</td></tr>
<tr> <td> Jul </td><td align="right">4.74</td><td align="right">147</td><td align="right">6.70</td><td align="right">208</td></tr>
<tr> <td> Aug </td><td align="right">4.59</td><td align="right">142</td><td align="right">6.53</td><td align="right">202</td></tr>
<tr> <td> Sep </td><td align="right">4.32</td><td align="right">130</td><td align="right">5.96</td><td align="right">179</td></tr>
<tr> <td> Oct </td><td align="right">3.63</td><td align="right">113</td><td align="right">4.87</td><td align="right">151</td></tr>
<tr> <td> Nov </td><td align="right">2.64</td><td align="right">79.1</td><td align="right">3.41</td><td align="right">102</td></tr>
<tr> <td> Dec </td><td align="right">2.15</td><td align="right">66.5</td><td align="right">2.72</td><td align="right">84.3</td></tr>
<tr><td colspan=5> </td></tr>
<tr><td><b> Yearly average </b></td><td align="right"><b>3.67 </b></td><td align="right"><b>111 </b></td></td><td align="right"><b>4.97 </b></td><td align="right"><b>151 </b></td></tr>
<tr><td><b>Total for year</b></td><td align="right" colspan=2 ><b> 1340 </b> </td> <td align="right" colspan=2 ><b> 1810 </b> </td> </tr>
</table>
As you can see, the totals are in the last row (<tr>) of that table. Specifically, the yearly total I need is in the second cell (<td>) of that row.
I have tried to build a regular expression with txt2reg tools, but without success, because I don't know how to target the last row of the table.
If I strip all the TR and TD tags, I just get one long string of numbers, and at that point I can no longer tell which number is which.
Do you guys have some suggestions?
Thank you very much.
EDIT
I did the following, but I get an error. The error is:
Catchable fatal error: Argument 1 passed to DOMXPath::__construct() must be an instance of DOMDocument, instance of DOMElement given in C:\Users\test\www2\test_pvgis.php on line 49
And the code is:
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($varResponse);
$table = $doc->getElementsByTagName('table')->item(1);
print_r($table);
$xpath = new DOMXpath($table);
$lastRow = $xpath->query("(//tr)[last()]");
// look for td elements inside the last row we isolated above
// path for td elements is relative
$cells = $xpath->query('./td',$lastRow[0]);
// you can also store the values for later use
foreach($cells as $key=>$cell){
//we are ignoring the first key, since it holds the "Total for year" bit
if ($key != 0){
$store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
}
}
print_r($store);
The error is raised by this line: $xpath = new DOMXpath($table); but I have no idea why. Any clue?
Upvotes: 1
Views: 115
Reputation: 7283
Edit
Assuming you have more tables and the first one is the relevant one.
You need to pass a DOMDocument instance to the DOMXpath constructor, so use $doc there: $xpath = new DOMXpath($doc);
Then, when you query for the last row, pass the $table element as the second parameter so the path is evaluated relative to that table.
Here's an example using DOMDocument and DOMXpath:
// start edit
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($varResponse);
$table = $doc->getElementsByTagName('table')->item(1);
print_r($table);
$xpath = new DOMXpath($doc);
$lastRow = $xpath->query("(./tr)[last()]",$table);
// end edit
// look for td elements inside the last row we isolated above
// path for td elements is relative
$cells = $xpath->query('./td',$lastRow->item(0)); // fixed 'Cannot use object of type DOMNodeList as array i'
// you can also store the values for later use
$store = []; // initialise so print_r() works even if no cells are found
foreach ($cells as $key => $cell) {
    // we are ignoring the first key, since it holds the "Total for year" bit
    if ($key != 0) {
        $store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
    }
}
print_r($store);
/*
outputs
Array
(
[0] => 1340
[1] => 1810
)
*/
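For anyone who wants to try this without the curl request, here is a minimal self-contained sketch of the same approach. The $html string is a trimmed, hand-copied stand-in for $varResponse, and the function name extractYearlyTotals and its $tableIndex parameter are made up for this illustration. It also uses .//tr instead of ./tr, which is a small change from the code above so that the query still matches if the rows happen to sit inside a <tbody>. Note that item() is zero-based, so item(1) in the question selects the second table on the page; the sketch defaults to the first table because the sample contains only one.

```php
<?php
// Hypothetical stand-in for $varResponse: a shortened copy of the table from the question.
$html = <<<HTML
<html><body>
<table class="data_table" border="1" width="300">
<tr><td> Jan </td><td align="right">2.27</td><td align="right">70.3</td><td align="right">2.86</td><td align="right">88.5</td></tr>
<tr><td> Dec </td><td align="right">2.15</td><td align="right">66.5</td><td align="right">2.72</td><td align="right">84.3</td></tr>
<tr><td><b>Total for year</b></td><td align="right" colspan="2"><b> 1340 </b></td><td align="right" colspan="2"><b> 1810 </b></td></tr>
</table>
</body></html>
HTML;

/**
 * Returns the trimmed cell values of the last row of one table,
 * skipping the first cell (the "Total for year" label).
 * $tableIndex is zero-based; the name and default are assumptions of this sketch.
 */
function extractYearlyTotals(string $html, int $tableIndex = 0): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep warnings about sloppy HTML quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $table = $doc->getElementsByTagName('table')->item($tableIndex);
    if ($table === null) {
        return []; // no table at that index
    }

    // DOMXPath must be built from the DOMDocument, not from an element
    $xpath = new DOMXPath($doc);

    // The second argument makes the query relative to $table
    $rows = $xpath->query('(.//tr)[last()]', $table);
    if ($rows === false || $rows->length === 0) {
        return [];
    }

    $totals = [];
    foreach ($xpath->query('./td', $rows->item(0)) as $index => $cell) {
        if ($index === 0) {
            continue; // first cell holds the row label
        }
        $totals[] = trim($cell->nodeValue);
    }

    return $totals;
}

print_r(extractYearlyTotals($html));
```

With the real response you would call extractYearlyTotals($varResponse, 1) to match the item(1) used in the question. The values come back as strings, so cast them with (float) if you need to do arithmetic on them.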
Upvotes: 2