Reputation: 183
I'm trying to parse an HTML table with XPath. The URL is: click here.
I used Firebug to inspect the page's DOM and found the container I need.
<tbody>
<tr class="r1">
<td class="l rbrd">
<img class="spr2 sport sp1" align="absmiddle" src="/s.gif">
</td>
<td class="l rbrd">19/4 18:30</td>
<td class="l rbrd">
<a title="CHELSEA FC - SUNDERLAND" href="/chelsea-fc-vs-sunderland/e/4509648/" target="_blank">CHELSEA FC - SUNDERLAND</a>
</td>
<td class="c w40">
<span class="o">1,21</span>
<span class="p">92,8%</span>
</td>
<td class="c w10 rbrd">
<span class="o">
<span class="p">
</td>
<td class="c w40">
<span class="o">8,00</span>
<span class="p">4,7%</span>
</td>
<td class="c w10 rbrd">
<span class="o">
<span class="p">
</td>
<td class="c w40">
<span class="o">18,00</span>
<span class="p">2,5%</span>
</td>
<td class="c w10 rbrd">
<span class="o">
<span class="p">
</td>
<td class="c emph">
<span class="o">353.660 €</span>
</td>
<td class="c w10 emph rbrd">
<img class="imgdiff" width="10" height="10" src="http://img.oxytropis.com/s.gif">
</td>
<td class="c rbrd">
<span class="o">1,56</span>
<span class="p">67,5%</span>
</td>
<td class="c rbrd">
<span class="o">2,74</span>
<span class="p">32,5%</span>
</td>
<td class="c emph rbrd">
<span class="o">6.243 €</span>
</td>
<td class="c rbrd">
<a onclick="_gaq.push(['_trackEvent','betfair','click','tziroi-out']);" href="http://sports.betfair.com/Index.do?mi=&ex=1&origin=MRL&rfr=655" rel="nofollow" target="_blank">
</td>
</tr>
This is only one row; there are hundreds more. Every row holds the same kind of information, so I can walk through the rows and pick out the date, match, money, etc. I need a condition that identifies each of these cells so that I can store all of them in an array.
I am following this tutorial: click here
Which condition can I use to differentiate each cell from the others?
I want to have something like this for each row in the table:
[0] => Array
(
[date] => 18:30 19/4
[teams] => CHELSEA FC - SUNDERLAND
[1] => 1,21
[1 volumes] => 92,8%
[X] => 8,00
[X volumes] => 4,7%
[2] => 18,00
[2 volumes] => 2,5%
[matched] => 353.660 €
...
)
This is my PHP; I'm stuck at this point:
<?php
$curl = curl_init('http://www.oxybet.ro/pariu/external/betfair-volumes.htm');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10');
$html = curl_exec($curl);
curl_close($curl);
if (!$html) {
    die("something's wrong!");
}
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$scores = array();
$tableRows = $xpath->query('//div//div//div[2]//div/div//table//tr');
foreach ($tableRows as $row) {
    // fetch all 'tds' inside this 'tr'
    $td = $xpath->query('td', $row);
    $match = array();
Upvotes: 0
Views: 1036
Reputation: 38732
Your query is fetching all table rows so far. In the next step, loop over these results (in PHP) and access the cells as needed. You can use either direct DOM access or XPath, whichever you prefer.
To use XPath, write an expression that is evaluated relative to the current context node, and pass the current row as that context. Use numerical predicates to select the cell you're looking for. For example, to query the team name (in the third table cell; XPath positions are 1-indexed), use something like
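$tableRows = $xpath->query('//div//div//div[2]//div/div//table//tr');
foreach ($tableRows as $row) {
    // Third cell of the current row; skip rows without a match link (e.g. header rows).
    $link = $xpath->query('./td[3]/a', $row)->item(0);
    if ($link === null) continue;
    $team = trim($link->textContent);
}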
Querying the class attributes might also be possible, but they seem to be used rather arbitrarily.
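As a rough illustration of the class-based alternative: in the sample row from the question, the three 1/X/2 odds cells all carry the `w40` class, so they could be selected together. This is only a sketch; it assumes the site uses `w40` for exactly those cells in every row, which cannot be confirmed from the single row shown. The inline `$html` is a cut-down stand-in for the real page.

```php
<?php
// Cut-down stand-in for one row of the real table (assumption: same classes as the sample).
$html = '<table><tr class="r1">'
      . '<td class="l rbrd">19/4 18:30</td>'
      . '<td class="c w40"><span class="o">1,21</span><span class="p">92,8%</span></td>'
      . '<td class="c w10 rbrd"></td>'
      . '<td class="c w40"><span class="o">8,00</span><span class="p">4,7%</span></td>'
      . '<td class="c w10 rbrd"></td>'
      . '<td class="c w40"><span class="o">18,00</span><span class="p">2,5%</span></td>'
      . '</tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html); // suppress warnings about incomplete markup
$xpath = new DOMXPath($dom);

$row = $xpath->query('//tr')->item(0);

// Match "w40" as a whole class token, not as a substring of some other class name.
$cells = $xpath->query(
    './td[contains(concat(" ", normalize-space(@class), " "), " w40 ")]',
    $row
);

$odds = array();
foreach ($cells as $cell) {
    $span = $xpath->query('./span[@class="o"]', $cell)->item(0);
    if ($span !== null) {
        $odds[] = trim($span->textContent);
    }
}
// $odds holds the prices of the w40 cells in document order (1, X, 2 in the sample).
```

Positional predicates such as `td[4]` break when a column is added; class-based predicates break when the styling changes. Which is more stable depends on the site.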
Now, read the other table cells with similar queries, construct the resulting map and append it to the $scores array.
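Put together, the steps above might look like the sketch below. It assumes the column order of the question's sample row (date in cell 2, teams in cell 3, the 1/X/2 odds in cells 4, 6 and 8). The helpers `cellText` and `parseRows`, the simplified `//table//tr` query and the inline sample HTML are made up for illustration; the result keys mirror the desired output from the question. The remaining columns (matched amount and so on) would follow the same pattern.

```php
<?php
// Hypothetical helper: trimmed text of the first node matching $query
// relative to $context, or null if nothing matches.
function cellText(DOMXPath $xpath, string $query, DOMNode $context): ?string
{
    $node = $xpath->query($query, $context)->item(0);
    return $node === null ? null : trim($node->textContent);
}

// Hypothetical helper: turn every data row of the HTML into an associative array.
function parseRows(string $html): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // suppress warnings about sloppy real-world markup
    $xpath = new DOMXPath($dom);

    $scores = array();
    foreach ($xpath->query('//table//tr') as $row) {
        $teams = cellText($xpath, './td[3]/a', $row);
        if ($teams === null) {
            continue; // header or spacer row without a match link
        }
        $scores[] = array(
            'date'      => cellText($xpath, './td[2]', $row),
            'teams'     => $teams,
            '1'         => cellText($xpath, './td[4]/span[@class="o"]', $row),
            '1 volumes' => cellText($xpath, './td[4]/span[@class="p"]', $row),
            'X'         => cellText($xpath, './td[6]/span[@class="o"]', $row),
            'X volumes' => cellText($xpath, './td[6]/span[@class="p"]', $row),
            '2'         => cellText($xpath, './td[8]/span[@class="o"]', $row),
            '2 volumes' => cellText($xpath, './td[8]/span[@class="p"]', $row),
        );
    }
    return $scores;
}

// Cut-down sample row standing in for the downloaded page.
$sample = '<table><tbody><tr class="r1">'
        . '<td class="l rbrd"><img src="/s.gif"></td>'
        . '<td class="l rbrd">19/4 18:30</td>'
        . '<td class="l rbrd"><a href="#">CHELSEA FC - SUNDERLAND</a></td>'
        . '<td class="c w40"><span class="o">1,21</span><span class="p">92,8%</span></td>'
        . '<td class="c w10 rbrd"></td>'
        . '<td class="c w40"><span class="o">8,00</span><span class="p">4,7%</span></td>'
        . '<td class="c w10 rbrd"></td>'
        . '<td class="c w40"><span class="o">18,00</span><span class="p">2,5%</span></td>'
        . '</tr></tbody></table>';

$scores = parseRows($sample);
print_r($scores);
```

With the real page, `$sample` would be replaced by the `$html` string returned from cURL, and the row query by whichever expression selects the table you need. Note that PHP stores the numeric string keys `'1'` and `'2'` as integers, so `$scores[0][1]` and `$scores[0]['1']` refer to the same element.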
Upvotes: 1